Hyperparameters have no effect

Chondron · March 13, 2023, 6:21am

Greetings.

As part of my MSc research project I’m using DL4J in conjunction with wekaDeepLearning4J (version 1.7.2) to perform a land classification project. I’m using the Dl4jMlpClassifier and the Dl4jResNet50 model.

I’m trying to “tune” the various hyperparameters such as those pertaining to early stopping, dropout, optimisation algorithm, etc. The issue I have is that none of them have any effect on the results. The only one that does is the mini-batch parameter set on the ImageInstanceIterator, which has pretty dramatic effects even with small changes.

Looking at the DL4J code I came across this comment in the org.deeplearning4j.ui.module.train.TrainModule class:

//TODO: Maybe L1/L2, dropout, updater-specific values etc

Does the library support these hyperparameters, and if so why might my model seem to ignore them?

Thanks in advance!

agibsonccc · March 13, 2023, 9:07am

@Chondron yes we do have L1/L2. Do you have some code I can look at? I’m not familiar with all the aspects of the wekadl4j project but I’m happy to take a look.

Chondron · March 13, 2023, 10:59am

Hello. Thanks for replying so promptly. I’m not doing it in code - this was the next step if I couldn’t find a solution (which I’d rather avoid really). I’m setting up/calling the model via the command line/class. See below. I have around 1200 images and the L1 setting of 0.001 (in this instance) doesn’t change the output. Must admit, I’m more interested in changing the optimization algorithm from SGD to LBFGS and in selecting Gaussian dropout than in the L1/L2 regularisation factors. Are you suggesting these aren’t implemented as yet?

Thanks again.

weka.classifiers.functions.Dl4jMlpClassifier -S 1 -cache-mode MEMORY -early-stopping “weka.dl4j.earlystopping.EarlyStopping -maxEpochsNoImprovement 0 -valPercentage 0.0” -normalization “Standardize training data” -iterator “weka.dl4j.iterators.instance.ImageInstanceIterator -channelsLast false -height 224 -imagesLocation /home/adventure/MSc/_AAA_Research/data/test_dataset/all -numChannels 3 -width 224 -bs 8” -iteration-listener “weka.dl4j.listener.EpochListener -eval true -n 5” -layer “weka.dl4j.layers.OutputLayer -lossFn "weka.dl4j.lossfunctions.LossMCXENT " -nOut 2 -activation "weka.dl4j.activations.ActivationSoftmax " -name "Output layer"” -logConfig “weka.core.LogConfiguration -append true -dl4jLogLevel WARN -logFile /home/adventure/wekafiles/wekaDeeplearning4j.log -nd4jLogLevel INFO -wekaDl4jLogLevel INFO” -config “weka.dl4j.NeuralNetConfiguration -biasInit 0.0 -biasUpdater "weka.dl4j.updater.Sgd -lr 0.001 -lrSchedule \"weka.dl4j.schedules.ConstantSchedule -scheduleType EPOCH\"" -dist "weka.dl4j.distribution.Disabled " -dropout "weka.dl4j.dropout.Disabled " -gradientNormalization None -gradNormThreshold 1.0 -l1 0.001 -l2 NaN -minimize -algorithm STOCHASTIC_GRADIENT_DESCENT -updater "weka.dl4j.updater.Adam -beta1MeanDecay 0.9 -beta2VarDecay 0.999 -epsilon 1.0E-8 -lr 0.001 -lrSchedule \"weka.dl4j.schedules.ConstantSchedule -scheduleType EPOCH\"" -weightInit XAVIER -weightNoise "weka.dl4j.weightnoise.Disabled "” -numEpochs 10 -numGPUs 1 -averagingFrequency 10 -prefetchSize 24 -queueSize 0 -zooModel "weka.dl4j.zoo.Dl4jResNet50 -channelsLast false -pretrained IMAGENET"F

agibsonccc · March 13, 2023, 11:46am

@Chondron apologies I don’t know either way. Could you tell me what weka version you’re using? I can try checking the source code to see what will/won’t work. Thanks!

Chondron · March 13, 2023, 12:17pm

It’s weka 3.9.6 and wekaDeepLearning4J 1.7.2 (both the most recent versions I think). It looks like the deepLearning4J backend is version 1.0.0-beta7 (judging by the jar files).

Thanks!

agibsonccc · March 13, 2023, 12:34pm

@Chondron beta7 is 2 years old. Those are way out of date. Let me take a look at the wekadeeplearning4j project though. From the looks of it the command line parameter is already there.
You may just be having normal tuning issues.

Could you elaborate more on your dataset and the like? I’d like to see if it’s possible to upgrade the dl4j version there as well.

Chondron · March 13, 2023, 1:06pm

Not sure what you mean by elaborate on the dataset. It’s 1200 224x224 images covering (roughly) 20x20m patches. I’m pretty sure the various hyperparameter settings I’ve experimented with should do something to the results.

The deepLearning4J jars are the ones that come with wekaDeepLearning4J. I’ll see if I can get hold of newer jars and check that they don’t break the weka bits. Probably won’t be until tomorrow now, though.

Cheers!

agibsonccc · March 13, 2023, 9:33pm

@Chondron I wonder if the weka code does any scaling…if it doesn’t then it may not matter how you tune. Going off the guide here: GitHub - Waikato/wekaDeeplearning4j: Weka package for the Deeplearning4j java library

You should be able to create your images a certain way. Ensure they are created scaled (e.g. normalized to the range 0 to 1).

Chondron · March 14, 2023, 6:55am

Unfortunately, I’ve no idea what you mean by this.

agibsonccc · March 14, 2023, 7:03am

@Chondron I mean scaling your data from zero to 1. Do you know if it does that before training the model?

Chondron · March 14, 2023, 11:10am

I’ve got the jars for 1.0.0-M2.1 - which appears to be the latest release? Seems to work (so far) so I’m having a play. Seems about 4x slower though, which is a bit concerning.

Chondron · March 14, 2023, 11:13am

Sorry, I’m obviously being thick - but scaling my satellite image patches to between zero and one? I really don’t know what that refers to and I’ve not come across anything in the various hyperparameters that seems to match this. Do you have a reference to something that explains it? Unless you’re referring to an activation function or something…

Cheers.

agibsonccc · March 14, 2023, 11:23am

@Chondron pixel scaling. 0-255 → 0 to 1.
Almost every neural network has a form of image normalization built into it to ensure the input data isn’t too large for the network to learn a pattern.
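
For anyone reading along, here is a minimal sketch of the pixel scaling described above, written in plain Java with no DL4J dependency. The class and method names (`PixelScaling`, `scaleToUnitRange`) are hypothetical and chosen only for illustration; they are not part of DL4J or wekaDeeplearning4j. DL4J itself ships an `ImagePreProcessingScaler` for this job, but whether the Weka wrapper applies it (or something equivalent) is not confirmed anywhere in this thread.

```java
import java.util.Arrays;

public class PixelScaling {

    /**
     * Maps 8-bit pixel intensities (0..255) to floats in the range 0..1.
     * Hypothetical helper for illustration; not a DL4J or Weka API.
     */
    static float[] scaleToUnitRange(int[] pixels) {
        float[] scaled = new float[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            int p = pixels[i];
            if (p < 0 || p > 255) {
                throw new IllegalArgumentException("Pixel out of 8-bit range: " + p);
            }
            // Divide by the maximum 8-bit value so 0 -> 0.0 and 255 -> 1.0.
            scaled[i] = p / 255.0f;
        }
        return scaled;
    }

    public static void main(String[] args) {
        int[] rawPixels = {0, 64, 128, 255};
        System.out.println(Arrays.toString(scaleToUnitRange(rawPixels)));
    }
}
```

Unscaled inputs in the 0-255 range tend to produce very large activations and gradients, which is one plausible reason other hyperparameters could appear to have no effect.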

Chondron · March 14, 2023, 12:02pm

Cheers. I’ll expand my understanding by reading around this, and also try to see what Weka does (I’d kind of hope it handles this sort of thing OK).

Chondron · March 14, 2023, 12:09pm

Actually, thinking about it, I already knew that! Clearly having a slow neuron day.
