Differences in Results for optimized model

I am trying to optimize my model with Arbiter according to the dl4j/arbiter documentation.

Here is the code I use for the optimization:

ContinuousParameterSpace momentum = new ContinuousParameterSpace(0.0001, 0.1);
ContinuousParameterSpace learningRate = new ContinuousParameterSpace(0.0001, 0.1);
ContinuousParameterSpace dropoutSpace = new ContinuousParameterSpace(0, 0.25);
EvaluationScoreFunction scoreFunction = new EvaluationScoreFunction(Metric.ACCURACY);
MaxTimeCondition[] terminationConditions = { new MaxTimeCondition(5, TimeUnit.MINUTES) };

MultiLayerSpace hyperparameterSpace = new MultiLayerSpace.Builder()
        // These next few options: fixed values for all models
        .seed(1234567890)
        .weightInit(WeightInit.XAVIER)
        .l2(0.0001)
        .updater(new NesterovsSpace(learningRate, momentum))
        .addLayer(new DenseLayerSpace.Builder().nIn(50).activation(Activation.SOFTMAX).nOut(47).build())
        .addLayer(new OutputLayerSpace.Builder().nOut(2).lossFunction(LossFunction.KL_DIVERGENCE).build())
        .numEpochs(10)
        .backpropType(BackpropType.Standard)
        .dropOut(dropoutSpace)
        .build();

MyDataSource.setDataSets(trainData, testData);

OptimizationConfiguration config = new OptimizationConfiguration.Builder()
        .candidateGenerator(generateRandomSearchGenerator(hyperparameterSpace))
        .dataSource(MyDataSource.class, null)
        .scoreFunction(scoreFunction)
        .modelSaver(getSaver("files/ML/Optimize"))
        .terminationConditions(terminationConditions)
        .build();

LocalOptimizationRunner runner = new LocalOptimizationRunner(config, new MultiLayerNetworkTaskCreator());
runner.addListeners(new MyStatusListener());
runner.execute();
System.out.println(runner.bestScore());
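
For reference, the best model can also be pulled straight out of the runner after execute(), following the pattern from the official Arbiter example (note: getResult() can throw, so this sketch assumes the surrounding method declares the exception):

// Retrieve the best candidate directly from the runner instead of re-loading it from disk
int indexOfBestResult = runner.bestScoreCandidateIndex();
List<ResultReference> allResults = runner.getResults();
OptimizationResult bestResult = allResults.get(indexOfBestResult).getResult();
MultiLayerNetwork bestModel = (MultiLayerNetwork) bestResult.getResultReference().getResultModel();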

As a result of this particular optimization run, I am getting a score of 0.65 for the accuracy.

Nevertheless, if I afterwards load the model and evaluate it on the same test data, I get a different result:

========================Evaluation Metrics========================
 # of classes:    2
 Accuracy:        0.5000
 Precision:       0.5000
 Recall:          1.0000
 F1 Score:        0.6667
Precision, recall & F1: reported for positive class (class 1 - "1") only

Warning: 1 class was never predicted by the model and was excluded from average precision
Classes excluded from average precision: [0]

=========================Confusion Matrix=========================
  0  1
-------
  0 73 | 0 = 0
  0 73 | 1 = 1

Confusion matrix format: Actual (rowClass) predicted as (columnClass) N times
==================================================================

I am not sure what exactly causes this.

@JanHolzweber could you try saving the model in Arbiter from beta7, loading it in a newer version, and seeing if there is still a problem?

I am working with dl4j in version 1.0.0-M1.1 and Arbiter in beta7.

Should I use Arbiter in a newer version as well?

I am loading and evaluating the model like this:

MultiLayerNetwork model = ModelSerializer.restoreMultiLayerNetwork(chooseModel());
MyDataSource.setDataSets(trainData, testData);
MyDataSource src = new MyDataSource();
Evaluation eval = model.evaluate(src.testData());
System.out.println(eval.stats());
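
To compare like with like, the single number Arbiter reported can also be printed from the same Evaluation object; my assumption here is that EvaluationScoreFunction(Metric.ACCURACY) boils down to Evaluation.accuracy(), and printing the loaded configuration helps rule out having loaded a different candidate than the one that scored 0.65:

// Should match the score Arbiter reported for Metric.ACCURACY
System.out.println("Accuracy: " + eval.accuracy());
// Sanity check: confirm this is really the configuration of the best candidate
System.out.println(model.getLayerWiseConfigurations().toJson());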

It could be due to the different versions then.

Arbiter hasn’t seen updates in a while and won’t have new code written for it anytime soon, but old models should still work.

I wouldn’t think about this too much. I can’t name the particular cause without running your code.

Could you tell me a bit about how you are running it? Are you on CPU or GPU? I know that dataset loading had issues with shuffling on GPU, and that was fixed in M1.1.

If that’s the case, ensure that your datasets are pre-saved and that you aren’t doing anything fancy. Shuffling your data should only need to happen once.
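
A minimal sketch of what pre-saving could look like (saveDir and the file name pattern are placeholders): each mini-batch is serialized exactly once, and every later run reads the identical batches back via ExistingMiniBatchDataSetIterator:

// One-off: serialize each mini-batch to disk (shuffle before this step if needed, exactly once)
File saveDir = new File("presaved/train"); // placeholder location
saveDir.mkdirs();
int i = 0;
while (train.hasNext()) {
    train.next().save(new File(saveDir, "train-" + i++ + ".bin"));
}

// Every later run: read back the identical batches, no re-shuffling involved
DataSetIterator presavedTrain = new ExistingMiniBatchDataSetIterator(saveDir, "train-%d.bin");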

I am running it on the CPU. I am loading the datasets from .csv files without any shuffling (as you mentioned, there is a problem when the random setting is on).

Loading the data looks like this:

int nrOfFeatures = 50;
int batchSize = 10;

FileSplit inputSplit = new FileSplit(new File(trainFile));

// 0 header lines to skip, ';' as the delimiter
// (note: new CSVRecordReader(';') would widen the char and call the (int skipNumLines) constructor)
RecordReader rr = new CSVRecordReader(0, ';');
rr.initialize(inputSplit);
Schema schema = new Schema.Builder()
        .addColumnCategorical("Error", "0", "1")
        .addColumnsDouble(generateColumnNames(nrOfFeatures))
        .build();

// Analyze the training data once to collect the min/max statistics for normalization
DataAnalysis analysis = AnalyzeLocal.analyze(schema, rr);
Builder builder = new TransformProcess.Builder(schema);
for (int i = 0; i < nrOfFeatures; i++)
    builder = builder.normalize(String.valueOf(i), Normalize.MinMax, analysis);
TransformProcess transformProcess = builder.build();

TransformProcessRecordReader trainRR = new TransformProcessRecordReader(rr, transformProcess);
trainRR.initialize(inputSplit);
train = new RecordReaderDataSetIterator.Builder(trainRR, batchSize).classification(0, 2).build();

// The test data is normalized with the statistics gathered from the training data
TransformProcessRecordReader testRR = new TransformProcessRecordReader(new CSVRecordReader(0, ';'),
        transformProcess);
testRR.initialize(new FileSplit(new File(testFile)));
test = new RecordReaderDataSetIterator.Builder(testRR, batchSize).classification(0, 2).build();
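
One more thing worth adding here: since the min/max statistics come from AnalyzeLocal on the training file, the fitted TransformProcess can be persisted so a later evaluation run applies exactly the same normalization. A minimal sketch, with the file path as a placeholder and Apache Commons IO for the file handling:

// Persist the fitted transform (the min/max stats from the analysis are baked into it)
FileUtils.writeStringToFile(new File("transform.json"), transformProcess.toJson(), StandardCharsets.UTF_8);

// Later, at evaluation time, restore it instead of re-running AnalyzeLocal
TransformProcess restored = TransformProcess.fromJson(
        FileUtils.readFileToString(new File("transform.json"), StandardCharsets.UTF_8));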

And the data itself looks like this

So if I want to use Arbiter without any version issues, should I also downgrade dl4j to beta7?

@JanHolzweber generally, new versions mean bug fixes.
Arbiter just relies on dl4j underneath. If you encounter a bug in the training, it’ll be because of dl4j. Arbiter itself doesn’t do anything special there.

A short update:

I got it to work, but only with specific updaters.

For models where the Nesterovs updater was chosen, I was not able to load them again or reproduce the results the optimization gave me.

As of now, I have been able to load models with the Adam updater.

I have to admit, I have no idea why the one works and the other does not.
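
One untested idea (an assumption on my part, not a confirmed fix): evaluation does not need the optimizer state, so loading with loadUpdater = false might sidestep whatever goes wrong with the serialized Nesterovs state:

// loadUpdater = false skips the serialized updater (momentum) state entirely;
// it is only needed to continue training, not for inference or evaluation
MultiLayerNetwork model = ModelSerializer.restoreMultiLayerNetwork(chooseModel(), false);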