Reproducibility question

Hello,

I create a MultiLayerConfiguration using

new NeuralNetConfiguration.Builder()
                .seed(0)
                .activation(Activation.LEAKYRELU)
                .weightInit(WeightInit.RELU)
                .updater(new Adam(0.001))
                .l2(0.000000010000)
       .... (4 DenseLayer + 1 OutputLayer; the config is constant between every test)

and I expected that, since the seed is the same, re-running the simulation (given that the data are the same) would give consistent output, but it doesn’t.

The test and train sets are exactly the same. On the first run I got

=========================Confusion Matrix=========================
  0  1  2
----------
 92 28  5 | 0 = 0
 26 39  1 | 1 = 1
  6  0  8 | 2 = 2

but if I re-run it, the numbers are a bit different, like

=========================Confusion Matrix=========================
  0  1  2
----------
 97 24  4 | 0 = 0
 27 38  1 | 1 = 1
  4  1  9 | 2 = 2

Of course the rows sum up to the same values, but that’s expected.
Do you have any idea why the results are a bit different every time I re-run it? Shouldn’t setting the same seed make them consistent?

How exactly are you loading your train and test data? The most likely reason for this is that there is randomization in the training data loading, so you don’t get exactly the same order of data in your mini batches.

I don’t use batches or a loader; I load the data from memory.
I also double-checked that featureMatrix and labelMatrix are the same on every run:

    // One row per sample: the feature values, and a one-hot encoded label.
    // Java zero-initializes both arrays, so the label rows need no re-allocation.
    double[][] featureMatrix = new double[size][features];
    double[][] labelMatrix = new double[size][Config.classVal.size()];

    for (int q = 0; q < size; q++) {
        for (int attr = 0; attr < features; attr++) {
            featureMatrix[q][attr] = calculated.get(q).value(attr);
        }
        // The class label is stored as a string in the attribute after the last feature.
        int indexOf = Config.classVal.indexOf(calculated.get(q).stringValue(features));
        labelMatrix[q][indexOf] = 1;
    }

    INDArray trainingIn = Nd4j.create(featureMatrix);
    INDArray label = Nd4j.create(labelMatrix);

    return new DataSet(trainingIn, label);

Then the data are normalized this way:

    DataNormalization normalizer = new NormalizerStandardize();
    normalizer.fit(this.train);
    normalizer.transform(this.train);
    normalizer.transform(this.test);

But again, I also checked with a simple this.train.toString().hashCode() that the string is the same after each run, and it is.

The datasets are basically exactly the same at every run of the code.
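
A note on the check itself: comparing toString().hashCode() only proves that the printed text matches, and as far as I know the string form of a large INDArray may be abbreviated, so it might not cover every element. A stricter check can be run on the raw double[][] arrays before they are handed to Nd4j.create. The following is a minimal sketch using only the Java standard library; the class name DataFingerprint and the method name fingerprint are hypothetical, not part of DL4J.

```java
import java.util.Arrays;

final class DataFingerprint {
    // Hashes every element of the 2-D array; identical contents always
    // produce the same fingerprint (hypothetical helper, not a DL4J API).
    static int fingerprint(double[][] matrix) {
        return Arrays.deepHashCode(matrix);
    }

    public static void main(String[] args) {
        double[][] runA = {{0.1, 0.2}, {0.3, 0.4}};
        double[][] runB = {{0.1, 0.2}, {0.3, 0.4}};

        // Log this value in each run and compare the logs afterwards.
        System.out.println("fingerprint A: " + fingerprint(runA));
        System.out.println("fingerprint B: " + fingerprint(runB));

        // Within a single JVM, an exact element-wise comparison is stronger
        // than any hash, because hashes can collide.
        System.out.println("exactly equal: " + Arrays.deepEquals(runA, runB));
    }
}
```

Logging fingerprint(featureMatrix) and fingerprint(labelMatrix) at the start of each run would confirm that the inputs match element for element, which rules the data out as the source of the difference.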

I created a post on GitHub with more details and an example.

For completeness:

.seed(0)

This is the problem here. I’m not entirely sure why, but for some reason the 0 seed is being handled as if no seed has been set. If you put anything else in there, you will get a reproducible training run (I typically use something like 0xBEEF, 0xC0FFEE, 0x5EED).
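
To confirm that a particular seed value really gives a reproducible run, one option is to execute the same seeded procedure twice and compare the outputs exactly. The sketch below uses only the Java standard library: fakeTraining is a hypothetical stand-in that draws numbers from java.util.Random instead of building and fitting the DL4J network, and isReproducible is a hypothetical helper name. In a real test, the stand-in would be replaced by code that builds the configuration with the given seed, fits it, and returns the predictions as a double[].

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.LongFunction;

final class ReproducibilityCheck {
    // Runs the same seeded procedure twice and reports whether both
    // outputs are exactly equal, element by element.
    static boolean isReproducible(LongFunction<double[]> run, long seed) {
        double[] first = run.apply(seed);
        double[] second = run.apply(seed);
        return Arrays.equals(first, second);
    }

    // Hypothetical stand-in for "build the network, fit it, return the
    // predictions". This is NOT DL4J code; it only mimics a seeded process.
    static double[] fakeTraining(long seed) {
        Random rng = new Random(seed);
        double[] weights = new double[4];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = rng.nextGaussian();
        }
        return weights;
    }

    public static void main(String[] args) {
        // A non-zero seed, as suggested in the answer above.
        long seed = 0xC0FFEEL;
        System.out.println("reproducible: "
                + isReproducible(ReproducibilityCheck::fakeTraining, seed));
    }
}
```

Running such a check once with seed 0 and once with a non-zero seed against the actual training code would show directly whether the 0 value is the cause.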