LSTM -> RnnOutputLayer output shape problem - help please

Hello:

I have set up an LSTM following the ICU example, but I am getting an error:

java.lang.IllegalArgumentException: Labels and preOutput must have equal shapes: got shapes [43334, 34] vs [43334, 2]

I have set up my data in different folders:

- train/features: numbered CSV files containing the time series (each about 850 lines). Total files: 130
- train/labels: numbered CSV files, each containing a single number (0 or 1) that is the label for the corresponding time series. Each file name (a number) matches a file in features.

- test/features: numbered CSV files with the test time series (each about 850 lines). Total files: 15
- test/labels: numbered CSV files with the labels for the files above.
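To make the layout concrete, here is a small sketch that creates a miniature version of it with plain java.nio (the `demo` root folder and the tiny file contents are made up for illustration; the naming scheme `0.csv .. totTrain-1` is what NumberedFileInputSplit expects):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LayoutSketch {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get("demo"); // hypothetical root folder
        Path features = root.resolve("train/features");
        Path labels = root.resolve("train/labels");
        Files.createDirectories(features);
        Files.createDirectories(labels);
        int totTrain = 3; // 130 in the real dataset; 3 here to keep the demo small
        for (int i = 0; i < totTrain; i++) {
            // features/i.csv: one time step per line (~850 lines per file in the post)
            Files.write(features.resolve(i + ".csv"), "0.1,0.2\n0.3,0.4\n".getBytes());
            // labels/i.csv: a single 0 or 1, aligned to the end of the series with ALIGN_END
            Files.write(labels.resolve(i + ".csv"), "1\n".getBytes());
        }
        try (java.util.stream.Stream<Path> s = Files.list(features)) {
            System.out.println(s.count()); // one feature file per label file
        }
    }
}
```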

My code is:

    int miniBatchSize = 60;
    SequenceRecordReader trainFeatures = new CSVSequenceRecordReader(1);
    trainFeatures.initialize(
        new NumberedFileInputSplit(
            trainFolder.resolve("features").toString() + File.separator + "%d.csv", 0, totTrain-1));

    SequenceRecordReader trainLabels = new CSVSequenceRecordReader();

    trainLabels.initialize(
        new NumberedFileInputSplit(
            trainFolder.resolve("labels").toString() + File.separator + "%d.csv", 0, totTrain-1));

    DataSetIterator trainData =
        new SequenceRecordReaderDataSetIterator(
            trainFeatures,
            trainLabels,
            miniBatchSize,
            numLabels,
            false,
            SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);
    DataNormalization normalizer = new NormalizerMinMaxScaler();

    normalizer.fit(trainData); // Collect training data statistics
    trainData.reset();
    trainData.setPreProcessor(normalizer);    

    SequenceRecordReader testFeatures = new CSVSequenceRecordReader(1);

    testFeatures.initialize(
        new NumberedFileInputSplit(
            testFolder.resolve("features").toString() + File.separator + "%d.csv", 0, totTest-1));

    SequenceRecordReader testLabels = new CSVSequenceRecordReader();

    testLabels.initialize(
        new NumberedFileInputSplit(
            testFolder.resolve("labels").toString() + File.separator + "%d.csv", 0, totTest-1));

    DataSetIterator testData =
        new SequenceRecordReaderDataSetIterator(
            testFeatures,
            testLabels,
            miniBatchSize,
            numLabels,
            false,
            SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);

    testData.setPreProcessor(normalizer);

    // Count all records
    int trainRecords = countRecords(trainFeatures);

    log.info("The train dataset has {} records", trainRecords);

    // Count all records
    int testRecords = countRecords(testFeatures);

    log.info("The test dataset has {} records", testRecords);

    // Find the label index (called response)
    int numClasses = DayBuilder.AI_CLASSES.length; // percent

   
    int netSize = numLabels * numClasses * 4;
   
    log.info(
        "Build model: Classes: {}, Attributes: {}, NetSize: {}, batchSize: {}",
        numClasses,
        numLabels,
        netSize,
        miniBatchSize);

    MultiLayerConfiguration conf =
        new NeuralNetConfiguration.Builder()
            .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
            .seed(System.currentTimeMillis())
            .weightInit(WeightInit.XAVIER)
            .updater(new AMSGrad.Builder().learningRate(0.01).build())
            .l2(0.0005)
            .list()
            .layer(
                0,
                new LSTM.Builder()
                    .nIn(trainData.inputColumns())
                    .nOut(netSize)
                    .weightInit(WeightInit.XAVIER)
                    .gateActivationFunction(Activation.SOFTSIGN)
                    .updater(new AMSGrad.Builder().learningRate(0.01).build())
                    .dropOut(0.8)
                    .build())
    
            .layer(
                1,
                new RnnOutputLayer.Builder()
                    .nIn(netSize)
                    .nOut(numClasses)
                    .activation(Activation.SOFTMAX)
                    .lossFunction(LossFunctions.LossFunction.RECONSTRUCTION_CROSSENTROPY)
                    .build())
            .backpropType(BackpropType.Standard)
            .setInputType(InputType.recurrent(trainData.inputColumns(), numClasses))
            .build();

    MultiLayerNetwork net = new MultiLayerNetwork(conf);

    net.init();

    String str =
        "Test set evaluation at epoch %d: Accuracy = %.2f, F1 = %.2f, False Positive = %.2f, False Negative = %.2f";
    int epochs = 150;
    org.nd4j.evaluation.classification.Evaluation evaluation = null;
    double best = -1;
    int bestEpoch = -1;
    MultiLayerNetwork bestModel = null;
    INDArray bestResults = null;

    for (int e = 0; e < epochs; e++) {
      net.fit(trainData);
      evaluation = net.evaluate(testData);
      log.info(
          String.format(
              str,
              e,
              evaluation.accuracy(),
              evaluation.f1(),
              evaluation.falsePositiveRate(),
              evaluation.falseNegativeRate()));
      if ((bestEpoch == -1) || (best < evaluation.accuracy())) {
        best = evaluation.accuracy();
        bestEpoch = e;
        bestModel = net.clone();
        testData.reset();
        bestResults = net.output(testData);
      }
      testData.reset();
      trainData.reset();
    }

   ...

It breaks at the first fit() call with this exception:

java.lang.IllegalArgumentException: Labels and preOutput must have equal shapes: got shapes [43334, 34] vs [43334, 2]
	at org.nd4j.common.base.Preconditions.throwEx(Preconditions.java:633) ~[nd4j-common-1.0.0-M1.1.jar:na]
	at org.nd4j.linalg.lossfunctions.impl.LossKLD.computeGradient(LossKLD.java:80) ~[nd4j-api-1.0.0-M1.1.jar:1.0.0-M1.1]
	at org.deeplearning4j.nn.layers.BaseOutputLayer.getGradientsAndDelta(BaseOutputLayer.java:172) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.nn.layers.BaseOutputLayer.backpropGradient(BaseOutputLayer.java:144) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.nn.layers.recurrent.RnnOutputLayer.backpropGradient(RnnOutputLayer.java:72) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.calcBackpropGradients(MultiLayerNetwork.java:1983) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2798) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2741) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fitHelper(MultiLayerNetwork.java:1752) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:1673) ~[deeplearning4j-nn-1.0.0-M1.1.jar:na]

Can someone please explain what I am doing wrong?

Thanks,

Juan

Disregard this, I found the error in the trainData/testData creation. By mistake I had passed numLabels where it should have been numClasses.
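For anyone hitting the same shape mismatch: the fourth constructor argument of SequenceRecordReaderDataSetIterator is the number of possible label classes, so passing the feature count there makes the iterator one-hot encode the labels to the wrong width (34 columns instead of the 2 the RnnOutputLayer produces, hence `[43334, 34] vs [43334, 2]`). A sketch of the corrected construction (same variable names as the code above):

```java
    // numClasses (2 here: labels are 0/1) is the number of possible label values,
    // NOT numLabels (the feature column count), which was the original mistake.
    DataSetIterator trainData =
        new SequenceRecordReaderDataSetIterator(
            trainFeatures,
            trainLabels,
            miniBatchSize,
            numClasses,  // was numLabels -> labels were one-hot encoded to the wrong width
            false,       // classification, not regression
            SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);
```

The test-set iterator needs the same change.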