Hello. I was looking for a Java NN framework, more to smooth than to predict the values of one time series based on a dozen or so other time series. I stumbled on SANNET, which looked promising, but unfortunately its GRU/LSTM layers seem badly coded: I found no way to use them with time series (the author’s example creates a layer for each date).
I decided to try DL4J, but after a week of struggling with the different versions and the documentation, it seems to me that the LSTM isn’t working either. There are too few examples, and most of them are written for older versions. Based on these I tried a few approaches and still couldn’t make it work.
A basic approach was this: I create the inputs in the form inputData = new double[5000][10][20] (5000 samples, which you call examples, 10 time series, and a window of 20 dates). The output should have the form outputData = new double[5000][1], because I only have one output. So:
double[][][] inputData = createInputData(inputs, window);
double[][] outputData = createOutputData(outputs, window);
List list = new ArrayList(inputData.length);
for (int i=0; i<inputData.length; i++) list.add(new Pair(inputData[i], outputData[i]));
INDArrayDataSetIterator trainIter = new INDArrayDataSetIterator(list, 500);
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .list()
    .layer(0, new LSTM.Builder().nIn(inputs.length).nOut(50).activation(Activation.TANH).build())
    .layer(1, new RnnOutputLayer.Builder().nIn(50).nOut(outputs.length)
        .activation(Activation.LEAKYRELU).lossFunction(LossFunctions.LossFunction.MSE).build())
    .build();
MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
int epochs = 1000;
for (int i=0; i<epochs; i++)
{
model.fit(trainIter);
if (i % 100 == 0) System.out.println(model.output(Nd4j.create(inputData)));
trainIter.reset();
}
But nothing happens. The output stays the same, and the training takes a fraction of a second. Something is wrong.
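For reference, the data layout described above (samples x time series x window dates, with one target row per sample) can be sketched in plain Java as follows. The bodies of createInputData and createOutputData are not shown in this post, so the versions below are only an assumed illustration: they take inputs as [series][date], slide the window one date at a time, and use the value at the last date of each window as the target.

```java
import java.util.Arrays;

// Hypothetical sketch of the windowing helpers; not the code used in the post.
class WindowingSketch {

    /** Builds sliding windows with layout [sample][series][timeStep]. */
    public static double[][][] createInputData(double[][] inputs, int window) {
        int numSeries = inputs.length;
        int numDates = inputs[0].length;
        int numSamples = numDates - window + 1;
        double[][][] data = new double[numSamples][numSeries][window];
        for (int i = 0; i < numSamples; i++) {
            for (int s = 0; s < numSeries; s++) {
                // copy dates i .. i+window-1 of series s into sample i
                System.arraycopy(inputs[s], i, data[i][s], 0, window);
            }
        }
        return data;
    }

    /** One row per sample: each output series' value at the last date of the window. */
    public static double[][] createOutputData(double[][] outputs, int window) {
        int numSamples = outputs[0].length - window + 1;
        double[][] labels = new double[numSamples][outputs.length];
        for (int i = 0; i < numSamples; i++) {
            for (int o = 0; o < outputs.length; o++) {
                labels[i][o] = outputs[o][i + window - 1];
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        double[][] inputs = {
            {1, 2, 3, 4, 5},
            {10, 20, 30, 40, 50}
        };
        double[][] outputs = {{0.1, 0.2, 0.3, 0.4, 0.5}};
        double[][][] x = createInputData(inputs, 3);
        double[][] y = createOutputData(outputs, 3);
        // shape is [samples][series][window]
        System.out.println(x.length + " x " + x[0].length + " x " + x[0][0].length);
        System.out.println(Arrays.toString(x[1][1])); // second sample, second series
        System.out.println(Arrays.toString(y[1]));    // target of the second sample
    }
}
```

With 5 dates and a window of 3 this yields 3 samples. Note that the labels here are two-dimensional, [sample][output], which is the many-to-one layout meant in this post.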
I tried another approach:
CollectionSequenceRecordReader inputData = createInputList(inputs, window);
CollectionSequenceRecordReader outputData = createOutputList(outputs, window);
SequenceRecordReaderDataSetIterator trainIter = new SequenceRecordReaderDataSetIterator(inputData, outputData, 500, 1, true);
Here inputData and outputData follow the same logic as before (many-to-one, as you call it), and I get an error message. It doesn’t accept the many-to-one approach: it wants the output to have a history window of 20 as well, which is not what an LSTM should do:
Exception in thread "main" java.lang.IllegalStateException: Sequence lengths do not match for RnnOutputLayer input and labels:Arrays should be rank 3 with shape [minibatch, size, sequenceLength] - mismatch on dimension 2 (sequence length) - input=[500, 50, 20] vs. label=[500, 1, 1]
at org.nd4j.common.base.Preconditions.throwStateEx(Preconditions.java:639)
at org.nd4j.common.base.Preconditions.checkState(Preconditions.java:337)
at org.deeplearning4j.nn.layers.recurrent.RnnOutputLayer.backpropGradient(RnnOutputLayer.java:59)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.calcBackpropGradients(MultiLayerNetwork.java:1984)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2799)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2742)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fitHelper(MultiLayerNetwork.java:1753)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:1674)
at tests.DL4JTests.testLSTM(DL4JTests.java:117)
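To make the mismatch in the exception concrete: the output layer receives activations of shape [500, 50, 20] and compares them step by step with labels of shape [500, 1, 1], so the sequence lengths (20 vs. 1) differ. One common way to express many-to-one in such a per-time-step layout is to pad the labels to the full sequence length and add a mask that marks only the last step as carrying a real label. The sketch below shows that idea in plain Java only. The names padLabels and lastStepMask are hypothetical, and whether this exact layout is what DL4J expects in this version is an assumption to check against its documentation.

```java
import java.util.Arrays;

// Hypothetical sketch: many-to-one labels expressed as padded labels plus a mask.
class LabelMaskSketch {

    /** Expands [sample][nOut] labels to [sample][nOut][seqLen], value at the last step. */
    public static double[][][] padLabels(double[][] labels, int seqLen) {
        double[][][] padded = new double[labels.length][][];
        for (int i = 0; i < labels.length; i++) {
            padded[i] = new double[labels[i].length][seqLen]; // zero-filled by default
            for (int o = 0; o < labels[i].length; o++) {
                padded[i][o][seqLen - 1] = labels[i][o];
            }
        }
        return padded;
    }

    /** Mask of shape [sample][seqLen]: 1 at the last step (label present), 0 elsewhere. */
    public static double[][] lastStepMask(int numSamples, int seqLen) {
        double[][] mask = new double[numSamples][seqLen];
        for (double[] row : mask) {
            row[seqLen - 1] = 1.0;
        }
        return mask;
    }

    public static void main(String[] args) {
        double[][] labels = {{0.3}, {0.4}};
        double[][][] padded = padLabels(labels, 4);
        double[][] mask = lastStepMask(labels.length, 4);
        System.out.println(Arrays.toString(padded[1][0])); // label only at the last step
        System.out.println(Arrays.toString(mask[1]));      // mask selects the last step
    }
}
```

SequenceRecordReaderDataSetIterator also has an AlignmentMode option (ALIGN_END) that appears intended for sequences of unequal length like these, but its exact behaviour with one-step labels is not verified here.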
What have I done wrong?