Retraining an LSTM model

Expected {0.1, 0.11, 0.12, 0.13} but got {0.087, 0.0777, 0.067, 0.0577} ???
Why?

@Test
public void test3() throws InterruptedException {
    MultiLayerNetwork net = createNeuralNet();
    double[] v = new double[]{0.01, 0.02, 0.03, 0.04, 0.05, 0.06};
    fitValues(net, v);
    extractedOut(net, v); // expect {0.07, 0.08, 0.09, 0.1, 0.11, 0.12}, get {0.069, 0.079, 0.088, 0.097, 0.10, 0.11, 0.117} - it's OK
    double[] vn = new double[]{0.05, 0.06};
    extractedOut(net, vn); // expect {0.07, 0.08, 0.09}, get {0.0299, 0.0398, 0.0496} ???
    v = new double[]{0.07, 0.08, 0.09}; // retrain: fit the new data into the existing model, not a new one
    fitValues(net, v);
    extractedOut(net, v); // expect {0.1, 0.11, 0.12, 0.13}, get {0.087, 0.0777, 0.067, 0.0577} ???
}

// Prints the prediction for the last input step, then feeds the
// predictions back in once to generate the following values.
private void extractedOut(MultiLayerNetwork net, double[] v) {
    double[] p = nextValues(net, v);
    System.out.print(p[p.length - 1]);
    Arrays.stream(nextValues(net, p)).forEach(e -> System.out.print(", " + e));
    net.rnnClearPreviousState();
    System.out.println();
}

// Trains the net to predict the next value: the input sequence is
// v[0..n-2] and the label is the same series shifted by one, v[1..n-1].
// Both are shaped [miniBatch=1, features=1, timeSteps=n-1].
public void fitValues(MultiLayerNetwork net, double... v) {
    double[] firstPeriod = Arrays.copyOf(v, v.length - 1);
    INDArray data1 = Nd4j.create(new double[][][]{{firstPeriod}});

    double[] firstShiftPeriod = Arrays.copyOfRange(v, 1, v.length);
    INDArray data1s = Nd4j.create(new double[][][]{{firstShiftPeriod}});

    long t = System.currentTimeMillis();
    for (int epoch = 0; epoch < 1024; epoch++) {
        net.fit(data1, data1s);
    }
    t = System.currentTimeMillis() - t;
    System.out.println("Time: " + (t / 1000) + " sec. and " + (t % 1000) + " msec.");
}

// Runs the series through rnnTimeStep (which keeps the LSTM's internal
// state between calls) and returns the output for every time step.
public double[] nextValues(MultiLayerNetwork net, double... v) {
    double[] firstPeriod = Arrays.copyOf(v, v.length);
    INDArray data1 = Nd4j.create(new double[][][]{{firstPeriod}});
    INDArray out = net.rnnTimeStep(data1);
    return out.get(NDArrayIndex.indexesFor(0L, 0L)).toDoubleVector();
}

// Three stacked LSTM layers (1 -> 100 -> 100 -> 100) followed by a
// single-output RNN regression layer trained with MSE.
public MultiLayerNetwork createNeuralNet() {
    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
            .seed(12345)
            .weightInit(WeightInit.XAVIER)
            .updater(new AdaGrad(0.005))
            //.updater(new Nesterovs(MathFunctionsModel.learningRate, 0.9))
            .list()
            .layer(0, new LSTM.Builder()
                    .activation(Activation.TANH)
                    .nIn(1)
                    .nOut(100)
                    .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
                    .gradientNormalizationThreshold(10)
                    .build())
            .layer(1, new LSTM.Builder()
                    .activation(Activation.TANH)
                    .nIn(100)
                    .nOut(100)
                    .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
                    .gradientNormalizationThreshold(10)
                    .build())
            .layer(2, new LSTM.Builder()
                    .activation(Activation.TANH)
                    .nIn(100)
                    .nOut(100)
                    .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
                    .gradientNormalizationThreshold(10)
                    .build())
            .layer(3, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                    .activation(Activation.TANH)
                    .nIn(100)
                    .nOut(1)
                    .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
                    .gradientNormalizationThreshold(10)
                    .build())
            .backpropType(BackpropType.TruncatedBPTT)
            .tBPTTLength(100)
            .build();
    MultiLayerNetwork net = new MultiLayerNetwork(conf);
    net.init();
    return net;
}

@Vlad-Karpov could you elaborate on what you are asking exactly? Some code and a two-word sentence doesn't really give us much to go on. I can guess that you're asking why those are the expected values?

Why do you expect those values specifically? Where is this code from?

I create an LSTM network and train it on a time series.
Then I predict the next time series values, and that works correctly. When new time series data arrives, I need to add it to the network, and I do that, but the predictions after this update are wrong. I wrote this code as my little test case.

What you are seeing is something that is known as catastrophic forgetting.

Since you fit only on the new data for another 1024 epochs, you are essentially overfitting to that data.

You will need to include the old data in your training data set to mitigate that issue.
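In terms of the test above, that could look something like the sketch below (untested, reusing the fitValues/extractedOut helpers from the original post; the values are just the ones from the test):

double[] oldData = {0.01, 0.02, 0.03, 0.04, 0.05, 0.06};
double[] newData = {0.07, 0.08, 0.09};
// Rehearsal: retrain on the concatenation of old and new data instead of
// the new points alone, so the old pattern keeps being reinforced.
double[] combined = new double[oldData.length + newData.length];
System.arraycopy(oldData, 0, combined, 0, oldData.length);
System.arraycopy(newData, 0, combined, oldData.length, newData.length);
fitValues(net, combined);
extractedOut(net, combined);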

Really?

You want to say that if my model has been holding weights in the links between nodes, and states in its LSTM recurrent nodes, for 5 years, and I get new data for yesterday, I have to teach my model from scratch?

Are you serious?

This is not magic, it is plain math.

You are changing the weights when you train the model.

When you are training it only with new data, it will change the weights to fit just the new data. It effectively treats the existing weights as just initialization.
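You can see this directly with the helpers from your test (a sketch, not run):

double[] oldSeries = {0.01, 0.02, 0.03, 0.04, 0.05, 0.06};
double[] newSeries = {0.07, 0.08, 0.09};
double[] before = nextValues(net, oldSeries); // predictions on the old series
net.rnnClearPreviousState();
fitValues(net, newSeries);                    // 1024 epochs on the new data only
double[] after = nextValues(net, oldSeries);  // same query after retraining
net.rnnClearPreviousState();
// 'after' will generally have drifted away from 'before': the weights
// were moved to fit only the three new points.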

@Vlad-Karpov please try to keep it professional. You have someone giving their time at no cost to you to answer your question. If I see condescending responses again, I’ll just ban you from the forums.

Paul was describing a real concept that’s already in the literature: [1612.00796] Overcoming catastrophic forgetting in neural networks
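For reference, the core idea of that paper (elastic weight consolidation) is to add a penalty that discourages weights that were important for the old data from moving, roughly

L(\theta) = L_{\text{new}}(\theta) + \sum_i \frac{\lambda}{2} F_i (\theta_i - \theta_i^{*})^2

where \theta_i^{*} are the weights after training on the old data, F_i is the diagonal Fisher information estimated on the old data (a measure of how important weight i was), and \lambda sets how strongly the old behaviour is protected.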

Try to give someone who’s donating their time to help you a bit of credit and reply with an open mind. It will make the community as a whole better.

Thanks for the answer.
And sorry for the "condescending" tone; my English is poor. I did not mean to offend anybody.

@Vlad-Karpov thanks for following up! I just wanted to make sure we cleared that up. Sometimes folks come in and think there’s some underlying sarcasm when we are just trying to answer their question.

@treo was explaining a mechanic to you. Usually concepts like that have specific names that come from papers like the one I linked.

Try to ask what something is if it's not clear. When we answer questions, we sometimes assume the person will either google a term or already know it, and that can prevent the intended answer from getting through.

Thanks for being understanding! I appreciate you being receptive to feedback.