How can I fix my model, which seemingly isn't training well?

fletchni · January 26, 2021, 11:24pm

First of all, I have gone through the suggested materials (both the visualization and troubleshooting pages on Deeplearning4j’s site).

To start off, I will describe my model. I have built an RNN with an LSTM layer, which I am using on a time series data set. The config for my model is:
MultiLayerConfiguration config = new NeuralNetConfiguration.Builder()
        .miniBatch(true)
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .updater(new Adam(.0001))
        .weightInit(WeightInit.XAVIER)
        .list()
        .layer(0, new LSTM.Builder().activation(Activation.TANH).nIn(1).nOut(10).build())
        .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                .activation(Activation.TANH)
                .nIn(10).nOut(1).build())
        .backpropType(BackpropType.TruncatedBPTT)
        .tBPTTLength(100)
        .build();
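For readers unfamiliar with the `.tBPTTLength(100)` setting in the config above: with truncated backpropagation through time, a sequence longer than the truncation length is processed in consecutive segments, with a parameter update after each one. The following is a minimal plain-Java sketch of how such segment boundaries fall. The class name, the method, and the sequence length of 250 are hypothetical and chosen for illustration; this is not DL4J's internal code.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: shows how a sequence is cut into truncated-BPTT segments. */
final class TbpttSegmentsSketch {

    /** Returns {start, endExclusive} index pairs that cover a sequence of the given length. */
    static List<int[]> segments(int sequenceLength, int tbpttLength) {
        if (tbpttLength <= 0) {
            throw new IllegalArgumentException("tbpttLength must be positive");
        }
        List<int[]> out = new ArrayList<>();
        for (int start = 0; start < sequenceLength; start += tbpttLength) {
            // The final segment is shorter when the length is not a multiple of tbpttLength.
            out.add(new int[] {start, Math.min(start + tbpttLength, sequenceLength)});
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sequence of 250 time steps with the truncation length from the config.
        for (int[] seg : segments(250, 100)) {
            System.out.println("[" + seg[0] + ", " + seg[1] + ")");
        }
    }
}
```

If the sequences in the data set are shorter than the truncation length, each one forms a single segment, and truncation has no effect on them.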
In the config, I have tried many things, such as changing the learning rate and the activation functions, and the results I show later still occur. My dataset, as I said before, is a multivariate time series dataset. I use the DataNormalization class to normalize the input and have tried both standardization and min-max scaling; both give nearly the same results. Here is the analysis breakdown from AnalyzeLocal (screenshot hosted on Imgur). %Change is my label, and sentiment is my feature.
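To make the two normalization options mentioned above concrete, here is a minimal plain-Java sketch of what min-max scaling and standardization compute on a single column. It is an illustration of the arithmetic only, not DL4J's implementation; the class name and the sample sentiment values are made up.

```java
import java.util.Arrays;

/** Illustrative only: the arithmetic behind min-max scaling and standardization. */
final class ScalingSketch {

    /** Linearly maps values so the column minimum becomes lo and the maximum becomes hi. */
    static double[] minMax(double[] values, double lo, double hi) {
        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant column has no spread; map it to the middle of the target range.
            out[i] = range == 0.0 ? (lo + hi) / 2.0 : lo + (values[i] - min) / range * (hi - lo);
        }
        return out;
    }

    /** Subtracts the mean and divides by the population standard deviation. */
    static double[] standardize(double[] values) {
        double mean = Arrays.stream(values).average().getAsDouble();
        double variance = Arrays.stream(values).map(v -> (v - mean) * (v - mean)).average().getAsDouble();
        double std = Math.sqrt(variance);
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant column standardizes to all zeros.
            out[i] = std == 0.0 ? 0.0 : (values[i] - mean) / std;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] sentiment = {0.10, 0.20, 0.15, 0.90}; // made-up values with one outlier
        System.out.println(Arrays.toString(minMax(sentiment, -1.0, 1.0)));
        System.out.println(Arrays.toString(standardize(sentiment)));
    }
}
```

One design point worth noting: under min-max scaling a single outlier sets the range, so the remaining values can end up crowded near one end of the interval. Checking the scaled values directly is a quick way to see whether that is happening.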

So the problem I’m having is that my model will not train well on my data and won’t overfit even if I run it for an insane number of epochs, and the training error evaluation results are a bit odd. The training error results in both the Deeplearning4j visualization UI and the score listener show a pattern of sorts. Specifically, in the UI, the bottom two graphs look much rougher and quite strange compared to what was on the website. Album of result screenshots: hosted on Imgur. Clearly the network does not learn well, and although the prediction fluctuates a bit, it doesn’t match the real points at all. My question is, what should I change in my model to fix what is going on in the UI and listener results? I feel as though I normalize the data correctly, so why are the lack of training and the weird results happening? Thank you.
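One way to put a number on "does not learn" is to compare the model's mean squared error against that of a constant predictor that always outputs the mean of the labels. If the two are close, the network has probably settled on predicting roughly the average. Below is a minimal plain-Java sketch of that check; the class, the method names, and the sample arrays are hypothetical. It assumes the predictions and labels have been copied into flat arrays on the same scale, and it computes the error directly rather than relying on the listener's score, which may be averaged differently.

```java
/** Illustrative only: compares a model's MSE with a predict-the-mean baseline. */
final class MeanBaselineSketch {

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Mean squared error between predictions and labels of equal length. */
    static double mse(double[] predictions, double[] labels) {
        if (predictions.length != labels.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double diff = predictions[i] - labels[i];
            sum += diff * diff;
        }
        return sum / labels.length;
    }

    /** MSE of a predictor that always outputs the label mean (equals the label variance). */
    static double meanBaselineMse(double[] labels) {
        double m = mean(labels);
        double sum = 0.0;
        for (double y : labels) {
            sum += (y - m) * (y - m);
        }
        return sum / labels.length;
    }

    public static void main(String[] args) {
        double[] labels = {-0.4, 0.1, 0.3, -0.2, 0.5};       // made-up normalized labels
        double[] predictions = {0.0, 0.1, 0.05, 0.0, 0.1};   // made-up model outputs
        System.out.println("model MSE:    " + mse(predictions, labels));
        System.out.println("baseline MSE: " + meanBaselineMse(labels));
    }
}
```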

agibsonccc · January 27, 2021, 12:39am

@fletchni could you include your whole pipeline? It would be nice to know whether you normalized your data and whether it falls within the range expected for training data. Depending on your data, it’s normal to normalize your label output as well.
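On the point about normalizing labels: the output layer in the posted config uses a tanh activation, so its predictions are confined to the interval (-1, 1). Labels therefore need to be scaled into that range before training, and predictions mapped back to the original units afterwards. A minimal plain-Java sketch of that fit, scale, and revert cycle follows. `LabelScalerSketch` is a hypothetical class written for illustration, not a DL4J API, and the sample values are made up.

```java
/** Illustrative only: scales labels to [-1, 1] using statistics from the training set. */
final class LabelScalerSketch {

    private final double min;
    private final double max;

    /** "Fits" the scaler by recording the minimum and maximum of the training labels. */
    LabelScalerSketch(double[] trainingLabels) {
        double lo = Double.POSITIVE_INFINITY;
        double hi = Double.NEGATIVE_INFINITY;
        for (double y : trainingLabels) {
            lo = Math.min(lo, y);
            hi = Math.max(hi, y);
        }
        if (!(hi > lo)) {
            throw new IllegalArgumentException("labels need at least two distinct values");
        }
        this.min = lo;
        this.max = hi;
    }

    /** Maps a label from original units into [-1, 1]. */
    double scale(double y) {
        return -1.0 + 2.0 * (y - min) / (max - min);
    }

    /** Maps a scaled value (for example a network output) back to original units. */
    double revert(double scaled) {
        return min + (scaled + 1.0) / 2.0 * (max - min);
    }

    public static void main(String[] args) {
        double[] percentChange = {-4.0, 0.0, 6.0}; // made-up label values
        LabelScalerSketch scaler = new LabelScalerSketch(percentChange);
        double networkOutput = 0.25;               // made-up prediction in tanh range
        System.out.println("prediction in original units: " + scaler.revert(networkOutput));
    }
}
```

The same fitted statistics have to be used for both directions; fitting a second scaler on different data before reverting would map predictions to the wrong units.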

fletchni · January 27, 2021, 1:35am

By pipeline, do you mean the starting data as well? Here is my method that trains the model; the only thing before it is downloading the data and removing a column from the dataset.
public class Dl4jModel {

        private File outputPath;
        private int labelIndex = 0;
        private int miniBatchSize = 32;
        private static final Logger log = LoggerFactory.getLogger(Dl4jModel.class);


        public void train() throws Exception {


            SequenceRecordReader trainReader = new CSVSequenceRecordReader(0, ",");
            trainReader.initialize(new FileSplit(new File("Processedtrain.CSV")));

            //numPossible labels not used since regression.
            DataSetIterator trainIter = new SequenceRecordReaderDataSetIterator(trainReader, miniBatchSize, -1, 0, true);


            /*SequenceRecordReader testReader = new CSVSequenceRecordReader(0, ",");
            testReader.initialize(new FileSplit(FileSystemConfig.testFile));
            DataSetIterator testIter = new SequenceRecordReaderDataSetIterator(testReader, miniBatchSize, -1, 0, true);

             */



            DataNormalization dataNormalization = new NormalizerMinMaxScaler(-1,1);
            dataNormalization.fitLabel(true);
            dataNormalization.fit(trainIter);

            //testIter.setPreProcessor(dataNormalization);
            trainIter.setPreProcessor(dataNormalization);



            DataSet trainData = trainIter.next();
            //DataSet testData = testIter.next();

            trainIter.reset();
            //testIter.reset();

            System.out.println(trainData.sample(1));
            System.out.println(" ");
            //System.out.println(testData.sample(1));

            //Initialize the user interface backend
            UIServer uiServer = UIServer.getInstance();
            //Configure where the network information (gradients, score vs. time etc) is to be stored. Here: store in memory.
            StatsStorage statsStorage = new InMemoryStatsStorage();         //Alternative: new FileStatsStorage(File), for saving and loading later
            //Attach the StatsStorage instance to the UI: this allows the contents of the StatsStorage to be visualized
            uiServer.attach(statsStorage);
            //Then add the StatsListener to collect this information from the network, as it trains

            MultiLayerConfiguration config = new NeuralNetConfiguration.Builder()
                    .miniBatch(true)
                    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                    .updater(new Adam(.00001))
                    .weightInit(WeightInit.XAVIER)
                    .list()
                    .layer(0, new LSTM.Builder().activation(Activation.TANH).nIn(1).nOut(10).build())
                    .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                            .activation(Activation.TANH)
                            .nIn(10).nOut(1).build())
                    .backpropType(BackpropType.TruncatedBPTT)
                    .tBPTTLength(100)
                    .build();

            MultiLayerNetwork model = new MultiLayerNetwork(config);
            model.init();

            model.addListeners(new ScoreIterationListener(10));


            model.addListeners(new StatsListener(statsStorage));

            int numEpochs = 200;
            model.fit(trainIter, numEpochs);

            INDArray timeSeriesFeatures = trainData.getFeatures();
            INDArray timeSeriesOutput = model.output(timeSeriesFeatures);

            dataNormalization.revertLabels(timeSeriesOutput);
            dataNormalization.revert(trainData);

            compareResults(timeSeriesOutput, trainData);

            /*ModelSerializer.writeModel(model, new File("C:\\Users\\Nicholas\\Desktop\\STOCKPRACTICE\\model.txt"),true );
            FileOutputStream fos = new FileOutputStream(new File(FileSystemConfig.normFile));
            ObjectOutputStream oos = new ObjectOutputStream(fos);
            oos.writeObject(dataNormalization);
            oos.close();

             */
        }
}

That is the method that deals specifically with training the model. If it is fine with you, I will also link my GitHub repo: https://github.com/njfletcher/Creative-Thesis/blob/master/src/main/java/com/StockPrediction/Dl4jModel.java , just in case you think there is something else that needs to be addressed.
