First of all, I have gone through the suggested materials (both the visualization and troubleshooting pages on Deeplearning4j's site).
To start off, I will describe my model. I have built an RNN with an LSTM layer, which I am using on a time series dataset. The config for my model is:
```java
MultiLayerConfiguration config = new NeuralNetConfiguration.Builder()
        .miniBatch(true)
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .updater(new Adam(.0001))
        .weightInit(WeightInit.XAVIER)
        .list()
        .layer(0, new LSTM.Builder().activation(Activation.TANH).nIn(1).nOut(10).build())
        .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                .activation(Activation.TANH)
                .nIn(10).nOut(1).build())
        .backpropType(BackpropType.TruncatedBPTT)
        .tBPTTLength(100)
        .build();
```
In the config, I have tried many things like changing the learning rate, changing activation functions, etc., and the same results that I show later still occur. My dataset, as I said before, is a time series dataset and is multivariate. I use the DataNormalization class to normalize the input and have tried both standardization and MinMaxScaler; both give close to the same results. Here is the analysis breakdown from AnalyzeLocal (Imgur link). %Change is my label, and sentiment is my feature.
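For reference, my normalization setup looks roughly like this (a minimal sketch; `trainIter` and the choice of `NormalizerStandardize` vs. `NormalizerMinMaxScaler` are placeholders for what I actually swap between):

```java
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize;
// import org.nd4j.linalg.dataset.api.preprocessor.NormalizerMinMaxScaler; // the alternative I also tried

// trainIter is my training DataSetIterator (placeholder name)
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(trainIter);             // collect statistics (mean/std) from the training data
trainIter.reset();
trainIter.setPreProcessor(normalizer); // apply the same normalization to every batch
```

Swapping `NormalizerStandardize` for `NormalizerMinMaxScaler` is the only change between the two runs I mentioned, and both give close to the same results.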
So the problem I'm having is that my model will not train well on my data, and it won't overfit even if I run it for an insane number of epochs; the training error evaluation results are also a bit odd. The training error results in both the Deeplearning4j visualization UI and the score listener show a pattern of sorts. Specifically, in the UI, the bottom two graphs look much rougher and quite strange compared to what was shown on the website. Album of result screenshots (Imgur link). Clearly the network does not learn well, and although the prediction fluctuates a bit, it doesn't match the real points at all. My question is: what should I change in my model to fix what is going on in the UI and listener results? I feel as though I correctly normalize the data, so why am I seeing the lack of training and these weird results? Thank you.