How to prepare time series data for LSTM?

Hello everyone.

I’m using supervised learning with an LSTM network to predict forex prices. To achieve this I’m using the deeplearning4j library, but I have doubts about several points of my implementation.

I turned off the mini-batch feature, then created a number of trading indicators from the forex data. I wrote a dataset iterator which walks over the time series from start to end; on every iteration it returns a DataSet containing a single example. This way I iterate over all the data I have.

The first problem is that at the start of each epoch, inputs left over from the previous epoch interfere with the output of the current predictions.

The second problem is that I don’t know how to normalize my data. At first, when I used the prices as they were, my model’s scores were too small, and after some epochs it started to output the same value. I then tried using only the fractional part of each price (e.g. 1.723455034 → 7234), and now the scores are too large (160542.18602891127).

Do you have any ideas on how to solve these problems?

Neural net configuration

public static MultiLayerNetwork buildNetwork(int nIn, int nOut, int windowSize) {

		MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
				.seed(System.currentTimeMillis())
				.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
				.weightInit(WeightInit.XAVIER)
				.updater(Updater.RMSPROP)
				.miniBatch(false)
				.l2(25e-4)
				.list()
				.layer(0, new LSTM.Builder()
						.nIn(nIn)
						.nOut(256)
						.activation(Activation.TANH)
						.gateActivationFunction(Activation.HARDSIGMOID)
						.dropOut(0.2)
						.build())
				.layer(1, new LSTM.Builder()
						.nIn(256)
						.nOut(256)
						.activation(Activation.TANH)
						.gateActivationFunction(Activation.HARDSIGMOID)
						.dropOut(0.2)
						.build())
				.layer(2, new DenseLayer.Builder()
						.nIn(256)
						.nOut(32)
						.activation(Activation.RELU)
						.build())
				.layer(3, new RnnOutputLayer.Builder()
						.nIn(32)
						.nOut(nOut)
						.activation(Activation.IDENTITY)
						.lossFunction(LossFunctions.LossFunction.MSE)
						.build())
				.backpropType(BackpropType.TruncatedBPTT)   // truncate BPTT to the window length
				.tBPTTForwardLength(windowSize)
				.tBPTTBackwardLength(windowSize)
				.build();

		MultiLayerNetwork network = new MultiLayerNetwork(conf);
		network.init();
		return network;
	}

Dataset Iterator

	@Override
	public DataSet next() {

		// Shape [miniBatchSize=1, featureSize, windowSize] in 'f' (column-major) order,
		// as DL4J recurrent layers expect.
		INDArray observationArray = Nd4j.create(new int[]{1, this.featureSize, this.windowSize}, 'f');
		INDArray labelArray = Nd4j.create(new int[]{1, PREDICTION_VALUES_SIZE, this.windowSize}, 'f');

		int windowStartOffset = this.seriesIndex;
		int windowEndOffset = windowStartOffset + this.windowSize;

		for (int windowOffset = windowStartOffset; windowOffset < windowEndOffset; windowOffset++) {

			int windowIndex = windowOffset - windowStartOffset;

			// Copy all features for this time step into the observation array.
			for (int featureIndex = ZERO_INDEX; featureIndex < this.featureSize; featureIndex++) {

				observationArray.putScalar(
						new int[]{ZERO_INDEX, featureIndex, windowIndex},
						this.dataProvider.data(windowOffset, featureIndex)
				);
			}
			// The label is the pip value predictionStep steps ahead of this time step.
			labelArray.putScalar(new int[]{ZERO_INDEX, ZERO_INDEX, windowIndex},
					this.dataProvider.pip(windowOffset + this.predictionStep)
			);
		}
		// Slide the window forward by one time step for the next call.
		seriesIndex++;
		return new DataSet(observationArray, labelArray);
	}
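
One thing to double-check in the iterator (the method isn’t shown above, so this is an assumption): reset() should rewind seriesIndex, since net.fit(dataIterator) resets the iterator between epochs. A minimal sketch, assuming the series starts at index zero:

	@Override
	public void reset() {
		// Rewind to the start of the series so the next epoch sees the same data.
		this.seriesIndex = 0;
	}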

Training

	public static final int EPOCHS = 500;
	public static final int WINDOW_SIZE = 20;
	public static final int PREDICTION_STEP = 1;

	public static void prepare(String network, String dataset) throws IOException {

		TradingDataProvider provider = new TradingDataProvider(CommonFileTools.loadSeries(dataset));

		TradingDataIterator dataIterator = new TradingDataIterator(provider, WINDOW_SIZE, PREDICTION_STEP);
		MultiLayerNetwork net = LSTMNetworkFactory.buildNetwork(dataIterator.inputColumns(), dataIterator.totalOutcomes(), WINDOW_SIZE);

		long start;
		for (int i = 0; i < EPOCHS; i++) {
			start = System.currentTimeMillis();
			net.fit(dataIterator);

			logger.info("Epoch: {}, Score: {}, Duration: {} ms", i + 1, net.score(),
					System.currentTimeMillis() - start
			);
		}

		File locationToSave = new File(network);
		ModelSerializer.writeModel(net, locationToSave, true);
		logger.info("Model saved");
		System.exit(0);
	}

@hayk-avetisyan sorry, missed this one.

Normalization for numerical data like this is typically done either by scaling to the range zero to one (min-max scaling) or by standardizing to zero mean and unit variance.
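
In DL4J terms, a minimal sketch of the second option, using the built-in NormalizerStandardize on your dataIterator (NormalizerMinMaxScaler works the same way for zero-to-one scaling):

import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize;

// Fit the normalizer on the training data, then attach it to the iterator
// so every DataSet is transformed to zero mean / unit variance on the fly.
NormalizerStandardize normalizer = new NormalizerStandardize();
normalizer.fitLabel(true);               // also normalize the regression labels
normalizer.fit(dataIterator);            // collect statistics over the full series
dataIterator.reset();
dataIterator.setPreProcessor(normalizer);

Since raw prices drift over time, it can also help to train on returns or differences rather than price levels, so that those statistics stay roughly stationary.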

Regarding the previous inputs, could you clarify a bit? Scanning your code, it doesn’t look like you’re doing anything out of the ordinary. Are you talking about something the model itself is doing?
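
If what you mean is the LSTM’s internal recurrent state carrying over, one thing worth knowing: state stored by rnnTimeStep(...) persists until you clear it explicitly, e.g.:

// Clear any stored recurrent state before starting a new epoch or a new series.
net.rnnClearPreviousState();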

Either way, maybe take a look at the UCI example as a starting point there: deeplearning4j-examples/UCISequenceClassification.java at bc1bac672faec222afd2b424aa00eaa008c59beb · eclipse/deeplearning4j-examples · GitHub

Commenting on your model architecture a bit: try different updaters (Adam is a good default). RELU in an LSTM stack also seems a bit out of place, so I would look at different activation functions as well.
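
For example, swapping the updater is a one-line change in your builder (1e-3 here is just Adam’s usual default learning rate, not a tuned value):

import org.nd4j.linalg.learning.config.Adam;

// In buildNetwork(), replace .updater(Updater.RMSPROP) with:
.updater(new Adam(1e-3))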

I would also look into setInputType. If you call that instead, you don’t have to manually configure nIn for every layer.
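
As a sketch, with setInputType your layer stack could drop all the .nIn(...) calls (everything else as in your config):

import org.deeplearning4j.nn.conf.inputs.InputType;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
		// ... same seed / updater / weight init / l2 as above ...
		.list()
		.layer(0, new LSTM.Builder().nOut(256).activation(Activation.TANH).build())
		.layer(1, new LSTM.Builder().nOut(256).activation(Activation.TANH).build())
		.layer(2, new DenseLayer.Builder().nOut(32).activation(Activation.RELU).build())
		.layer(3, new RnnOutputLayer.Builder().nOut(nOut)
				.activation(Activation.IDENTITY)
				.lossFunction(LossFunctions.LossFunction.MSE)
				.build())
		.setInputType(InputType.recurrent(nIn))   // nIn is inferred for every layer
		.build();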