LSTM - All inputs yield the same output

Hello,

I’m trying to create a simple LSTM with 2 input features and a time-series length of 1. However, I’m running into a strange issue: after training the network, feeding in test data yields the same arbitrary result regardless of the input values. My code is shown below.

import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.BackpropType;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.learning.config.Sgd;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class LSTMRegression {
    public static final int inputSize = 2,
                            lstmLayerSize = 4,
                            outputSize = 1;

    public static final double learningRate = 0.001;

    public static void main(String[] args) {
        int miniBatchSize = 29;

        // Three stacked LSTM layers feeding an MSE regression output layer.
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .updater(new Sgd(learningRate))
                .list()
                .layer(0, new LSTM.Builder().nIn(inputSize).nOut(lstmLayerSize)
                        .weightInit(WeightInit.XAVIER)
                        .activation(Activation.IDENTITY).build())
                .layer(1, new LSTM.Builder().nIn(lstmLayerSize).nOut(lstmLayerSize)
                        .weightInit(WeightInit.XAVIER)
                        .activation(Activation.SIGMOID).build())
                .layer(2, new LSTM.Builder().nIn(lstmLayerSize).nOut(lstmLayerSize)
                        .weightInit(WeightInit.XAVIER)
                        .activation(Activation.SIGMOID).build())
                .layer(3, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                        .weightInit(WeightInit.XAVIER)
                        .activation(Activation.IDENTITY)
                        .nIn(lstmLayerSize).nOut(outputSize).build())
                .backpropType(BackpropType.TruncatedBPTT)
                .tBPTTForwardLength(miniBatchSize)
                .tBPTTBackwardLength(miniBatchSize)
                .build();

        var network = new MultiLayerNetwork(conf);

        network.init();
        network.fit(getTrain());

        System.out.println(network.output(getTest()));
    }

    // Test features: one example per row, 2 input features, 1 time step each.
    public static INDArray getTest() {
        double[][][] test = new double[][][]{
            {{20}, {203}},
            {{16}, {183}},
            {{20}, {190}},
            {{18.6}, {193}},
            {{18.9}, {184}},
            {{17.2}, {199}},
            {{20}, {190}},
            {{17}, {181}},
            {{19}, {197}},
            {{16.5}, {198}},
            ...
        };

        return Nd4j.create(test);
    }

    // Training features and labels: one target value per example.
    public static DataSet getTrain() {
        double[][][] inputArray = {
            {{18.7}, {181}},
            {{17.4}, {186}},
            {{18}, {195}},
            {{19.3}, {193}},
            {{20.6}, {190}},
            {{17.8}, {181}},
            {{19.6}, {195}},
            {{18.1}, {193}},
            {{20.2}, {190}},
            {{17.1}, {186}},
            ...
        };

        double[][] outputArray = {
            {3750},
            {3800},
            {3250},
            {3450},
            {3650},
            {3625},
            {4675},
            {3475},
            {4250},
            {3300},
            ...
        };

        INDArray input = Nd4j.create(inputArray);
        INDArray labels = Nd4j.create(outputArray);

        return new DataSet(input, labels);
    }
}
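For context, my understanding is that DL4J expects RNN features in shape [miniBatchSize, nIn, timeSeriesLength], so each {{18.7}, {181}} entry above should come out as one example with 2 features and 1 time step. A quick shape check along these lines (a hypothetical snippet, not part of my original code) shows what the arrays become:

    INDArray features = getTrain().getFeatures();
    INDArray labels = getTrain().getLabels();
    System.out.println(java.util.Arrays.toString(features.shape())); // e.g. [29, 2, 1]
    System.out.println(java.util.Arrays.toString(labels.shape()));   // 2-D as written, e.g. [29, 1]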

Here’s an example of the output from that final println in main:

[[[0.4380]], 

 [[0.4380]], 

 [[0.4380]], 

 [[0.4380]], 

 [[0.4380]], 

 [[0.4380]], 

 [[0.4380]], 

 [[0.4380]], 

 [[0.4380]], 

 [[0.4380]],
 ...
]


So far I’ve tried changing the updater (previously Adam), the activation functions (previously ReLU), and the learning rate, all with similar results. The variants looked roughly like the sketch below.
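(A sketch only; Adam comes from org.nd4j.linalg.learning.config.Adam, and the exact hyperparameter values I tried varied.)

    // Updater variant: Adam instead of plain SGD.
    .updater(new Adam(learningRate))
    // Activation variant: ReLU instead of IDENTITY/SIGMOID in the hidden layers.
    .activation(Activation.RELU)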

I’ve also previously normalized the data, which didn’t change the fact that every output was the same. Because of this, I’ve left the normalization out of the code above for readability (but please let me know if you’d like to see the code and/or outputs with normalization). That normalization was roughly along the lines of the sketch below.
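(A minimal sketch assuming ND4J’s org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize; my actual code may have differed slightly. fitLabel(true) rescales the labels as well as the features.)

    NormalizerStandardize normalizer = new NormalizerStandardize();
    normalizer.fitLabel(true);       // normalize labels as well as features

    DataSet train = getTrain();
    normalizer.fit(train);           // compute mean/std from the training set
    normalizer.transform(train);     // apply in place
    network.fit(train);

    INDArray test = getTest();
    normalizer.transform(test);      // apply the same training statistics to the test features
    INDArray output = network.output(test);
    normalizer.revertLabels(output); // map predictions back to the original label scale
    System.out.println(output);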

Thank you.

@TwistedTea I’m already interacting with you on Stack Overflow. Sorry, the system flagged your post as spam for some reason. I’ve removed that flag and will link your cross-post here: java - LSTM in DL4J - All output values are the same - Stack Overflow