Imported Keras LSTM layer mismatch

The imported weights only look mismatched because Keras and DL4J use a slightly different memory layout for their LSTM weights.

As you know, LSTMs are a bit more complicated than just y_t = h(W * x_t + RW * y_(t-1) + b) (which would be a SimpleRNN), and therefore they have more logical weights than just W and RW.

However, both Keras and DL4J pack those additional per-gate weights into those same two matrices, one block of columns per gate.
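
You can see the packing directly on the Keras side: the kernel and recurrent kernel each have 4 * units columns, one block per gate. A minimal sketch (assumes TensorFlow 2.x Keras; the shapes in the comments are just for this toy configuration):

```python
# Minimal sketch (assumes TensorFlow 2.x Keras): the four gates share one
# input-weight matrix, one recurrent-weight matrix and one bias vector.
import numpy as np
import tensorflow as tf

units, n_in = 8, 5
lstm = tf.keras.layers.LSTM(units)
_ = lstm(np.zeros((1, 3, n_in), dtype="float32"))  # call once so the weights get built

kernel, recurrent_kernel, bias = lstm.get_weights()
print(kernel.shape)            # (5, 32)  -> W:  n_in x 4*units, gate blocks side by side
print(recurrent_kernel.shape)  # (8, 32)  -> RW: units x 4*units
print(bias.shape)              # (32,)    -> b:  4*units
```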

In Keras the gate order is i, f, c, o, while in DL4J the order is c, f, o, i.

So that’s why there is a difference in the outputs.
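
If you want to line the two weight sets up yourself, you only have to permute the gate blocks. A minimal sketch (the helper name is mine, the gate orders are the ones described above, and it assumes the gate order is the only difference):

```python
# Hypothetical helper: re-order packed Keras LSTM gate blocks (i, f, c, o)
# into DL4J's layout (c, f, o, i) so the raw arrays can be compared.
import numpy as np

def keras_to_dl4j_order(arr):
    """Works for the kernel (n_in, 4*units), the recurrent kernel
    (units, 4*units) and the bias (4*units,), since the four equally
    sized gate blocks always sit on the last axis."""
    i, f, c, o = np.split(arr, 4, axis=-1)
    return np.concatenate([c, f, o, i], axis=-1)

# Toy example: 3 inputs, 2 units -> kernel is (3, 8)
kernel = np.arange(3 * 8).reshape(3, 8)
print(keras_to_dl4j_order(kernel))  # columns now grouped as c, f, o, i
```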