It only looks weird because Keras and DL4J use a slightly different memory layout for their LSTM weights.
As you know, LSTMs are a bit more complicated than just y_t = h(W*x_t + RW*y_(t-1) + b) (which is a SimpleRNN), and therefore they have more logical weights than just W and RW.
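Spelled out in the same notation, the standard LSTM update has four gate blocks (i = input, f = forget, c = cell candidate, o = output), each with its own slice of W, RW and b:

```
i_t = sigmoid(W_i*x_t + RW_i*y_(t-1) + b_i)   # input gate
f_t = sigmoid(W_f*x_t + RW_f*y_(t-1) + b_f)   # forget gate
g_t = tanh(W_c*x_t + RW_c*y_(t-1) + b_c)      # cell candidate
o_t = sigmoid(W_o*x_t + RW_o*y_(t-1) + b_o)   # output gate
c_t = f_t ⊙ c_(t-1) + i_t ⊙ g_t               # cell state (⊙ = element-wise)
y_t = o_t ⊙ tanh(c_t)                         # output
```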
However, both Keras and DL4J pack those per-gate weights into the same two matrices.
In Keras, the gate order is i, f, c, o:
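So the packed kernel you get from layer.get_weights()[0] has shape (input_dim, 4*units) and slices up like this (just a numpy sketch; the variable names are mine):

```python
import numpy as np

units, input_dim = 32, 8                       # sizes just for illustration
kernel = np.random.rand(input_dim, 4 * units)  # stand-in for layer.get_weights()[0]

# Keras concatenates the gate blocks along the last axis in the order i, f, c, o:
W_i = kernel[:, 0*units : 1*units]  # input gate
W_f = kernel[:, 1*units : 2*units]  # forget gate
W_c = kernel[:, 2*units : 3*units]  # cell candidate
W_o = kernel[:, 3*units : 4*units]  # output gate
```

The recurrent kernel and the bias are packed the same way.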
In DL4J, the gate order is c, f, o, i:
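Assuming the same kind of (n_in, 4*n_out) packing along the last axis, DL4J's blocks come out in a different order (again just a numpy sketch, not DL4J's actual API):

```python
import numpy as np

n_in, n_out = 8, 32                  # sizes just for illustration
W = np.random.rand(n_in, 4 * n_out)  # stand-in for DL4J's packed input weights

# Same packed layout, but the gate blocks are ordered c, f, o, i:
W_c = W[:, 0*n_out : 1*n_out]  # cell candidate
W_f = W[:, 1*n_out : 2*n_out]  # forget gate
W_o = W[:, 2*n_out : 3*n_out]  # output gate
W_i = W[:, 3*n_out : 4*n_out]  # input gate
```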
So that’s why you see a difference in the outputs: the same logical weights end up at different offsets in the packed matrices.
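If you want the two to match (e.g. when porting weights by hand), you can simply permute the four blocks. A minimal sketch, assuming the gates are packed along the last axis as above (reorder_gates is my own helper name, not part of either framework):

```python
import numpy as np

def reorder_gates(packed, units, src=("i", "f", "c", "o"), dst=("c", "f", "o", "i")):
    """Permute the gate blocks of a packed LSTM weight matrix or bias vector.

    `packed` has shape (..., 4*units), with the gate blocks concatenated
    along the last axis in `src` order; returns the same data in `dst` order.
    """
    blocks = {g: packed[..., k*units:(k+1)*units] for k, g in enumerate(src)}
    return np.concatenate([blocks[g] for g in dst], axis=-1)

# Works for W, RW and b alike, since all three are packed the same way:
units = 32
keras_bias = np.random.rand(4 * units)
dl4j_order_bias = reorder_gates(keras_bias, units)  # now ordered c, f, o, i
```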