I tried an LSTM model to solve some NLP tasks such as POS tagging. I created a MultiDataSetIterator that yields size-one batches. The sequences have variable length. Everything worked under beta7, but after updating to SNAPSHOT I get this error:
java.lang.IllegalStateException: Sequence lengths do not match for RnnOutputLayer input and labels:Arrays should be rank 3 with shape [minibatch, size, sequenceLength] - mismatch on dimension 2 (sequence length) - input=[16, 800, 63] vs. label=[16, 56, 63]
at org.nd4j.common.base.Preconditions.throwStateEx(Preconditions.java:638)
at org.nd4j.common.base.Preconditions.checkState(Preconditions.java:337)
at org.deeplearning4j.nn.layers.recurrent.RnnOutputLayer.backpropGradient(RnnOutputLayer.java:59)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doBackward(LayerVertex.java:148)
at org.deeplearning4j.nn.graph.ComputationGraph.calcBackpropGradients(ComputationGraph.java:2772)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1381)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1341)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1165)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1115)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1082)
It says the sequence lengths do not match, but to me they look fine according to the shape info given in the exception text itself. Could there be a problem with the variable sequence lengths? I wrapped my iterator in an IteratorMultiDataSetIterator to build the batches.
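For reference, here is a minimal sketch that compares the two shapes quoted in the exception, dimension by dimension. It is plain Java with no DL4J calls; the class and method names (`RnnShapeCheck`, `mismatchedDims`) are made up for illustration, and it only assumes the `[minibatch, size, sequenceLength]` layout stated in the message. It does not reproduce what `RnnOutputLayer` actually checks internally.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class RnnShapeCheck {

    /**
     * Returns the indices of the dimensions in which two rank-3 shapes differ.
     * Layout assumed (taken from the exception text): [minibatch, size, sequenceLength].
     */
    public static int[] mismatchedDims(long[] input, long[] labels) {
        if (input.length != 3 || labels.length != 3) {
            throw new IllegalArgumentException("Both shapes must be rank 3");
        }
        return IntStream.range(0, 3)
                .filter(d -> input[d] != labels[d])
                .toArray();
    }

    public static void main(String[] args) {
        // Shapes copied from the exception message above.
        long[] input = {16, 800, 63};
        long[] labels = {16, 56, 63};

        // Prints [1]: only dimension 1 (size) differs.
        System.out.println(Arrays.toString(mismatchedDims(input, labels)));
    }
}
```

Under that layout, dimension 0 (minibatch, 16) and dimension 2 (sequence length, 63) agree, and only dimension 1 differs (800 input features vs. 56 label classes), which is normally expected for an output layer.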
@thomas Would you mind giving me a reproducer that I can just run with a feed-forward pass, so I can compare beta7 and SNAPSHOT? The intended input data in NDArray form with an .output call is fine.
That should help separate the multi-dataset iterator from troubleshooting the network itself.