I’m using DL4J 1.0.0-beta6 for a Sequence Classification problem.
I’m now trying to combine an LSTM with dense layers, so far without success. It seems the labelMask doesn’t behave as expected.
Dimensions: mini-batch size 18 x 5 classes x 120 time steps
First (minimal) approach, using ‘LastTimeStep’ to project [18, 5, 120] onto [18, 5]:
.list()
.layer(0, new LastTimeStep(new LSTM.Builder()
        // config
        .build()))
.layer(1, new OutputLayer.Builder()
        // config
        .build())
.build()
However, I cannot get this to work. Output for two different SequenceRecordReaderDataSetIterator.AlignmentMode values:
EQUAL_LENGTH: “Labels and preOutput must have equal shapes: got shapes [18, 5, 1] vs [18, 5]”
Second (minimal) approach, using ‘RnnToFeedForwardPreProcessor’:
.list()
.layer(0, new LSTM.Builder()
        // config
        .build())
.layer(1, new OutputLayer.Builder()
        // config
        .build())
.inputPreProcessor(1, new RnnToFeedForwardPreProcessor())
.build()
Unfortunately, I can’t get this to work either and run into a similar exception:
EQUAL_LENGTH: “Incorrect number of arguments for permute function: got arguments [0, 2, 1] for rank 2 array. Number of arguments must equal array rank”
Any pointers are greatly appreciated. Just let me know if you need more info. Thank you so much!
PS. Just a pure input → LSTM → RnnOutputLayer network (with AlignmentMode.ALIGN_END) works fine.
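For reference, that working baseline looks roughly like this (a sketch; nIn/nOut and the remaining config are placeholders, not my exact settings):

.list()
.layer(0, new LSTM.Builder()
        .nIn(numFeatures) // placeholder: features per time step
        .nOut(100)        // placeholder: hidden size
        .build())
.layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
        .activation(Activation.SOFTMAX)
        .nOut(5)          // 5 classes
        .build())
.build()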
In this case you don’t even need a label mask; it is used to mask out time steps in a sequence output.
Since you only have a single output when using an output layer like that at the end, the mask array doesn’t make sense.
I understand, but how can I achieve that? Apologies, but I couldn’t find any examples of this and I’m rather stuck.
I even tried supplying null as AlignmentMode to the SequenceRecordReaderDataSetIterator to no avail (it simply defaults to ALIGN_START in the underlying RecordReaderMultiDataSetIterator).
It looks like the issue arises when the SequenceRecordReaderDataSetIterator creates the MultiDataSet (i.e. a mini-batch). There, it simply decides on its own whether there should be a ‘labelMask’ or not:
RecordReaderMultiDataSetIterator:615
for (List<List<Writable>> c : list) {
    if (c.size() < maxTSLength)
        needMaskArray = true;
}
Since I’m doing classification, I’ve got 120 time steps in the feature file but only one time step in the label file. Because of this, a label mask gets created in RecordReaderMultiDataSetIterator#convertFeaturesOrLabels, which is then passed on to the network. As you’re saying, this doesn’t make sense for the LastTimeStep wrapper, leading to a dimension mismatch.
Is there a way to solve this for pure sequence classification, or do I really have to provide labels for all 120 time steps?
You are right: when creating a sequence, it always creates the labels in sequence format. But there is a simple way to solve that: LabelLastTimeStepPreProcessor.
Just set this as the preprocessor on the iterator and it should take care of the problem.
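In code that’s a one-liner on the iterator (assuming the class sits in the usual ND4J preprocessor package; the exact package may differ by version):

import org.nd4j.linalg.dataset.api.preprocessor.LabelLastTimeStepPreProcessor;

// 'iter' is your SequenceRecordReaderDataSetIterator
iter.setPreProcessor(new LabelLastTimeStepPreProcessor());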
I think the trick was the combination of using a LabelLastTimeStepPreProcessor and wrapping the LSTM layer in a LastTimeStep layer, such that: Input -> LabelLastTimeStepPreProcessor -> LastTimeStep( LSTM ) -> OutputLayer.
Am I understanding it correctly that this basic setup will more or less be equivalent to Input -> LSTM -> RnnOutputLayer? That is, we are still using the full sequence, effectively unrolling the LSTM and hooking up the last step to the next (2D) layer?
Thank you for the confirmation and thank you so much for the help! This is really exciting!
All in all this is what I did, in case someone else stumbles upon this in the future:
1: Set up the data set iterator with a LabelLastTimeStepPreProcessor (this modifies the labelMask shape for use with a 2D output layer instead of an RNN output layer):
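// A sketch of that setup; reader construction, paths, and the exact
// constructor arguments here are placeholders rather than my verbatim code:
SequenceRecordReader featureReader = new CSVSequenceRecordReader();
featureReader.initialize(new NumberedFileInputSplit(featuresPath, 0, lastIdx));
SequenceRecordReader labelReader = new CSVSequenceRecordReader();
labelReader.initialize(new NumberedFileInputSplit(labelsPath, 0, lastIdx));

SequenceRecordReaderDataSetIterator trainIter = new SequenceRecordReaderDataSetIterator(
        featureReader, labelReader,
        18,    // mini-batch size
        5,     // number of classes
        false, // regression = false, i.e. classification
        SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);

// Shrinks the sequence-shaped labels (and the label mask) to the last time
// step, so they match a 2D OutputLayer instead of an RnnOutputLayer
trainIter.setPreProcessor(new LabelLastTimeStepPreProcessor());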