Hey,
I tried to build my own ComputationGraph and train it with a BertIterator. My BertIterator setup currently looks like this:
BertIterator iter = new BertIterator.Builder()
    .tokenizer(tokenizerFactory)
    .lengthHandling(LengthHandling.FIXED_LENGTH, sentlen)
    .minibatchSize(10)
    .sentenceProvider(provider)
    .task(Task.UNSUPERVISED)
    .vocabMap(tokenizerFactory.getVocab())
    .featureArrays(BertIterator.FeatureArrays.INDICES_MASK)
    .masker(new BertMaskedLMMasker(new Random(12345), 0.2, 0.5, 0.5))
    .unsupervisedLabelFormat(BertIterator.UnsupervisedLabelFormat.RANK3_NCL)
    .maskToken("[MASK]")
    .build();
I use an EmbeddingSequenceLayer as the input layer:
new EmbeddingSequenceLayer.Builder()
    .nIn(vocabsize)
    .nOut(756)
    .build();
and an RnnOutputLayer as the output layer:
new RnnOutputLayer.Builder()
    .nOut(vocabsize)
    .dataFormat(RNNFormat.NCW)
    .activation(Activation.SOFTMAX)
    .build()
When I try to fit the network on some text examples, I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: Op.X must have same data type as Op.Y: X.datatype=FLOAT, Y.datatype=INT
    at org.nd4j.common.base.Preconditions.throwEx(Preconditions.java:633)
    at org.nd4j.common.base.Preconditions.checkArgument(Preconditions.java:134)
    at org.nd4j.linalg.api.ops.BaseBroadcastOp.validateDataTypes(BaseBroadcastOp.java:200)
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:889)
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:879)
    at org.nd4j.linalg.factory.Broadcast.mul(Broadcast.java:149)
    at org.deeplearning4j.nn.layers.feedforward.embedding.EmbeddingSequenceLayer.backpropGradient(EmbeddingSequenceLayer.java:64)
    at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doBackward(LayerVertex.java:148)
    at org.deeplearning4j.nn.graph.ComputationGraph.calcBackpropGradients(ComputationGraph.java:2772)
I tried different values for the RNNFormat and DataType parameters of the embedding layer (setting the data type to both INT and FLOAT), but nothing has changed so far. Can anybody give me a hint about what I am doing wrong?
Can you file an issue along with a minimal reproducer? We don’t need your actual data or model, just something that will consistently trigger the bug.