BertIterator for own model to train

thomas · October 9, 2021, 9:30am

Hey,
I tried to build my own computation graph and train it with a BertIterator. My BertIterator setup currently looks like this:

BertIterator iter = new BertIterator.Builder()
    .tokenizer(tokenizerFactory)
    .lengthHandling(LengthHandling.FIXED_LENGTH, sentlen)
    .minibatchSize(10)
    .sentenceProvider(provider)
    .task(Task.UNSUPERVISED)
    .vocabMap(tokenizerFactory.getVocab())
    .featureArrays(BertIterator.FeatureArrays.INDICES_MASK)
    .masker(new BertMaskedLMMasker(new Random(12345), 0.2, 0.5, 0.5))
    .unsupervisedLabelFormat(BertIterator.UnsupervisedLabelFormat.RANK3_NCL)
    .maskToken("[MASK]")
    .build();

I have an EmbeddingSequenceLayer as the input layer:

new EmbeddingSequenceLayer.Builder().nIn(vocabsize).nOut(756).build();

and an RnnOutputLayer as my output:

new RnnOutputLayer.Builder().nOut(vocabsize).dataFormat(RNNFormat.NCW).activation(Activation.SOFTMAX).build()

When I try to fit some text examples, I get the following error:

Exception in thread "main" java.lang.IllegalArgumentException: Op.X must have same data type as Op.Y: X.datatype=FLOAT, Y.datatype=INT
    at org.nd4j.common.base.Preconditions.throwEx(Preconditions.java:633)
    at org.nd4j.common.base.Preconditions.checkArgument(Preconditions.java:134)
    at org.nd4j.linalg.api.ops.BaseBroadcastOp.validateDataTypes(BaseBroadcastOp.java:200)
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:889)
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:879)
    at org.nd4j.linalg.factory.Broadcast.mul(Broadcast.java:149)
    at org.deeplearning4j.nn.layers.feedforward.embedding.EmbeddingSequenceLayer.backpropGradient(EmbeddingSequenceLayer.java:64)
    at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doBackward(LayerVertex.java:148)
    at org.deeplearning4j.nn.graph.ComputationGraph.calcBackpropGradients(ComputationGraph.java:2772)

I tried different settings for the RNNFormat and the DataType parameter of the embedding layer (set to int and to float), but nothing has changed so far. Can anybody give me a hint as to what I am doing wrong?

Best regards

Thomas

treo · October 11, 2021, 12:57pm

This looks like a bug.

Can you file an issue along with a minimal reproducer (we don’t need your actual data, or model, just something that will consistently trigger the bug)?
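For readers following the trace: the exception is raised by a data-type check that runs before a broadcast multiply (`BaseBroadcastOp.validateDataTypes`, reached from `Broadcast.mul`). The sketch below is a toy model of that kind of check in plain Java, without ND4J. `TypedArray`, `DType` and `MaskDtypeDemo` are hypothetical names made up for illustration, and the assumption that the INT operand is the feature mask coming from the BertIterator is a guess from the trace, not something confirmed in this thread.

```java
import java.util.Arrays;

// Toy stand-in for a typed array. This is NOT the ND4J INDArray API;
// every name in this sketch is hypothetical.
final class TypedArray {
    enum DType { FLOAT, INT }

    final DType dtype;
    final double[] values;

    TypedArray(DType dtype, double... values) {
        this.dtype = dtype;
        this.values = values.clone();
    }

    // Returns a copy tagged with the target type (truncating toward zero for INT).
    TypedArray castTo(DType target) {
        double[] out = values.clone();
        if (target == DType.INT) {
            for (int i = 0; i < out.length; i++) {
                out[i] = (long) out[i];
            }
        }
        return new TypedArray(target, out);
    }

    // Element-wise multiply that rejects mixed data types,
    // mirroring the kind of validation that fails in the stack trace.
    TypedArray mul(TypedArray other) {
        if (dtype != other.dtype) {
            throw new IllegalArgumentException(
                "Op.X must have same data type as Op.Y: X.datatype=" + dtype
                    + ", Y.datatype=" + other.dtype);
        }
        if (values.length != other.values.length) {
            throw new IllegalArgumentException("Length mismatch");
        }
        double[] out = new double[values.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = values[i] * other.values[i];
        }
        return new TypedArray(dtype, out);
    }
}

class MaskDtypeDemo {
    public static void main(String[] args) {
        // FLOAT gradients and an INT mask (assumed roles, see the note above).
        TypedArray grad = new TypedArray(TypedArray.DType.FLOAT, 0.5, 1.5, 2.0);
        TypedArray mask = new TypedArray(TypedArray.DType.INT, 1, 0, 1);

        try {
            grad.mul(mask); // mixed types: rejected by the check
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }

        // Casting the mask to the gradient's type lets the multiply go through.
        TypedArray masked = grad.mul(mask.castTo(TypedArray.DType.FLOAT));
        System.out.println(Arrays.toString(masked.values));
    }
}
```

In ND4J itself the analogous step would be casting the mask with `INDArray.castTo(DataType.FLOAT)` before it reaches the layer. Whether that actually sidesteps this bug has not been tested here.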

thomas · October 11, 2021, 2:09pm

Thanks for your reply. I opened a new issue:

github.com/eclipse/deeplearning4j

EmbeddingSequenceLayer not working with BertIterator

opened 02:08PM - 11 Oct 21 UTC

closed 10:21PM - 18 Oct 21 UTC

thomas-trendsoft

#### Issue Description

I am trying to build an EmbeddingSequenceLayer model and train it with a BertIterator. My expectation was that the model would fit. The current behaviour is a DataType exception; see Additional Information below.

#### Version Information

Please indicate relevant versions, including, if relevant:

* Deeplearning4j version = M1.1
* Platform information = MacOS
* CUDA = no

#### Additional Information

Code:

    public static void main(String[] args) throws IOException, InterruptedException {

        int sentlen = 100;

        LabeledSentenceProvider provider;
        provider = ... some provider ...

        BertWordPieceTokenizerFactory tokenizerFactory = new BertWordPieceTokenizerFactory(new File("./vocab.txt"), false, true, Charsets.UTF_8);

        BertIterator iter = new BertIterator.Builder()
            .tokenizer(tokenizerFactory)
            .lengthHandling(LengthHandling.FIXED_LENGTH, sentlen)
            .minibatchSize(10)
            .sentenceProvider(provider)
            .task(Task.UNSUPERVISED)
            .vocabMap(tokenizerFactory.getVocab())
            .featureArrays(BertIterator.FeatureArrays.INDICES_MASK)
            .masker(new BertMaskedLMMasker(new Random(12345), 0.2, 0.5, 0.5))
            .unsupervisedLabelFormat(BertIterator.UnsupervisedLabelFormat.RANK3_NCL)
            .maskToken("[MASK]")
            .build();

        int vocabsize = tokenizerFactory.getVocab().size();

        HashMap<String,InputPreProcessor> preproc = new HashMap<>();
        preproc.put("output", new RnnToFeedForwardPreProcessor(RNNFormat.NCW));

        ComputationGraphConfiguration.GraphBuilder builder = new NeuralNetConfiguration.Builder()
            .seed(42345)
            .l2(0.0001)
            .weightInit(WeightInit.XAVIER)
            .updater(new Adam(0.0015))
            .graphBuilder();

        builder.setInputTypes(InputType.recurrent(vocabsize, sentlen, RNNFormat.NCW));
        builder.addInputs("token");

        System.out.println("VOCAB size: " + vocabsize);

        // embedding tokens layer
        builder.addLayer("emb", new EmbeddingSequenceLayer.Builder()
            .nIn(vocabsize)
            .nOut(756)
            .build(), "token");

        // try a single transfer block first
        // attention multi head
        builder.addLayer("attention1", new SelfAttentionLayer.Builder()
            .nIn(756)
            .nOut(756)
            .nHeads(2)
            .projectInput(true)
            .build(), "emb");

        // feed forward
        builder.addLayer("ffint1", new LSTM.Builder()
            .nOut(756)
            .build(), "attention1");

        // make it output
        builder.addLayer("output", new RnnOutputLayer.Builder()
            .nOut(vocabsize)
            .dataFormat(RNNFormat.NCW)
            .activation(Activation.SOFTMAX)
            .build(), "ffint1");

        builder.setOutputs("output");

        //ComputationGraphConfiguration
        ComputationGraph model = new ComputationGraph(builder.build());

        model.fit(iter);
    }

StackTrace:

    Exception in thread "main" java.lang.IllegalArgumentException: Op.X must have same data type as Op.Y: X.datatype=FLOAT, Y.datatype=INT
        at org.nd4j.common.base.Preconditions.throwEx(Preconditions.java:633)
        at org.nd4j.common.base.Preconditions.checkArgument(Preconditions.java:134)
        at org.nd4j.linalg.api.ops.BaseBroadcastOp.validateDataTypes(BaseBroadcastOp.java:200)
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:889)
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:879)
        at org.nd4j.linalg.factory.Broadcast.mul(Broadcast.java:149)
        at org.deeplearning4j.nn.layers.feedforward.embedding.EmbeddingSequenceLayer.backpropGradient(EmbeddingSequenceLayer.java:64)
        at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doBackward(LayerVertex.java:148)
        at org.deeplearning4j.nn.graph.ComputationGraph.calcBackpropGradients(ComputationGraph.java:2772)
        at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1381)
        at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1341)
        at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174)
        at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61)
        at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
        at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1165)
        at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1115)
        at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1082)

#### Contributing

Sorry, no fix, only a big thanks for your great work.
