While training, I converted each unique word to an index and fed the indexes of each sentence, as a sequence of shape (1, 10), to the embedding layer, which outputs an array of each word's vector.
I've prepared the data for prediction the same way, but when I call predict on the imported model I'm hit with this error:
Exception in thread "main" org.nd4j.linalg.exception.ND4JIllegalStateException: New shape length doesn't match original length: [288] vs [48]. Original shape: [1, 48] New Shape: [1, 288]
at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3804)
at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3749)
at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3872)
at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:4099)
at org.deeplearning4j.preprocessors.KerasFlattenRnnPreprocessor.preProcess(KerasFlattenRnnPreprocessor.java:49)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.outputOfLayerDetached(MultiLayerNetwork.java:1299)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2467)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2430)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2421)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2408)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.predict(MultiLayerNetwork.java:2270)
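For context, this is roughly how I'm loading the model and running prediction (a minimal sketch — the file name, word-index values, and variable names are placeholders, not my exact code):

import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class PredictExample {
    public static void main(String[] args) throws Exception {
        // Import the trained Keras model (path is a placeholder)
        MultiLayerNetwork model =
                KerasModelImport.importKerasSequentialModelAndWeights("model.h5");

        // Word indexes for one sentence, padded to length 10
        // to match the input_length used in training (values made up here)
        int[] indexes = {12, 845, 3, 991, 7, 0, 0, 0, 0, 0};

        // Shape [1, 10]: one example, ten time steps
        INDArray features = Nd4j.createFromArray(indexes)
                .castTo(DataType.FLOAT)
                .reshape(1, 10);

        // This is the call that throws the reshape exception above
        int[] predictedClasses = model.predict(features);
        System.out.println(predictedClasses[0]);
    }
}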
So I printed the imported model's summary (via model.summary()) to see what I missed:
===============================================================================================
LayerName (LayerType) nIn,nOut TotalParams ParamsShape
===============================================================================================
embedding_12 (EmbeddingSequenceLayer) 50000,128 6,400,000 W:{50000,128}
conv1d_12 (Convolution1DLayer) 128,48 30,768 b:{48}, W:{48,128,5,1}
global_max_pooling1d_12 (GlobalPoolingLayer) -,- 0 -
dropout_24 (DropoutLayer) -,- 0 -
dropout_25 (DropoutLayer) -,- 0 -
dense_12 (DenseLayer) 288,7 2,023 W:{288,7}, b:{7}
dense_12_loss (LossLayer) -,- 0 -
-----------------------------------------------------------------------------------------------
Total Parameters: 6,432,791
Trainable Parameters: 6,432,791
Frozen Parameters: 0
===============================================================================================
There are a lot of differences compared to the model I trained in Python:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_12 (Embedding) (None, 10, 128) 6400000
conv1d_12 (Conv1D) (None, 6, 48) 30768
global_max_pooling1d_12 (GlobalMaxPooling1D) (None, 48) 0
dropout_24 (Dropout) (None, 48) 0
flatten_12 (Flatten) (None, 48) 0
dropout_25 (Dropout) (None, 48) 0
dense_12 (Dense) (None, 7) 343
=================================================================
Total params: 6,431,111
Trainable params: 6,431,111
Non-trainable params: 0
The model configuration:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dropout, Flatten, Dense

# vocab_size = 50000, embedding_dim = 128, train_padded.shape[1] = 10 (see the summaries above)
model = Sequential()
model.add(Embedding(vocab_size, embedding_dim, input_length=train_padded.shape[1]))
model.add(Conv1D(48, 5, activation='relu', padding='valid'))
model.add(GlobalMaxPooling1D())
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
So I noticed the Flatten layer is missing from the summary of the DL4J-imported model, and its dense_12 layer expects nIn = 288 (48 × 6, i.e. the Conv1D output flattened across all 6 time steps) instead of 48, which matches the [288] vs [48] mismatch in the exception. Is there anything I'm doing wrong?