How to predict with an imported Keras model

I have trained and exported a model in Python using Keras.

vocab_size = 50000
embedding_dim = 128
model = Sequential()
model.add(Embedding(vocab_size, embedding_dim, input_length=train_padded.shape[1]))

model.add(Conv1D(48, 5, activation='relu', padding='valid'))
model.add(GlobalMaxPooling1D())
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dropout(0.5))

model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
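
The export itself, after training, is just the standard Keras HDF5 save (roughly):

# after model.fit(...), export to HDF5 so DL4J can import it
model.save('transaction_classification.h5')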

Then I imported it in Kotlin with DL4J:

val modelPath = File("C:/Users/Cherr/Desktop/Development/transaction_classification.h5")
val model: MultiLayerNetwork = KerasModelImport.importKerasSequentialModelAndWeights(modelPath.absolutePath)

In Python I can predict using:

seq = tokenizer.texts_to_sequences(['transfer to new customer'])  # texts_to_sequences expects a list of texts
padded = pad_sequences(seq, maxlen=max_length, padding=padding_type, truncating=trunc_type)
pred = model.predict(padded)

I have created a padded array with the word_index of each word, but I don't know how to transform that array into an INDArray for prediction.
Any idea what I am doing wrong?
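
Roughly, this is what I am trying to do on the DL4J side (a minimal sketch; tokenToIndex is a hypothetical word-to-index map matching the Keras tokenizer's word_index, and max_length is 10 as in training):

import org.nd4j.linalg.api.buffer.DataType
import org.nd4j.linalg.api.ndarray.INDArray
import org.nd4j.linalg.factory.Nd4j

// Sketch: turn the padded word indices into an INDArray and run it through the imported model.
// tokenToIndex is a hypothetical Map<String, Int> built from the Keras tokenizer's word_index.
val maxLength = 10
val tokens = "transfer to new customer".lowercase().split(" ")
val indices = IntArray(maxLength) { i -> if (i < tokens.size) tokenToIndex[tokens[i]] ?: 0 else 0 } // post-pad with 0
val features: INDArray = Nd4j.createFromArray(*indices)   // shape [10]
    .castTo(DataType.FLOAT)
    .reshape(1L, maxLength.toLong())                       // shape [1, 10]: one sample of 10 word indices
val probabilities = model.output(features)                 // softmax scores over the 7 classes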

@Cherrio-LLC I feel like I’m missing something here. Could you tell me what tokenizer you’re using? We have examples for NLP but it depends on what your input word vectors are.
You can search around in here for word2vec and other ones: GitHub - deeplearning4j/deeplearning4j-examples: Deeplearning4j Examples (DL4J, DL4J Spark, DataVec)

Once I know what your source of word vectors is I can give you more concrete advice.

So while training I converted each unique word to an index, then fed the indexes of each sentence as a sequence of shape (1, 10) to the embedding layer, which outputs an array of each word's vectors.

I've been able to prepare the data for prediction, but I'm hitting this error:

Exception in thread "main" org.nd4j.linalg.exception.ND4JIllegalStateException: New shape length doesn't match original length: [288] vs [48]. Original shape: [1, 48] New Shape: [1, 288]
	at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3804)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3749)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3872)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:4099)
	at org.deeplearning4j.preprocessors.KerasFlattenRnnPreprocessor.preProcess(KerasFlattenRnnPreprocessor.java:49)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.outputOfLayerDetached(MultiLayerNetwork.java:1299)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2467)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2430)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2421)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2408)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.predict(MultiLayerNetwork.java:2270)

So I had to print out my imported model to see what I missed:

===============================================================================================
LayerName (LayerType)                          nIn,nOut    TotalParams   ParamsShape           
===============================================================================================
embedding_12 (EmbeddingSequenceLayer)          50000,128   6,400,000     W:{50000,128}         
conv1d_12 (Convolution1DLayer)                 128,48      30,768        b:{48}, W:{48,128,5,1}
global_max_pooling1d_12 (GlobalPoolingLayer)   -,-         0             -                     
dropout_24 (DropoutLayer)                      -,-         0             -                     
dropout_25 (DropoutLayer)                      -,-         0             -                     
dense_12 (DenseLayer)                          288,7       2,023         W:{288,7}, b:{7}      
dense_12_loss (LossLayer)                      -,-         0             -                     
-----------------------------------------------------------------------------------------------
            Total Parameters:  6,432,791
        Trainable Parameters:  6,432,791
           Frozen Parameters:  0
===============================================================================================

There are a lot of differences from the model I trained in Python:

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_12 (Embedding)    (None, 10, 128)           6400000   
                                                                 
 conv1d_12 (Conv1D)          (None, 6, 48)             30768     
                                                                 
 global_max_pooling1d_12 (Gl  (None, 48)               0         
 obalMaxPooling1D)                                               
                                                                 
 dropout_24 (Dropout)        (None, 48)                0         
                                                                 
 flatten_12 (Flatten)        (None, 48)                0         
                                                                 
 dropout_25 (Dropout)        (None, 48)                0         
                                                                 
 dense_12 (Dense)            (None, 7)                 343       
                                                                 
=================================================================
Total params: 6,431,111
Trainable params: 6,431,111
Non-trainable params: 0

The model configuration:

model = Sequential()
model.add(Embedding(vocab_size, embedding_dim, input_length=train_padded.shape[1]))

model.add(Conv1D(48, 5, activation='relu', padding='valid'))
model.add(GlobalMaxPooling1D())
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

So I noticed the Flatten layer is missing from the DL4J imported model's summary. Am I doing anything wrong?

@Cherrio-LLC No, Flatten isn't a "layer" in this case. If you're ever curious about how something works, the library is open source :slight_smile: so you can look at what it's doing. You can see what we do here:

DL4J's API uses preprocessors for certain kinds of layers, so you're fine.
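
If you want to confirm that yourself, you can inspect which preprocessors got attached during import, roughly like this (a sketch, assuming model is your imported MultiLayerNetwork):

// Sketch: list the input preprocessors DL4J attached during Keras import.
// These take the place of Keras' Flatten/reshape steps between layers.
val conf = model.layerWiseConfigurations
for (i in model.layers.indices) {
    val pre = conf.getInputPreProcess(i)
    if (pre != null) println("layer $i -> ${pre.javaClass.simpleName}")
}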

Could you show an end-to-end example of where you attempt to perform inference? Please also include some dummy data that represents the shape of your input. This part is important because I need to see what your interpretation of the inputs/outputs is first.

@Cherrio-LLC your Stack Overflow post gave me what I needed. This was actually a bug and it has been fixed: Fixes Global Pooling collapse dimension cases with rnns in keras import. by agibsonccc · Pull Request #9769 · deeplearning4j/deeplearning4j · GitHub

Snapshots will go up soon so you can use this. Thanks!

Okay. Thanks. How will I know when it’s up??

What is the snapshot version? I saw 1.0.0-SNAPSHOT, which I don't believe is the latest.

@Cherrio-LLC it actually is. We are working towards 1.0 and are currently on milestone releases. Please do let me know if anything else comes up.

@agibsonccc The latest snapshot build is from 09/09/2022, is that correct?

I added the snapshot:

implementation "org.deeplearning4j:deeplearning4j-core:1.0.0-SNAPSHOT"

So when I run the code, I get the error:

Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.deeplearning4j.nn.modelimport.keras.Hdf5Archive.readDataSet(Hdf5Archive.java:295)
	at org.deeplearning4j.nn.modelimport.keras.Hdf5Archive.readDataSet(Hdf5Archive.java:109)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelUtils.importWeights(KerasModelUtils.java:335)
	at org.deeplearning4j.nn.modelimport.keras.KerasSequentialModel.<init>(KerasSequentialModel.java:154)
	at org.deeplearning4j.nn.modelimport.keras.KerasSequentialModel.<init>(KerasSequentialModel.java:55)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelBuilder.buildSequential(KerasModelBuilder.java:326)
	at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasSequentialModelAndWeights(KerasModelImport.java:218)
	at utils.nn.TestNNKt.main(TestNN.kt:449)
	at utils.nn.TestNNKt.main(TestNN.kt)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
	at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5208)
	at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5073)
	at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:299)
	... 9 more
Caused by: java.lang.NullPointerException
	at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5164)
	... 11 more

I think the snapshot is not up yet.

@agibsonccc is the latest snapshot build up yet?

@Cherrio-LLC here's the latest build of the relevant code: .github/workflows/build-deploy-cross-platform.yml · deeplearning4j/deeplearning4j@028559b · GitHub. Could you try again?

@agibsonccc I've been getting the following error since I switched to the snapshot:

Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.deeplearning4j.nn.modelimport.keras.Hdf5Archive.readDataSet(Hdf5Archive.java:295)
	at org.deeplearning4j.nn.modelimport.keras.Hdf5Archive.readDataSet(Hdf5Archive.java:109)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelUtils.importWeights(KerasModelUtils.java:335)
	at org.deeplearning4j.nn.modelimport.keras.KerasSequentialModel.<init>(KerasSequentialModel.java:154)
	at org.deeplearning4j.nn.modelimport.keras.KerasSequentialModel.<init>(KerasSequentialModel.java:55)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelBuilder.buildSequential(KerasModelBuilder.java:326)
	at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasSequentialModelAndWeights(KerasModelImport.java:218)

@Cherrio-LLC hmm…could you post the full stack trace? There should be something at the bottom that reveals the actual problem.

@agibsonccc this is the full stack trace:

Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.deeplearning4j.nn.modelimport.keras.Hdf5Archive.readDataSet(Hdf5Archive.java:295)
	at org.deeplearning4j.nn.modelimport.keras.Hdf5Archive.readDataSet(Hdf5Archive.java:109)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelUtils.importWeights(KerasModelUtils.java:335)
	at org.deeplearning4j.nn.modelimport.keras.KerasSequentialModel.<init>(KerasSequentialModel.java:154)
	at org.deeplearning4j.nn.modelimport.keras.KerasSequentialModel.<init>(KerasSequentialModel.java:55)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelBuilder.buildSequential(KerasModelBuilder.java:326)
	at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasSequentialModelAndWeights(KerasModelImport.java:218)
	at utils.nn.TestNNKt.main(TestNN.kt:449)
	at utils.nn.TestNNKt.main(TestNN.kt)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
	at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5208)
	at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5073)
	at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:299)
	... 9 more
Caused by: java.lang.NullPointerException
	at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5164)
	... 11 more

@agibsonccc I noticed that if I downgrade from the snapshot version, it doesn't throw that error.

@agibsonccc Any update on the error??

@Cherrio-LLC sorry, I wasn't able to prioritize this because it's hard to act on. I'm not really sure what your specific local issue is. This feels like a fairly standard snapshots repository issue specific to your system. Could you show how you have snapshots configured? Maybe you're hooking into the old repository.

It's been like this for a few releases now, but we migrated from oss.sonatype.org to the newer s01.oss.sonatype.org. Could you clarify that you're even getting snapshots from the right location first?
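
For reference, a snapshots setup in Gradle against the new host looks roughly like this (a sketch; adjust to your build script):

repositories {
    mavenCentral()
    maven {
        // new Sonatype snapshots host (not the old oss.sonatype.org)
        url 'https://s01.oss.sonatype.org/content/repositories/snapshots/'
    }
}
configurations.all {
    // make Gradle re-check changing SNAPSHOT artifacts instead of reusing a stale cache
    resolutionStrategy.cacheChangingModulesFor 0, 'seconds'
}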

BLASDelegator is a few weeks old now and shouldn't have any issues. The only thing I can do is try kicking the snapshots build off again and seeing what that does. I've done that just now and hope that resolves it.

Sorry, this is just a hard-to-reproduce/troubleshoot issue, since I don't know enough about your local system to know why certain modules are missing.

You can always clone the source and just run mvn clean install -rf :nd4j, which is a pure Java-only build not involving any C++ bits, so that shouldn't be hard to deal with either.

Thank you for your help so far. I'm trying to update the backend to the snapshot version, but I'm getting the following error. I'm using Gradle, by the way.

Could not find nd4j-native-1.0.0-SNAPSHOT-windows-x86_64.jar (org.nd4j:nd4j-native:1.0.0-SNAPSHOT:20220916.020818-2347).
Searched in the following locations:
    https://s01.oss.sonatype.org/content/repositories/snapshots/org/nd4j/nd4j-native/1.0.0-SNAPSHOT/nd4j-native-1.0.0-20220916.020818-2347-windows-x86_64.jar

My Gradle config:

    implementation 'org.deeplearning4j:deeplearning4j-nn:1.0.0-SNAPSHOT'
    implementation 'org.nd4j:nd4j-native:1.0.0-SNAPSHOT'
    implementation 'org.nd4j:nd4j-native:1.0.0-SNAPSHOT:windows-x86_64'