Cannot import models from Keras

Hi. I’m trying to import a model from Keras. The model is from spleeter, and I’ve exported it to a JSON config file and a .h5 file with all the weights.
However, I ran into this problem:

/usr/lib/jvm/java-11-openjdk-amd64/bin/java -SNAPSHOT/nd4j-parameter-server-client-1.0.0-20200511.103637-6451.jar Hello
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
Exception in thread "main" org.deeplearning4j.nn.conf.inputs.InvalidInputTypeException: Invalid input: MergeVertex cannot merge CNN activations of different width/heights:first [channels,width,height] = [256,256,128], input 1 = [256,512,256]
	at org.deeplearning4j.nn.conf.graph.MergeVertex.getOutputType(MergeVertex.java:208)
	at org.deeplearning4j.nn.modelimport.keras.layers.core.KerasMerge.getOutputType(KerasMerge.java:153)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.inferOutputTypes(KerasModel.java:304)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.<init>(KerasModel.java:179)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.<init>(KerasModel.java:96)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelBuilder.buildModel(KerasModelBuilder.java:307)
	at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasModelConfiguration(KerasModelImport.java:319)
	at Hello.main(Hello.java:62)

The model is exported from Keras with the TensorFlow backend, and it can be imported back into Keras with Python. Here is the JSON file of the model. Could you please help me with this problem? Why can’t I import the model?

What DL4J version have you tried using? Also, please try snapshots, as they have a much improved Keras import.

How to configure snapshots

Yes, I’m using the SNAPSHOT now. Also, I changed the Keras backend to Theano instead of TensorFlow. And now I get this exception:

----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
----TENSORFLOW
Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidConfigException: Invalid configuration for layer (idx=-1, name=conv2d_1, type=ConvolutionLayer) for width dimension:  Invalid input configuration for kernel width. Require 0 < kW <= inWidth + 2*padW; got (kW=5, inWidth=2, padW=0)
Input type = InputTypeConvolutional(h=1024,w=2,c=512,NCHW), kernel = [5, 5], strides = [2, 2], padding = [0, 0], layer size (output channels) = 16, convolution mode = Same
	at org.deeplearning4j.nn.conf.layers.InputTypeUtil.getOutputTypeCnnLayers(InputTypeUtil.java:394)
	at org.deeplearning4j.nn.conf.layers.ConvolutionLayer.getOutputType(ConvolutionLayer.java:194)
	at org.deeplearning4j.nn.modelimport.keras.layers.convolutional.KerasConvolution2D.getOutputType(KerasConvolution2D.java:148)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.inferOutputTypes(KerasModel.java:304)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.<init>(KerasModel.java:179)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.<init>(KerasModel.java:96)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelBuilder.buildModel(KerasModelBuilder.java:307)
	at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasModelConfiguration(KerasModelImport.java:319)
	at Hello.main(Hello.java:62)

The model is exported by Keras, and I’ve also changed tf.keras.layers to keras.layers. I think nothing here is related to TensorFlow any more, but the Java console first prints the “TENSORFLOW” lines, followed by the exception. You can see my model code here.
There’s nothing wrong with the model, since it is from spleeter.
What’s more, the Keras version is 2.2.4, and the Theano version is 1.0.4.

You should not need to do that with snapshot versions. They are supposed to work fine with tf.keras.

Edit:

You are not printing that yourself? Can you share the code you are using to load the model?

Just the same as the code in the beginner guide. I did nothing except:
ComputationGraph computationGraph = KerasModelImport.importKerasModelAndWeights("kerasModels/spleeter.h5",false);

or
ComputationGraphConfiguration configuration = KerasModelImport.importKerasModelConfiguration("kerasModels/model_config.json",false);

There’s nothing else. The nd4j, dl4j, and datavec versions:

<nd4j.version>1.0.0-SNAPSHOT</nd4j.version>
<dl4j.version>1.0.0-SNAPSHOT</dl4j.version>
<datavec.version>1.0.0-SNAPSHOT</datavec.version>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>

Have you also added the snapshot repository and imported those changes into your IDE? If you are using the newest version of IntelliJ, it doesn’t reimport changes to the pom.xml automatically anymore.

Yes, I’ve applied all these changes. I will create a new Maven-based project to test this, and I’ll post the result here later.

So, 2 issues here:
(a) printing “----TENSORFLOW” on import
(b) exception

The printing was just some debug code that managed to slip through to master recently - has been removed here: Keras import - remove debug lines (println/log) [WIP] by AlexDBlack · Pull Request #459 · KonduitAI/deeplearning4j · GitHub
Thanks for flagging that.

As for the exception: It’s a mismatch between NCHW and NHWC format - i.e., the network thinks the net is in NCHW format when it’s actually in NHWC format. Not sure yet what the cause is, but I’ll take a look.
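To make the format mismatch concrete, here is a minimal pure-Python sketch (no DL4J or Keras involved). It assumes the model input has shape (512, 1024, 2) without the batch dimension, which is inferred from the numbers in the exception message rather than confirmed from the model file; the helper names interpret and kernel_width_ok are hypothetical. It reads the same shape under both layouts and applies the kernel-width check quoted in the exception (0 < kW <= inWidth + 2*padW).

```python
def interpret(shape, data_format):
    """Map a 3-D activation shape (no batch dimension) to channels/height/width."""
    if data_format == "NCHW":
        c, h, w = shape
    elif data_format == "NHWC":
        h, w, c = shape
    else:
        raise ValueError("unknown data format: " + data_format)
    return {"c": c, "h": h, "w": w}


def kernel_width_ok(kernel_w, in_w, pad_w=0):
    """Condition from the exception text: 0 < kW <= inWidth + 2*padW."""
    return 0 < kernel_w <= in_w + 2 * pad_w


# Assumed input shape, inferred from "h=1024,w=2,c=512,NCHW" in the error.
shape = (512, 1024, 2)

as_nchw = interpret(shape, "NCHW")  # how the import read the shape
as_nhwc = interpret(shape, "NHWC")  # how a channels-last model would mean it

# A 5x5 kernel does not fit a width of 2, but does fit a width of 1024.
print(as_nchw, kernel_width_ok(5, as_nchw["w"]))
print(as_nhwc, kernel_width_ok(5, as_nhwc["w"]))
```

Under the NCHW reading the width is 2, so the 5-wide kernel fails the check, which matches the reported exception; under the NHWC reading the width is 1024 and the check passes.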

Thanks for your reply. In fact, I’m totally unfamiliar with the details of the network, and also unfamiliar with AI. My goal is just to deploy the network to do some work. Based on the DL4J tutorials and the layers used in my Keras model, I think it should be OK to deploy it in a Java environment via dl4j and nd4j. If you need information about the model for debugging, I’m glad to post it here; it’s completely open source. Thanks.

OK, so the problem here turned out to be these reused fields:

conv_activation_layer = LeakyReLU(0.2)
deconv_activation_layer = ReLU()

Keras is actually treating these as shared layers, instead of what you’d expect which is just duplicating them. Technically it’s a valid model but as an architecture having one activation layer used multiple times doesn’t make too much sense in practice.

So there are 2 simple possible workarounds here:
(a) replace these definitions with a lambda (so that a new activation layer instance will be created on each call):

conv_activation_layer = lambda x: LeakyReLU(0.2)(x)
deconv_activation_layer = lambda x: ReLU()(x)

(b) Or replace these sorts of calls: rel5 = conv_activation_layer(batch5)
with something like this: rel5 = LeakyReLU(0.2)(batch5)
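The difference between the shared form and the lambda form can be sketched in plain Python. This is only an illustration: StandInActivation is a hypothetical stand-in class, not a Keras layer, and it just counts how many objects are created and how often each one is applied.

```python
class StandInActivation:
    """Hypothetical stand-in for an activation layer; not a Keras class."""

    instances = 0  # how many objects have been created so far

    def __init__(self, name):
        StandInActivation.instances += 1
        self.name = name
        self.call_count = 0  # how many times this one object was applied

    def __call__(self, x):
        self.call_count += 1
        return x  # the activation math is irrelevant for this illustration


# Shared form: one object, applied at three places in the graph.
shared = StandInActivation("leaky_relu")
x = 1.0
for _ in range(3):
    x = shared(x)
shared_instances = StandInActivation.instances
shared_calls = shared.call_count

# Lambda form: every call constructs a fresh object and applies it once.
StandInActivation.instances = 0
per_call = lambda v: StandInActivation("leaky_relu")(v)
for _ in range(3):
    x = per_call(x)
per_call_instances = StandInActivation.instances

print(shared_instances, shared_calls, per_call_instances)
```

In the shared form a single object ends up with three inbound uses, which is the shared-layer situation described above; in the lambda form there are three separate objects with one use each.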

At some point we’ll try to work around this for shared activation layers, but we aren’t supporting shared layers with parameters for now (but can throw a proper/better exception).


OK, I get what you mean. Just like in Java, the reused layers all point to the same object in memory. So I should create a new layer each time to avoid sharing the same one, yes?
Thanks, let me have a try.

Good job, this works. But will this change the structure of the model?

It will change the topology of your Keras model, but results will be the same.
