Not sure if cudnn was used during inference

I was trying to set up a Gradle project that includes semantic segmentation. The UNet model was trained in a Python/Keras environment, and I imported the .h5 model file into the Gradle project to run inference. Everything worked except that inference was slow, so I started looking for the cause, in particular whether the cudnn library was actually being used.

As mentioned in, we can check for cudnn by looking at the convolutionHelper class. So I loaded the model and added the extra two lines:

    String modelPath = "/data/model/unet_0803.h5";
    ComputationGraph model = KerasModelImport.importKerasModelAndWeights(modelPath, false);
    LayerHelper h = model.getLayer(0).getHelper();
    System.out.println("Layer helper: " + (h == null ? null : h.getClass().getName()));

Layer 0 is a convolution layer, but the layer helper is null, whereas I was expecting an org.deeplearning4j.cuda.convolution.CudnnConvolutionHelper. Do you know what I missed here? The dl4j-related dependencies in build.gradle are:

    def dl4jVersion = "1.0.0-M1.1"
    compile "org.nd4j:nd4j-cuda-11.2-platform:${dl4jVersion}"
    compile(group: 'org.deeplearning4j', name: 'deeplearning4j-core', version: dl4jVersion) {
        exclude group: 'org.bytedeco', module: 'opencv-platform'
        exclude group: 'org.bytedeco', module: 'leptonica-platform'
        exclude group: 'org.nd4j', module: 'nd4j-base64'
    }
    compile "org.deeplearning4j:deeplearning4j-cuda-11.2:${dl4jVersion}"

Thanks a lot!

For posterity, I’ll comment here as well as on the issue. You need to have cudnn included. It isn’t included by default because of the binary-size trade-offs some users want to make, so we give users the option of a cudnn backend. This is documented on the website:


Thanks for your reply!
I added the cuda/cudnn related dependencies as this link suggested.

But cudnn just could not load:

    MSVC: 192930038
    STD version: 201703L
    CUDA: 11.2.142

Any further suggestions on that? Or do you know of any other ways to speed up inference? I only have an 8 GB 3070 GPU, so I don’t think it would support multi-threaded inference…

I got cudnn loaded on the backend with the correct dependency configuration, but I still cannot initialize the convolution helper. The helper is initialized in ConvolutionLayer by the following code:

    void initializeHelper() {
        helper = HelperUtils.createHelper(CUDA_CNN_HELPER_CLASS_NAME,
                ConvolutionHelper.class, layerConf().getLayerName(), dataType

It seems the code casts org.deeplearning4j.cuda.convolution.CudnnConvolutionHelper to the interface org.deeplearning4j.nn.layers.convolution.ConvolutionHelper, which raises this exception:

java.lang.NoSuchMethodException: org.deeplearning4j.cuda.convolution.CudnnConvolutionHelper.<init>([Ljava.lang.Object;)
at java.base/java.lang.Class.getConstructor0(
at java.base/java.lang.Class.getDeclaredConstructor(
at org.deeplearning4j.common.config.DL4JClassLoading.createNewInstance(
at org.deeplearning4j.common.config.DL4JClassLoading.createNewInstance(
at org.deeplearning4j.nn.layers.HelperUtils.createHelper(
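
A note on reading this trace: `[Ljava.lang.Object;` is the JVM's name for `Object[]`, so the message says that a constructor taking a single `Object[]` parameter was looked up and not found. That points at a constructor-signature mismatch during reflection rather than a failed cast. The sketch below reproduces the same kind of message using only the Java standard library. `DemoHelper` and `ConstructorLookupDemo` are hypothetical stand-ins, not DL4J classes, and this does not claim to show what DL4J does internally.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for a helper whose only constructor takes one typed argument.
class DemoHelper {
    final String dataType;

    DemoHelper(String dataType) {
        this.dataType = dataType;
    }
}

class ConstructorLookupDemo {
    /** Looks up a constructor by exact parameter types; returns the exception text on failure. */
    static String lookup(Class<?> type, Class<?>... parameterTypes) {
        try {
            Constructor<?> c = type.getDeclaredConstructor(parameterTypes);
            return "found: " + c;
        } catch (NoSuchMethodException e) {
            return "missing: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Matches the declared constructor DemoHelper(String).
        System.out.println(lookup(DemoHelper.class, String.class));
        // Asks for DemoHelper(Object[]), which does not exist, so the lookup fails
        // and the message names the parameter type as [Ljava.lang.Object;
        System.out.println(lookup(DemoHelper.class, Object[].class));
    }
}
```

Reflection matches constructors by exact parameter types, so asking for `(Object[])` never finds a constructor declared with a different parameter list, even when the arguments themselves would be compatible.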

@agibsonccc Could you give me some insights on the helper initialization? Many thanks!

@abcxys Again, that’s the old way and will be removed later. Please quit trying to use the old dl4j way and please quit confusing the old way and the new way.

Sorry for being direct here, but I would prefer to focus on debugging the nd4j way of loading cudnn by itself without anything involving the old dependencies.
As I mentioned, the nd4j way works automatically in c++. If the backend build information shows you have cudnn, then you should be calling into cudnn fine.

In the next release we’ll be adding more debugging statements, enabled by a flag, to the actual execution itself to make this a bit easier. That way all you have to do is turn a debug mode on to verify this. Thanks for understanding.

@agibsonccc Sorry for the redundant questions. My last question is: so as long as the backend build information shows “HAVE_CUDNN”, I don’t need to care whether the layer helper loads anything or just null, right?

@abcxys Thanks again for understanding! Yes, you should be fine. I just want you to focus on the right things :) I know the dl4j wrapper is confusing (which is why we draw a clear distinction between the old and the new way in the docs).

It’s just a waste of your time to focus on that hence me being a bit more straight with that.

I’m aware that the c++ internals aren’t too transparent hence why we want to add the execution calls in there for both end users and even just the core maintainers who need to debug performance issues.
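
Until those execution-level debug statements exist, the check discussed above comes down to looking for the `HAVE_CUDNN` marker in the backend build information. Here is a minimal sketch, assuming the build information is already available as plain text (for example copied from the startup log). `BuildInfoCheck` and `reportsCudnn` are hypothetical names, and how the text is obtained is deliberately left out because the thread does not show it.

```java
import java.util.Arrays;

class BuildInfoCheck {
    /** Returns true if any line of the build-info text contains the HAVE_CUDNN marker. */
    static boolean reportsCudnn(String buildInfo) {
        if (buildInfo == null) {
            return false;
        }
        return Arrays.stream(buildInfo.split("\\R"))
                .anyMatch(line -> line.contains("HAVE_CUDNN"));
    }

    public static void main(String[] args) {
        // Build info as posted earlier in this thread: no HAVE_CUDNN line.
        String fromThread = "MSVC: 192930038\nSTD version: 201703L\nCUDA: 11.2.142";
        System.out.println(reportsCudnn(fromThread)); // prints false

        // Hypothetical text with the marker appended, for illustration only.
        System.out.println(reportsCudnn(fromThread + "\nHAVE_CUDNN")); // prints true
    }
}
```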

Edit: Also for performance reasons if you’re finding a difference you’re welcome to test both the dl4j backend and the nd4j backend and see if there’s any difference. If there is, then that’s a regression and we should do something about that. I’m insisting on the new backend because there’s nothing happening at the java level anymore and it’s less code to debug.
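
For the comparison suggested in the edit above, a small timing harness is usually enough. The sketch below uses only the Java standard library. `LatencyProbe` and `medianMillis` are hypothetical names, and the placeholder workload should be replaced by a lambda wrapping the real inference call. It assumes that call blocks until the result is ready; if it returns before the GPU work finishes, the numbers will be too optimistic.

```java
import java.util.Arrays;

class LatencyProbe {
    /**
     * Runs the task `warmup` times untimed, then `runs` times timed,
     * and returns the middle element of the sorted timings in milliseconds.
     */
    static double medianMillis(Runnable inference, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        // Warm-up iterations let JIT compilation and one-off initialization settle.
        for (int i = 0; i < warmup; i++) {
            inference.run();
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            inference.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos[runs / 2] / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload; swap in the real inference call here.
        Runnable placeholder = () -> {
            double sum = 0;
            for (int i = 1; i <= 100_000; i++) {
                sum += Math.sqrt(i);
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        System.out.printf("median latency: %.3f ms%n", medianMillis(placeholder, 10, 50));
    }
}
```

Running the same harness once per dependency configuration, with the same input, gives two numbers that can be compared directly. The median is used rather than the mean so that a single slow run does not dominate the result.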

The problem with the old helpers is we do a bunch of reflection and other error prone configuration to get the right calls working.

Doing everything at the c++ level allows us both to debug things in the same language (c/c++ and c/c++ cuda) and to have fewer problems overall. The nd4j backend is the only way that you can use the new calls from samediff, for example. Everything was hard-coded into the old dl4j layers before.