I’m still struggling to make the MultiGpuLenetMnistExample code work using GPUs.
After compiling it with dl4j-cuda-specific-examples-1.0.0-beta6-bin.jar and dl4j-spark-1.0.0-beta6-bin.jar on the CLASSPATH, I got an error about adding an nd4j backend to my classpath:
14:38:23,151 WARN ~ Skipped [JCublasBackend] backend (unavailable): java.lang.UnsatisfiedLinkError: /home/dl4j/.javacpp/cache/dl4j-spark-1.0.0-beta6-bin.jar/org/bytedeco/cuda/linux-x86_64/libjnicudart.so: libcudart.so.10.2: cannot open shared object file: No such file or directory
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.deeplearning4j.nn.conf.NeuralNetConfiguration$Builder.seed(NeuralNetConfiguration.java:579)
at MultiGpuLenetMnistExample.main(MultiGpuLenetMnistExample.java:70)
Caused by: java.lang.RuntimeException: org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException: Please ensure that you have an nd4j backend on your classpath. Please see: http://nd4j.org/getstarted.html
at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5131)
at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:226)
... 2 more
Caused by: org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException: Please ensure that you have an nd4j backend on your classpath. Please see: http://nd4j.org/getstarted.html
at org.nd4j.linalg.factory.Nd4jBackend.load(Nd4jBackend.java:218)
at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5128)
... 3 more
I read the referenced website for getting started with nd4j, but there is no explanation of which jar file to add to the CLASSPATH. By the way, for various reasons I do not use an IDE for my coding, compiling, and executing in Java, and the explanations are provided only for IntelliJ or Eclipse.
What you are seeing is simply that it cannot load the CUDA runtime. It is usually installed when you install the CUDA driver, so I guess you didn't do that.
Thanks for the note. I have installed the CUDA driver (copied libcudnn.so, libcudnn.so.7, libcudnn.so.7.4.1, and libcudnn_static.a into the /usr/local/cuda/lib64/ directory as specified) and rebuilt the deeplearning4j-examples after changing the pom.xml to specify the CUDA version I have installed:
nd4j-cuda-10.0-platform
...
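For completeness, the dependency block I edited looks roughly like this (groupId and version written from memory to match the beta6 examples, so they may differ slightly from what the examples actually ship with):

<!-- assumed nd4j CUDA 10.0 backend dependency; version guessed as 1.0.0-beta6 -->
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-10.0-platform</artifactId>
    <version>1.0.0-beta6</version>
</dependency>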
After compiling and executing the MultiGpuLenetMnistExample code again, I still got errors:
09:59:19,295 INFO ~ Loaded [JCublasBackend] backend
09:59:27,555 INFO ~ Number of threads used for linear algebra: 32
09:59:27,594 INFO ~ Number of threads used for OpenMP BLAS: 0
09:59:27,604 INFO ~ Backend used: [CUDA]; OS: [Linux]
09:59:27,604 INFO ~ Cores: [24]; Memory: [26.7GB];
09:59:27,604 INFO ~ Blas vendor: [CUBLAS]
09:59:27,623 INFO ~ ND4J CUDA build version: 10.0.130
09:59:27,624 INFO ~ CUDA device 0: [GeForce GTX 1070]; cc: [6.1]; Total memory: [8513978368]
09:59:27,625 INFO ~ CUDA device 1: [GeForce GTX 1070]; cc: [6.1]; Total memory: [8513978368]
09:59:27,674 INFO ~ Starting MultiLayerNetwork with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
09:59:28,570 INFO ~ cuDNN not found: use cuDNN for better GPU performance by including the deeplearning4j-cuda module. For more information, please refer to: https://deeplearning4j.org/docs/latest/deeplearning4j-config-cudnn
java.lang.ClassNotFoundException: org.deeplearning4j.nn.layers.convolution.CudnnConvolutionHelper
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.initializeHelper(ConvolutionLayer.java:77)
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.<init>(ConvolutionLayer.java:69)
at org.deeplearning4j.nn.conf.layers.ConvolutionLayer.instantiate(ConvolutionLayer.java:169)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.init(MultiLayerNetwork.java:717)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.init(MultiLayerNetwork.java:607)
at MultiGpuLenetMnistExample.main(MultiGpuLenetMnistExample.java:105)
09:59:28,630 INFO ~ Creating new AveragingTraining instance
09:59:28,633 INFO ~ Using workspaceMode ENABLED for training
09:59:28,636 INFO ~ Creating asynchronous prefetcher...
09:59:28,641 INFO ~ Starting ParallelWrapper training round…
09:59:36,749 INFO ~ Using workspaceMode ENABLED for training
09:59:36,754 INFO ~ Creating asynchronous prefetcher...
09:59:36,755 INFO ~ Starting ParallelWrapper training round…
It seems that I need to declare an environment variable to let the system know about CUDA. I'm not using an IDE. How should I specify these dependencies?
It is already using CUDA, but to get the most out of cuDNN you will also want to add the deeplearning4j-cuda dependency, as described on the cuDNN docs page.
Note: this is an additional dependency and not a replacement for nd4j-cuda.
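Roughly, the extra dependency entry looks like this (artifact name assuming CUDA 10.0 to match your nd4j-cuda-10.0-platform; the version should match your other DL4J artifacts, e.g. 1.0.0-beta6):

<!-- assumed deeplearning4j-cuda dependency for CUDA 10.0; keep the version in sync with nd4j-cuda -->
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-cuda-10.0</artifactId>
    <version>1.0.0-beta6</version>
</dependency>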
OK, thanks for clarifying this; it seemed to me that such a declaration was part of the IDE. Where exactly should I add this dependency? If it goes in the pom.xml file, where? As an XML file it has a structure, so under which node should I add the dependency declaration?