Good afternoon!
I am trying to run the example Nd4jEx14_Normalizers and get the following error:
o.n.l.f.Nd4jBackend - Loaded [JCublasBackend] backend
o.n.n.NativeOpsHolder - Number of threads used for linear algebra: 32
o.n.l.a.o.e.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 10]
o.n.l.a.o.e.DefaultOpExecutioner - Cores: [4]; Memory: [3,6GB];
o.n.l.a.o.e.DefaultOpExecutioner - Blas vendor: [CUBLAS]
o.n.l.j.JCublasBackend - ND4J CUDA build version: 10.2.89
o.n.l.j.JCublasBackend - CUDA device 0: [GeForce RTX 3090]; cc: [8.6]; Total memory: [25769803776]
Exception in thread "main" java.lang.RuntimeException: cudaGetSymbolAddress(…) failed; Error code: [13]
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.createShapeInfo(CudaExecutioner.java:2162)
at org.nd4j.linalg.api.shape.Shape.createShapeInformation(Shape.java:3280)
at org.nd4j.linalg.api.ndarray.BaseShapeInfoProvider.createShapeInformation(BaseShapeInfoProvider.java:74)
at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:92)
at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:73)
at org.nd4j.linalg.jcublas.CachedShapeInfoProvider.createShapeInformation(CachedShapeInfoProvider.java:42)
at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:181)
at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:327)
at org.nd4j.linalg.jcublas.JCublasNDArray.<init>(JCublasNDArray.java:127)
at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.createUninitialized(JCublasNDArrayFactory.java:177)
at org.nd4j.linalg.factory.Nd4j.createUninitialized(Nd4j.java:4339)
at org.nd4j.linalg.factory.Nd4j.rand(Nd4j.java:2787)
at org.nd4j.examples.advanced.operations.Nd4jEx14_Normalizers.main(Nd4jEx14_Normalizers.java:43)
Process finished with exit code 1
When I had the latest version of the CUDA Toolkit installed, I got a different error message, along the lines of: library not found cudart32_102.dll
At first I tried CUDA 11.1 with "Download cuDNN v8.0.4 (September 28th, 2020), for CUDA 11.1" from the NVIDIA website.
Then, seeing that deeplearning4j supports CUDA no newer than 10.2, I installed CUDA 10.2 and the corresponding "Download cuDNN v8.0.4 (September 28th, 2020), for CUDA 10.2" from the NVIDIA website.
From the link you suggested, I saw:
Note there are multiple combinations of cuDNN and CUDA supported. At this time the following combinations are supported by Deeplearning4j:
CUDA Version 10.2
cuDNN Version 7.6
So I downloaded and added cuDNN v7.6.5.
Unfortunately, nothing changed and the same error remained.
Yeah trying random versions of cuda isn’t the right thing to do. Could you show us your pom.xml first? The version you’re using with dl4j should match the installed version.
The installed cudnn version also needs to match up.
<groupId>org.deeplearning4j</groupId>
<artifactId>nd4j-ndarray-examples</artifactId>
<version>1.0.0-beta7</version>
<name>ND4J Examples operating on ndarrays</name>
<description>Working with NDArrays</description>
<properties>
<dl4j-master.version>1.0.0-beta7</dl4j-master.version>
<!-- Change the nd4j.backend property to nd4j-cuda-X-platform to use CUDA GPUs -->
<nd4j.backend>nd4j-cuda-10.2-platform</nd4j.backend>
<!--<nd4j.backend>nd4j-native</nd4j.backend>-->
Maybe the new GeForce RTX 3090 with the new 456.71 driver can’t work with the old CUDA Toolkit 10.2, and the new CUDA Toolkit 11.1 can’t work with dl4j?
@Vladimir anything with cuda in any framework (not just dl4j) requires matched versions. Cuda is not backwards compatible. It sounds like you have a broken cuda setup (multiple versions?). Please install cuda 10.2 if you want to use that cuda version with dl4j. If you want to use cuda 11.0 use the associated version in dl4j.
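As an illustration of the "matched versions" rule above, here is a minimal sketch that compares the CUDA version embedded in the `nd4j.backend` artifact name with a CUDA version string such as the one printed in the log (`ND4J CUDA build version: 10.2.89`) or the installed toolkit version. The class `CudaVersionCheck` and its methods are hypothetical helpers written for this example; they are not part of ND4J, and the sketch only compares major.minor numbers in strings, it does not inspect the actual installation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an ND4J API.
class CudaVersionCheck {
    // Matches artifact names such as "nd4j-cuda-10.2-platform" or "nd4j-cuda-10.2".
    private static final Pattern ARTIFACT =
            Pattern.compile("nd4j-cuda-(\\d+)\\.(\\d+)(?:-platform)?");
    // Matches version strings such as "10.2", "10.2.89" or "11.1".
    private static final Pattern VERSION =
            Pattern.compile("(\\d+)\\.(\\d+)(?:\\.\\d+)*");

    // Extracts "major.minor" from the backend artifact name.
    static String majorMinorFromArtifact(String artifactId) {
        Matcher m = ARTIFACT.matcher(artifactId.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a CUDA backend artifact: " + artifactId);
        }
        return m.group(1) + "." + m.group(2);
    }

    // Extracts "major.minor" from a CUDA version string, ignoring the patch level.
    static String majorMinor(String version) {
        Matcher m = VERSION.matcher(version.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a version string: " + version);
        }
        return m.group(1) + "." + m.group(2);
    }

    // True when the artifact and the given CUDA version agree on major.minor.
    static boolean matches(String artifactId, String cudaVersion) {
        return majorMinorFromArtifact(artifactId).equals(majorMinor(cudaVersion));
    }

    public static void main(String[] args) {
        // Values taken from the pom.xml and the log in this thread.
        System.out.println(matches("nd4j-cuda-10.2-platform", "10.2.89")); // prints "true"
        // CUDA 11.1 installed next to a 10.2 backend would be a mismatch.
        System.out.println(matches("nd4j-cuda-10.2-platform", "11.1"));    // prints "false"
    }
}
```

Only major.minor is compared because the artifact name carries no patch level; whether a given patch release matters is not something this sketch can tell.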
OK, but that is what I tried initially. Now I have:
Deleted all CUDA installations
Reinstalled CUDA 10.2
Installed cudnn-10.2-windows10-x64-v8.0.4.30
The result is the same:
o.n.l.f.Nd4jBackend - Loaded [JCublasBackend] backend
o.n.n.NativeOpsHolder - Number of threads used for linear algebra: 32
o.n.l.a.o.e.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 10]
o.n.l.a.o.e.DefaultOpExecutioner - Cores: [4]; Memory: [3,6GB];
o.n.l.a.o.e.DefaultOpExecutioner - Blas vendor: [CUBLAS]
o.n.l.j.JCublasBackend - ND4J CUDA build version: 10.2.89
o.n.l.j.JCublasBackend - CUDA device 0: [GeForce RTX 3090]; cc: [8.6]; Total memory: [25769803776]
Exception in thread "main" java.lang.RuntimeException: cudaGetSymbolAddress(…) failed; Error code: [13]
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.createShapeInfo(CudaExecutioner.java:2162)
at org.nd4j.linalg.api.shape.Shape.createShapeInformation(Shape.java:3280)
at org.nd4j.linalg.api.ndarray.BaseShapeInfoProvider.createShapeInformation(BaseShapeInfoProvider.java:74)
at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:92)
at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:73)
at org.nd4j.linalg.jcublas.CachedShapeInfoProvider.createShapeInformation(CachedShapeInfoProvider.java:42)
at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:181)
at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:327)
at org.nd4j.linalg.jcublas.JCublasNDArray.<init>(JCublasNDArray.java:127)
at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.createUninitialized(JCublasNDArrayFactory.java:177)
at org.nd4j.linalg.factory.Nd4j.createUninitialized(Nd4j.java:4339)
at org.nd4j.linalg.factory.Nd4j.rand(Nd4j.java:2787)
at org.nd4j.examples.advanced.operations.Nd4jEx14_Normalizers.main(Nd4jEx14_Normalizers.java:43)
Output from C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2\bin\win64\Debug>deviceQuery
deviceQuery Starting…
…
…
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.1, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS
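The deviceQuery summary line above reports two separate versions: the CUDA version supported by the installed driver and the CUDA runtime version the sample ran against. A small sketch for pulling both out of that line, so they can be compared with the backend version in the pom.xml, might look like this. `DeviceQuerySummary` is a hypothetical name introduced for this example, and the regular expression assumes the exact wording of the line pasted in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; parses the deviceQuery summary line.
class DeviceQuerySummary {
    // Assumes the wording shown in this thread's deviceQuery output.
    private static final Pattern SUMMARY = Pattern.compile(
            "CUDA Driver Version = (\\d+\\.\\d+), CUDA Runtime Version = (\\d+\\.\\d+)");

    // Returns {driverVersion, runtimeVersion} found in the summary line.
    static String[] versions(String summaryLine) {
        Matcher m = SUMMARY.matcher(summaryLine);
        if (!m.find()) {
            throw new IllegalArgumentException("No version info in: " + summaryLine);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String line = "deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.1, "
                + "CUDA Runtime Version = 10.2, NumDevs = 1";
        String[] v = versions(line);
        // prints "driver=11.1 runtime=10.2"
        System.out.println("driver=" + v[0] + " runtime=" + v[1]);
    }
}
```

Here the runtime version (10.2) is the one that lines up with `nd4j-cuda-10.2-platform`; the driver version (11.1) comes from the 456.71 driver rather than from the toolkit install.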
How can I use CUDA 11.0 with dl4j?
@Vladimir not sure why we’re having this conversation… you don’t rebuild dl4j, you just use the appropriate version we provide. We already build the binaries for you. You can specify different cuda versions. The issue is what I already described: you have a broken cuda install somewhere. That’s generally all it is.
I have the same error as well. I also downloaded the source of javacpp-presets and built the redist libraries for CUDA 11.1 and cuDNN 8.0, but it didn’t help; I still get the error: Exception in thread "main" java.lang.RuntimeException: cudaGetSymbolAddress(…) failed; Error code: [13]
Now I am sure that I need to download the deeplearning4j source, use the change-cuda-versions.sh script in the root directory of the source to change the CUDA and cuDNN versions (I modified the script, which was easy), and build a version of dl4j for CUDA 11.1, such as nd4j-cuda-11.1-platform.
There was a similar problem with TensorFlow recently: while I was building it (for several days), the community released a new version with CUDA 11 support, and it works in Python. I don’t like Python itself, though, and I want to get the result with Java and dl4j.
Of course, I’m sure that people more familiar with compiling dl4j would get this done much faster, but I want to use the framework for research purposes, for analyzing time series of EEG signals, and I need this…