CUDA error when running the example

Good afternoon!
I am trying to run the Nd4jEx14_Normalizers example and get the following error:

o.n.l.f.Nd4jBackend - Loaded [JCublasBackend] backend
o.n.n.NativeOpsHolder - Number of threads used for linear algebra: 32
o.n.l.a.o.e.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 10]
o.n.l.a.o.e.DefaultOpExecutioner - Cores: [4]; Memory: [3,6GB];
o.n.l.a.o.e.DefaultOpExecutioner - Blas vendor: [CUBLAS]
o.n.l.j.JCublasBackend - ND4J CUDA build version: 10.2.89
o.n.l.j.JCublasBackend - CUDA device 0: [GeForce RTX 3090]; cc: [8.6]; Total memory: [25769803776]
Exception in thread "main" java.lang.RuntimeException: cudaGetSymbolAddress(...) failed; Error code: [13]
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.createShapeInfo(CudaExecutioner.java:2162)
at org.nd4j.linalg.api.shape.Shape.createShapeInformation(Shape.java:3280)
at org.nd4j.linalg.api.ndarray.BaseShapeInfoProvider.createShapeInformation(BaseShapeInfoProvider.java:74)
at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:92)
at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:73)
at org.nd4j.linalg.jcublas.CachedShapeInfoProvider.createShapeInformation(CachedShapeInfoProvider.java:42)
at org.nd4j.linalg.api.ndarray.BaseNDArray.&lt;init&gt;(BaseNDArray.java:181)
at org.nd4j.linalg.api.ndarray.BaseNDArray.&lt;init&gt;(BaseNDArray.java:327)
at org.nd4j.linalg.jcublas.JCublasNDArray.&lt;init&gt;(JCublasNDArray.java:127)
at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.createUninitialized(JCublasNDArrayFactory.java:177)
at org.nd4j.linalg.factory.Nd4j.createUninitialized(Nd4j.java:4339)
at org.nd4j.linalg.factory.Nd4j.rand(Nd4j.java:2787)
at org.nd4j.examples.advanced.operations.Nd4jEx14_Normalizers.main(Nd4jEx14_Normalizers.java:43)

Process finished with exit code 1

When I had the latest version of the CUDA Toolkit installed, there was a different error message: library not found cudart32_102.dll

how can I solve this problem? Thanks!

@Vladimir https://deeplearning4j.konduit.ai/config/backends/config-cudnn could you follow this and make sure you have the proper setup? Generally when these issues come up it's a misconfigured install or a mismatch between the dl4j cuda version and the installed cuda.

At first I tried CUDA 11.1 with "Download cuDNN v8.0.4 (September 28th, 2020), for CUDA 11.1" from the NVIDIA website.
Then, seeing that Deeplearning4j supports CUDA no higher than 10.2, I installed CUDA 10.2 and, correspondingly, "Download cuDNN v8.0.4 (September 28th, 2020), for CUDA 10.2" from the NVIDIA website.
From the link you suggested, I saw:
Note there are multiple combinations of cuDNN and CUDA supported. At this time the following combinations are supported by Deeplearning4j:
CUDA Version 10.2
cuDNN Version 7.6
And I downloaded and added cuDNN v7.6.5.
Unfortunately, nothing changed and the same error remained.

Yeah trying random versions of cuda isn’t the right thing to do. Could you show us your pom.xml first? The version you’re using with dl4j should match the installed version.
The installed cudnn version also needs to match up.
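
For reference, a matching CUDA 10.2 setup for beta7 would generally look like the sketch below in the pom (adjust versions to your installed toolkit; the deeplearning4j-cuda entry is only needed if you use cuDNN-backed ops):

```xml
<!-- Backend artifact must match the locally installed CUDA toolkit (10.2 here) -->
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-10.2-platform</artifactId>
    <version>1.0.0-beta7</version>
</dependency>
<!-- Optional: cuDNN-backed ops; requires a matching cuDNN install (7.6 for beta7 / CUDA 10.2) -->
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-cuda-10.2</artifactId>
    <version>1.0.0-beta7</version>
</dependency>
```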

It's the default pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>

<groupId>org.deeplearning4j</groupId>
<artifactId>nd4j-ndarray-examples</artifactId>
<version>1.0.0-beta7</version>
<name>ND4J Examples operating on ndarrays</name>
<description>Working with NDArrays</description>

<properties>
    <dl4j-master.version>1.0.0-beta7</dl4j-master.version>
    <!-- Change the nd4j.backend property to nd4j-cuda-X-platform to use CUDA GPUs -->
    <nd4j.backend>nd4j-cuda-10.2-platform</nd4j.backend>
    <!--<nd4j.backend>nd4j-native</nd4j.backend>-->

    <java.version>1.8</java.version>
    <maven-compiler-plugin.version>3.6.1</maven-compiler-plugin.version>
    <maven.minimum.version>3.3.1</maven.minimum.version>
    <logback.version>1.1.7</logback.version>
</properties>

<dependencies>
        <dependency>
            <groupId>org.nd4j</groupId>
            <artifactId>nd4j-cuda-10.2</artifactId>
            <version>1.0.0-beta7</version>
        </dependency>
       <dependency>
           <groupId>ch.qos.logback</groupId>
           <artifactId>logback-classic</artifactId>
           <version>${logback.version}</version>
       </dependency>
       <dependency>
           <groupId>org.deeplearning4j</groupId>
           <artifactId>deeplearning4j-utility-iterators</artifactId>
           <version>${dl4j-master.version}</version>
       </dependency>
    </dependencies>

    <!-- Maven Enforcer: Ensures user has an up to date version of Maven before building -->
<build>
    <plugins>
        <plugin>
            <artifactId>maven-enforcer-plugin</artifactId>
            <version>1.0.1</version>
            <executions>
                <execution>
                    <id>enforce-default</id>
                    <goals>
                        <goal>enforce</goal>
                    </goals>
                    <configuration>
                        <rules>
                            <requireMavenVersion>
                                <version>[${maven.minimum.version},)</version>
                                <message>********** Minimum Maven Version is ${maven.minimum.version}. Please upgrade Maven before continuing (run "mvn --version" to check). **********</message>
                            </requireMavenVersion>
                        </rules>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>${maven-compiler-plugin.version}</version>
            <configuration>
                <source>${java.version}</source>
                <target>${java.version}</target>
            </configuration>
        </plugin>
        <plugin>
            <groupId>com.lewisd</groupId>
            <artifactId>lint-maven-plugin</artifactId>
            <version>0.0.11</version>
            <configuration>
                <failOnViolation>true</failOnViolation>
                <onlyRunRules>
                    <rule>DuplicateDep</rule>
                    <rule>RedundantPluginVersion</rule>
                    <!-- Rules incompatible with Java 9
                    <rule>VersionProp</rule>
                    <rule>DotVersionProperty</rule> -->
                </onlyRunRules>
            </configuration>
            <executions>
                <execution>
                    <id>pom-lint</id>
                    <phase>validate</phase>
                    <goals>
                        <goal>check</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
    <pluginManagement>
        <plugins>
            <plugin>
                <groupId>org.eclipse.m2e</groupId>
                <artifactId>lifecycle-mapping</artifactId>
                <version>1.0.0</version>
                <configuration>
                    <lifecycleMappingMetadata>
                        <pluginExecutions>
                            <pluginExecution>
                                <pluginExecutionFilter>
                                    <groupId>com.lewisd</groupId>
                                    <artifactId>lint-maven-plugin</artifactId>
                                    <versionRange>[0.0.11,)</versionRange>
                                    <goals>
                                        <goal>check</goal>
                                    </goals>
                                </pluginExecutionFilter>
                                <action>
                                    <ignore/>
                                </action>
                            </pluginExecution>
                        </pluginExecutions>
                    </lifecycleMappingMetadata>
                </configuration>
            </plugin>
        </plugins>
    </pluginManagement>
</build>
</project>

Maybe the new GeForce RTX 3090 with the new 456.71 driver can't work with the old CUDA Toolkit 10.2? And the new CUDA Toolkit 11.1 can't work with dl4j? ((

@Vladimir anything with cuda in any framework (not just dl4j) requires matched versions. Cuda is not backwards compatible. It sounds like you have a broken cuda setup (multiple versions?). Please install cuda 10.2 if you want to use that cuda version with dl4j. If you want to use cuda 11.0 use the associated version in dl4j.

OK! But that is what I tried initially. Now:
I deleted all CUDA installations,
reinstalled CUDA 10.2,
and installed cudnn-10.2-windows10-x64-v8.0.4.30.
The result is the same:

o.n.l.f.Nd4jBackend - Loaded [JCublasBackend] backend
o.n.n.NativeOpsHolder - Number of threads used for linear algebra: 32
o.n.l.a.o.e.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 10]
o.n.l.a.o.e.DefaultOpExecutioner - Cores: [4]; Memory: [3,6GB];
o.n.l.a.o.e.DefaultOpExecutioner - Blas vendor: [CUBLAS]
o.n.l.j.JCublasBackend - ND4J CUDA build version: 10.2.89
o.n.l.j.JCublasBackend - CUDA device 0: [GeForce RTX 3090]; cc: [8.6]; Total memory: [25769803776]
Exception in thread "main" java.lang.RuntimeException: cudaGetSymbolAddress(...) failed; Error code: [13]
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.createShapeInfo(CudaExecutioner.java:2162)
at org.nd4j.linalg.api.shape.Shape.createShapeInformation(Shape.java:3280)
at org.nd4j.linalg.api.ndarray.BaseShapeInfoProvider.createShapeInformation(BaseShapeInfoProvider.java:74)
at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:92)
at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:73)
at org.nd4j.linalg.jcublas.CachedShapeInfoProvider.createShapeInformation(CachedShapeInfoProvider.java:42)
at org.nd4j.linalg.api.ndarray.BaseNDArray.&lt;init&gt;(BaseNDArray.java:181)
at org.nd4j.linalg.api.ndarray.BaseNDArray.&lt;init&gt;(BaseNDArray.java:327)
at org.nd4j.linalg.jcublas.JCublasNDArray.&lt;init&gt;(JCublasNDArray.java:127)
at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.createUninitialized(JCublasNDArrayFactory.java:177)
at org.nd4j.linalg.factory.Nd4j.createUninitialized(Nd4j.java:4339)
at org.nd4j.linalg.factory.Nd4j.rand(Nd4j.java:2787)
at org.nd4j.examples.advanced.operations.Nd4jEx14_Normalizers.main(Nd4jEx14_Normalizers.java:43)

Output from C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2\bin\win64\Debug>deviceQuery
deviceQuery Starting…
…
…
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.1, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS
How can I use CUDA 11.0 with dl4j?

I found only one post with a similar problem, and it has only one answer… I wonder if I really need to rebuild dl4j to solve this problem?

@Vladimir not sure why we're having this conversation… you don't rebuild dl4j, you just use the appropriate version we provide. We already build the binaries for you. You can specify different cuda versions. The issue is what I already described: you have a broken cuda install somewhere. That's generally all it is.

There’s already another thread like this and the guy just had to follow the docs: Libcublas.so.10: cannot open shared object file: No such file or directory

@Vladimir so I reread your posts again. @saudet can comment more but maybe you can try the redist artifacts? https://mvnrepository.com/artifact/org.bytedeco/cuda-platform-redist

This doesn’t require installation. @saudet are there any specific docs on this?

No docs beyond https://deeplearning4j.konduit.ai/config/backends/config-cudnn, no.

Thank you, I will try!

I tried to add the redist CUDA:

<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-10.2-platform</artifactId>
    <version>1.0.0-beta7</version>
</dependency>

<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda-platform-redist</artifactId>
    <version>10.2-7.6-1.5.3</version>
</dependency>

and I get the same error. I also downloaded the javacpp-presets source and built the redist libraries for CUDA 11.1 and cuDNN 8.0; it didn't help, I still get the error ((
Exception in thread "main" java.lang.RuntimeException: cudaGetSymbolAddress(...) failed; Error code: [13]

Now I am sure that I need to download the deeplearning4j source, use the change-cuda-versions.sh script in the root directory to change the CUDA and cuDNN versions (I modified the script, it's easy), and build a version of dl4j for CUDA 11.1, like nd4j-cuda-11.1-platform.
There was a similar problem recently with TensorFlow: while I was building it (for several days) the community released a new version with CUDA 11, and it works in Python. But I don't like Python itself, and I want to get the result with Java and dl4j.
Of course, I'm sure that people more familiar with compiling dl4j will get it done much faster, but I want to use the framework for research purposes, for analyzing time series of EEG signals, and I need this…

cuda-platform-redist 11.1-8.0-1.5.5

All the javacpp cuda stuff for 11.1 is already built and works well: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/cuda-platform-redist/11.1-8.0-1.5.5-SNAPSHOT/ and the sources are here: https://github.com/bytedeco/javacpp-presets/tree/master/cuda

What needs to be built from source (or wait for the update in the snapshots, as @agibsonccc mentioned) is the DL4J and ND4J side for cuda 11 on Windows: https://deeplearning4j.konduit.ai/getting-started/build-from-source. On Linux it works with the snapshots already built for 11.0: https://oss.sonatype.org/content/repositories/snapshots/org/nd4j/nd4j-cuda-11.0-platform/1.0.0-SNAPSHOT/
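
Using those 1.0.0-SNAPSHOT artifacts generally also requires enabling the Sonatype snapshots repository in the pom; a minimal sketch (the `id` is arbitrary):

```xml
<repositories>
    <repository>
        <id>sonatype-snapshots</id>
        <url>https://oss.sonatype.org/content/repositories/snapshots</url>
        <!-- Resolve only snapshot artifacts from this repository -->
        <releases><enabled>false</enabled></releases>
        <snapshots><enabled>true</enabled></snapshots>
    </repository>
</repositories>
```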

<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-10.2</artifactId>
    <version>${dl4j.version}</version>
    <classifier>linux-x86_64</classifier>
</dependency>
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-cuda-10.2</artifactId>
    <version>${dl4j.version}</version>
</dependency>
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda-platform-redist</artifactId>
    <version>10.2-7.6-1.5.3</version>
</dependency>

This works for me using a GTX 1080 Ti.

But the uber-jar is very big: 4+ GB. How can I reduce the size? I want it to include only the linux-x86_64 platform.

@SidneyLann you’ll want to use -Djavacpp.platform: https://stackoverflow.com/questions/61819636/build-jar-with-binaries-for-required-platform-only-javacpp
This will auto filter the classifiers.
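
A sketch of how that property is typically used (per the linked answer), either on the command line, `mvn -Djavacpp.platform=linux-x86_64 clean package`, or pinned in the pom:

```xml
<properties>
    <!-- Pull only linux-x86_64 native binaries into the build,
         instead of natives for every supported platform -->
    <javacpp.platform>linux-x86_64</javacpp.platform>
</properties>
```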

JFYI, redist includes all of cuda. If you already have cuda installed, you don't have to use redist.