Using JavaCPP Presets for CUDA with M2.1 (CUDA 11.6)

Hi

I am unable to install native CUDA, as I am on Fedora 38 and the CUDA toolkit only supports up to Fedora 37.

Previously, when using beta7, I could install the JavaCPP Presets for CUDA, which had everything such as CUDA and cuDNN bundled. However, for M2.1, which requires CUDA 11.4 or 11.6, there does not seem to be any equivalent JavaCPP Presets for CUDA package: Maven says there is only the 10.0 package or the latest 12.1 CUDA package, which DL4J does not support yet.

https://repo.maven.apache.org/maven2/org/bytedeco/javacpp-presets/cuda/

Is it no longer possible to use JavaCPP Presets for CUDA with the current build of DL4J?

Thanks

@MPdaedalus that was never the case, I can tell you that as 100% fact. I’m not sure why you perceived it as bundled; that would be far too big a download for us to redistribute all the time. Maybe you are remembering it wrong.

One thing I can think of is the instruction to use the -redist classifier. That’s still available and hasn’t changed. Could you please try that? The classifier is your platform with -redist appended. You can find the list of classifiers here:
https://repo1.maven.org/maven2/org/bytedeco/cuda/11.6-8.3-1.5.7/

Note that cuda 11.8 should also work.
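For example, a minimal sketch of such a dependency (version taken from the repository link above; linux-x86_64 assumed as the platform, adjust to yours):

```xml
<!-- Bytedeco CUDA redistributable: ships the CUDA/cuDNN native libraries -->
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda</artifactId>
    <version>11.6-8.3-1.5.7</version>
    <classifier>linux-x86_64-redist</classifier>
</dependency>
```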

OK, thanks, I’ll give that a try. It’s been a while since I used DL4J, so I must have remembered it wrong. I managed to get cuDNN installed, but I’m not sure it will run without CUDA as well.

I was referring to the following with regard to CUDA and cuDNN being bundled instead of using native versions:

" Alternatively, in the case of CUDA 9.2, cuDNN comes bundled with the “redist” package of the JavaCPP Presets for CUDA. After agreeing to the license, we can add the following dependencies instead of installing CUDA and cuDNN:"

I was hoping the same would apply for M2.1

Although the cuda package link you provided supports CUDA 11, the javacpp-presets package only goes up to 10.0, and I think this is also required to avoid the org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException I am getting.

https://repo1.maven.org/maven2/org/bytedeco/javacpp-presets/cuda/

Alternatively, in the case of CUDA 9.2, cuDNN comes bundled with the “redist” package of the JavaCPP Presets for CUDA. After agreeing to the license, we can add the following dependencies instead of installing CUDA and cuDNN:

That quote sits directly above a section telling you to use the presets that @agibsonccc is talking about.

I know this page triggers documentation blindness in a lot of people. The DL4J cuda packages only existed as special packages to actually make use of cuDNN; they never included it.

I know the dl4j/nd4j cuda packages don’t include CUDA, but I thought the third-party javacpp-presets project does @ javacpp-presets/cuda at master · bytedeco/javacpp-presets · GitHub

So what you’re basically saying is that this project does not include CUDA either, and the only way to get it to work is to have the CUDA libraries installed natively?

I currently use the following in my pom.xml, but it still gives me Nd4jBackend$NoAvailableBackendException:

<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>1.0.0-M2.1</version>
</dependency>
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-11.6-preset</artifactId>
    <version>1.0.0-M2.1</version>
    <classifier>linux-x86_64</classifier>
</dependency>
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda</artifactId>
    <version>11.6-8.3-1.5.7</version>
    <classifier>linux-x86_64-redist</classifier>
</dependency>

But I thought I also needed something like this as well, from the javacpp-presets project, to make it work without native CUDA:

<dependency>
    <groupId>org.bytedeco.javacpp-presets</groupId>
    <artifactId>cuda</artifactId>
    <version>10.0-7.4-1.4.4</version>
    <classifier>linux-x86_64-redist</classifier>
</dependency>

but there is no version of this for CUDA 11.6, just 10.0 or 12.3.

Or am I missing something here? (Sorry, it’s been a while.)

Ah, I see.

One thing that is wrong here is that you probably want to use nd4j-cuda-11.6-platform, or, if you are using it with the classifier, you should have nd4j-cuda-11.6 there twice: once without the classifier and once with it, because the one with the classifier only contains the native files.
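In pom.xml terms, that pairing would look something like this (a sketch, using the artifact and version from your post):

```xml
<!-- API jar: the Java classes for the backend -->
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-11.6</artifactId>
    <version>1.0.0-M2.1</version>
</dependency>
<!-- Classifier jar: the native binaries for this platform -->
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-11.6</artifactId>
    <version>1.0.0-M2.1</version>
    <classifier>linux-x86_64</classifier>
</dependency>
```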

And then for the cuda dependency you need to use one that actually exists:

<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda</artifactId>
    <version>11.6-8.3-1.5.7</version>
</dependency>
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda</artifactId>
    <version>11.6-8.3-1.5.7</version>
    <classifier>linux-x86_64-redist</classifier>
</dependency>

I tried both combinations, with and without -platform, without luck. The issue is the lack of matching CUDA support, as javacpp-presets is required if there is no native CUDA.

Unless I can beg saudet at the javacpp-presets project to do a CUDA 11.6 build of javacpp-presets, I guess I will have to wait until DL4J supports CUDA 12.3, or until NVIDIA supports Fedora 38, which I don’t hold out much hope for.

The combination of nd4j-cuda-11.6 and the bytedeco cuda redist should work fine. They even have exactly the same javacpp version.

Do you have the nvidia drivers installed on the system at all?

I have NVIDIA driver 545.29.06 installed and working fine, along with xorg-x11-drv-nvidia, xorg-x11-drv-nvidia-cuda and xorg-x11-drv-nvidia-cuda-libs, but without the actual CUDA toolkit or anything else installed.

The reason I can’t install the actual CUDA toolkit and everything else is that the current CUDA release requires an older version of GCC than Fedora 38 ships; even the latest CUDA build, 12.1, uses GCC 12, which was removed in Fedora 38 in favor of a newer version.

When using:

<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-11.6-platform</artifactId>
    <version>1.0.0-M2.1</version>
</dependency>
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda-platform-redist</artifactId>
    <version>11.6-8.3-1.5.7</version>
</dependency>

I am getting:

java.lang.UnsatisfiedLinkError: /home/daedalus/.javacpp/cache/cuda-11.6-8.3-1.5.7-linux-x86_64.jar/org/bytedeco/cuda/linux-x86_64/libjnicudart.so: libcudart.so.11.0: cannot open shared object file: No such file or directory
	at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
	at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:331)
	at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:197)
	at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:139)
	at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2418)
	at java.base/java.lang.Runtime.load0(Runtime.java:852)
	at java.base/java.lang.System.load(System.java:2021)
	at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1747)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1402)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1214)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1190)
	at org.bytedeco.cuda.global.cudart.<clinit>(cudart.java:14)
	at org.nd4j.linalg.jcublas.JCublasBackend.canRun(JCublasBackend.java:66)
	at org.nd4j.linalg.jcublas.JCublasBackend.isAvailable(JCublasBackend.java:51)
	at org.nd4j.linalg.factory.Nd4jBackend.load(Nd4jBackend.java:175)
	at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5062)
	at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:284)
	at neural.FCNetwork.loadData(FCNetwork.java:52)
Dec 14, 2023 10:39:22 PM org.nd4j.linalg.factory.Nd4jBackend load
WARNING: Skipped [JCublasBackend] backend (unavailable): java.lang.UnsatisfiedLinkError: /home/daedalus/.javacpp/cache/cuda-11.6-8.3-1.5.7-linux-x86_64.jar/org/bytedeco/cuda/linux-x86_64/libjnicudart.so: libcudart.so.11.0: cannot open shared object file: No such file or directory
Exception in thread "main" java.lang.ExceptionInInitializerError
	at neural.FCNetwork.loadData(FCNetwork.java:52)
Caused by: java.lang.RuntimeException: org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException: Please ensure that you have an nd4j backend on your classpath. Please see: https://deeplearning4j.konduit.ai/nd4j/backend
	at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5066)
	at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:284)

It’s looking for the native CUDA .so files but cannot find them. This is what led me to think that javacpp-presets is required in order to provide those .so files.

I have done mvn clean/verify/compile/install in IntelliJ IDEA, but no luck.

I can run the native cpu version no problem.

Can you check whether the files are there in the JavaCPP cache directory? If they are, it may be failing because some other, usually present, file is missing.
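If it helps to script that check, here is a small stdlib-only sketch that lists matching native libraries under a directory (it assumes the default JavaCPP cache location of ~/.javacpp/cache; pass a different directory as the first argument if yours differs):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

public class CudaCacheCheck {
    // Recursively collect regular files under root whose name starts with prefix.
    static List<Path> findLibs(Path root, String prefix) throws IOException {
        if (!Files.isDirectory(root)) return Collections.emptyList();
        try (Stream<Path> s = Files.walk(root)) {
            return s.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().startsWith(prefix))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Default JavaCPP cache location; override via the first argument.
        Path cache = args.length > 0
                ? Paths.get(args[0])
                : Paths.get(System.getProperty("user.home"), ".javacpp", "cache");
        List<Path> hits = findLibs(cache, "libcudart");
        if (hits.isEmpty()) {
            System.out.println("no libcudart* found under " + cache);
        } else {
            hits.forEach(p -> System.out.println(p));
        }
    }
}
```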

The file libjnicudart.so is there in the directory, but libcudart.so is missing.

I did a search and have libcudart.so in there for older CUDA versions, such as in cuda-11.2-8.1-1.5.5-linux-x86_64.jar, dating back to 2021 when I last used DL4J, but I also had native CUDA installed at that time.

I checked the contents of the jar files on Maven for
https://repo1.maven.org/maven2/org/bytedeco/cuda-platform-redist/11.6-8.3-1.5.7/

and they don’t have the .so files in them, so it looks like they are being built or taken from somewhere else; I’m guessing the native install, or javacpp-presets if I had a compatible version.

I have downgraded DL4J/ND4J/Bytedeco to M2.0 / CUDA 11.2, as that version still supports 11.2, and hey presto, it works!

Warning: Versions of org.bytedeco:javacpp:1.5.7 and org.bytedeco:cuda:11.2-8.1-1.5.5 do not match.
Dec 15, 2023 10:18:40 AM org.nd4j.linalg.factory.Nd4jBackend load
INFO: Loaded [JCublasBackend] backend
Dec 15, 2023 10:18:45 AM org.nd4j.nativeblas.NativeOpsHolder <init>
INFO: Number of threads used for linear algebra: 32
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner printEnvironmentInformation
INFO: Backend used: [CUDA]; OS: [Linux]
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner printEnvironmentInformation
INFO: Cores: [16]; Memory: [25.0GB];
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner printEnvironmentInformation
INFO: Blas vendor: [CUBLAS]
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.jcublas.JCublasBackend logBackendInit
INFO: ND4J CUDA build version: 11.2.152
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.jcublas.JCublasBackend logBackendInit
INFO: CUDA device 0: [NVIDIA GeForce RTX 3080 Ti]; cc: [8.6]; Total memory: [12630163456]
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.jcublas.JCublasBackend logBackendInit
INFO: Backend build information:
 GCC: "7.5.0"
STD version: 201103L
DEFAULT_ENGINE: samediff::ENGINE_CUDA
HAVE_FLATBUFFERS

However, for some reason Maven Shade is adding both versions of javacpp into the uber-jar, even though I don’t have 1.5.7 listed in my pom.xml, only 1.5.5:

<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>1.0.0-M2</version>
</dependency>
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-11.2</artifactId>
    <version>1.0.0-M2</version>
</dependency>
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda</artifactId>
    <version>11.2-8.1-1.5.5</version>
</dependency>
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda</artifactId>
    <version>11.2-8.1-1.5.5</version>
    <classifier>linux-x86_64-redist</classifier>
</dependency>

[INFO] Including org.bytedeco:javacpp:jar:1.5.7 in the shaded jar.
[INFO] Including org.bytedeco:javacpp:jar:linux-x86_64:1.5.7 in the shaded jar.
[INFO] Including org.nd4j:nd4j-native-api:jar:1.0.0-M2 in the shaded jar.
[INFO] Including org.bytedeco:cuda-platform:jar:11.2-8.1-1.5.5 in the shaded jar.
[INFO] Including org.bytedeco:javacpp-platform:jar:1.5.5 in the shaded jar.

I can live with just M2.0 for now, but I need to sort out this 1.5.7 vs 1.5.5 mismatch, as I think it will cause errors later:

Warning: Versions of org.bytedeco:javacpp:1.5.7 and org.bytedeco:cuda:11.2-8.1-1.5.5 do not match.

I’m using version 3.5.1 of the maven-shade-plugin.

@treo update to M2.1. You can’t use the newer cuda with the older cuda 11.2 versions. Those by definition will fail.

As I mentioned in the previous posts above, if I use M2.1 then I need CUDA 11.4 or 11.6/11.8, but I don’t have native CUDA .so files for those versions, because I can’t install the CUDA toolkit: it needs GCC 12, which Fedora 38 does not have.

The only native versions I have are from CUDA 11.2 and below, from 2021, generated when I did have native CUDA installed at that time. That restricts me to M2.0, as the repos do not support M2.1 with old versions of CUDA.

@MPdaedalus the redist artifacts should come with those. Could you clarify what’s missing? The compiler shouldn’t matter on the deployment system, nor should the Fedora version.
The only time the Linux version comes into play is with libc, and there shouldn’t be too many issues there in practice.

So basically, with my old versions of CUDA from 2021, I have the following:

but with

<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>1.0.0-M2.1</version>
</dependency>
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-11.6-preset</artifactId>
    <version>1.0.0-M2.1</version>
</dependency>
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>cuda-platform-redist</artifactId>
    <version>11.6-8.3-1.5.7</version>
</dependency>

I am getting:

Caused by: java.lang.RuntimeException: org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException: Please ensure that you have an nd4j backend on your classpath. Please see: https://deeplearning4j.konduit.ai/nd4j/backend

because I only have:

Notice that the folder does not have redist in the title, even though I used <artifactId>cuda-platform-redist</artifactId>; there is no redist folder for 11.6, even though the folders above it were only created yesterday during my build.

When I use 1.0.0-M2 with CUDA 11.2, it runs and detects the CUDA backend (albeit with mismatching javacpp versions, which I’m still trying to fix, see above), because all the required .so files are in the 11.2 folder. For 11.6 it’s not generating them, because, like you said in your first post, DL4J has never come with CUDA bundled. Instead it has to come either from a native CUDA install from NVIDIA, which I cannot do for the current 11.6 version because GCC 12 is missing, or from the org.bytedeco.javacpp-presets project, which I used with the early DL4J betas, but that project only supports CUDA 10.0 or the latest 12 version, neither of which is compatible with M2.1, which needs CUDA 11.

So I’m stuck between a rock and a hard place, unless you can tell me how I can generate all those .so files without a native CUDA install, using my Fedora RPM package manager or the org.bytedeco.javacpp-presets project.

I’d be happy to put up with just M2 rather than M2.1 at this point, as at least that runs with CUDA, but I don’t know why it’s including javacpp 1.5.7 when I only specify 1.5.5 in the pom (see previous post).

My best guess is that something about the way you reference it is wrong, because the cuda package does indeed contain all of the .so files, even for 11.6:

Please share your full pom.xml file (ideally as a gist), so we can take a look.

I finally got it working! It turns out I was using an out-of-date pom.xml from 2021. I think something in Maven was also messed up, as I previously used it with Eclipse and am now using it with IntelliJ IDEA.

I didn’t change any code from when I got it working with M2.0 and CUDA 11.2 before, but when using the correct pom with M2.1 and CUDA 11.6, my NN trains as normal. However, I notice that it’s not printing out the usual INFO output at startup, e.g.:

INFO: Loaded [JCublasBackend] backend
Dec 15, 2023 10:18:45 AM org.nd4j.nativeblas.NativeOpsHolder <init>
INFO: Number of threads used for linear algebra: 32
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner printEnvironmentInformation
INFO: Backend used: [CUDA]; OS: [Linux]
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner printEnvironmentInformation
INFO: Cores: [16]; Memory: [25.0GB];
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner printEnvironmentInformation
INFO: Blas vendor: [CUBLAS]
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.jcublas.JCublasBackend logBackendInit
INFO: ND4J CUDA build version: 11.2.152
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.jcublas.JCublasBackend logBackendInit
INFO: CUDA device 0: [NVIDIA GeForce RTX 3080 Ti]; cc: [8.6]; Total memory: [12630163456]
Dec 15, 2023 10:18:45 AM org.nd4j.linalg.jcublas.JCublasBackend logBackendInit
INFO: Backend build information:

etc.

It’s also not printing any scores as I train my regression neural network using:

model.setListeners(new ScoreIterationListener(100));

Is this normal for the latest M2.1 build?

My own System.out.println() calls are working, so it’s not an issue with the console or anything; the same thing happens when I run the uber-jar from the command line.

my pom.xml

Thanks for your help, I really appreciate it after being away from DL4J for 2 years.

I think we used to ship a logger implementation, but that sometimes resulted in problems. So, if you want to have your output again, all you need to do is re-add an SLF4J-compatible logger, Logback Classic for example.

Thanks, that worked like a charm. Reference for others:

<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.4.8</version>
</dependency>

import ch.qos.logback.classic.LoggerContext;
import ch.qos.logback.core.status.OnConsoleStatusListener;
import ch.qos.logback.core.status.StatusManager;
import org.slf4j.LoggerFactory;

public static void setupLogging() {
    LoggerContext lc = (LoggerContext) LoggerFactory.getILoggerFactory();
    StatusManager statusManager = lc.getStatusManager();
    OnConsoleStatusListener onConsoleListener = new OnConsoleStatusListener();
    statusManager.add(onConsoleListener);
}