How to choose a video card?

Hello! How do I choose a video card for deep learning with especially high efficiency for its cost? Currently there are almost no video cards available because of miners, and the rest cost an insane amount of money.

  • Old video cards may have good performance for their price (for example, the NVIDIA GeForce GTX 780 offers around 4000 GFLOPS, but a low compute capability), but they may not be supported.
  • I’ve been looking at cards like the 900 and 1000 series, Tesla, Quadro, and such, but they either don’t have enough RAM for deep learning or are too expensive.
  • Small GPUs like the GT 1030 offer about 2.5 times more FLOPS per cost compared to the GTX 1660 (1 TFLOPS vs. 5 TFLOPS), but the memory capacity is only 2 GB.
  • The Tesla K80 has 24 GB of RAM, but it is probably not supported due to its CUDA compute capability of 3.7. The price is very low for that memory size.
  • The Tesla M40 and M60 have compute capability 5.2, which is still somewhat low, but their price is also low for their memory size.
  • I also saw the Jetson modules, but I’m not sure they’re currently supported. If they are, which module is better: the Nano, TX2, Xavier NX, or AGX Xavier?

How do I choose a video card for deep learning? I saw that at least 4 GB of RAM is recommended, but there was no explanation about CUDA cores, tensor cores, interface speed, and so on. I’m worried I’ll buy a video card that is not supported or has insufficient memory.
There are also tensor cores, whose efficiency I cannot compare to CUDA cores.

Just buy the best GPU you can afford. Minimum CC is 6.1, so a 1060 6GB or greater; a 2060 with tensor cores is even better. Given that second-hand prices are nutty right now, it’s best to hunt for a shop selling new 3000-series cards, as they will be cheaper than second-hand!
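That rule of thumb can be written down as a trivial check. This is just a sketch of the advice in this thread, not any official DL4J API; the class and method names are made up, and the thresholds (CC ≥ 6.1, ≥ 6 GB) come from the post above:

```java
// Encodes the forum rule of thumb: minimum compute capability 6.1
// and at least 6 GB of memory. Illustrative only.
public class GpuCheck {
    /** True if a card meets the suggested minimum for DL training. */
    public static boolean meetsMinimum(int ccMajor, int ccMinor, long memoryGb) {
        int cc = ccMajor * 10 + ccMinor; // e.g. CC 6.1 -> 61
        return cc >= 61 && memoryGb >= 6;
    }

    public static void main(String[] args) {
        System.out.println(meetsMinimum(6, 1, 6));   // GTX 1060 6GB: true
        System.out.println(meetsMinimum(3, 7, 24));  // Tesla K80: false (CC 3.7 too low)
        System.out.println(meetsMinimum(7, 5, 6));   // RTX 2060: true
    }
}
```

Note that lots of memory does not save a card whose compute capability is below the minimum, which is exactly the K80 situation discussed above.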

Don’t go below 6 GB, or your models will not fit in memory if you’re using CNNs, and you will have to resort to very small batch sizes, which will hurt performance.
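To see why memory forces small batches: the activations of a convolutional layer grow linearly with batch size. A rough, illustrative calculation (the layer shape here is made up, not a real model):

```java
// Rough activation-memory arithmetic for one float32 conv feature map
// batch: N x C x H x W x 4 bytes. Weights, gradients, and optimizer
// state add more on top; this is a lower bound, for intuition only.
public class BatchMemory {
    public static long activationBytes(long n, long c, long h, long w) {
        return n * c * h * w * 4L;
    }

    public static void main(String[] args) {
        // 64 feature maps at 224x224: batch 32 vs batch 4
        System.out.println(activationBytes(32, 64, 224, 224) / (1024 * 1024) + " MB");
        System.out.println(activationBytes(4, 64, 224, 224) / (1024 * 1024) + " MB");
    }
}
```

A single layer at batch 32 already takes hundreds of megabytes, and a deep network has many such layers, so a 2 GB card runs out of room very quickly.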

An alternative is https://gpu.land, which is the cheapest GPU cloud compute on earth at 99 US cents per hour for a Tesla V100 with 16 GB RAM, which will blow almost all other cards out of the water.
See AI-Benchmark for comparisons.

How about the Jetson series? Is it supported?

Semi-supported. We supported them in beta7 but aren’t going to support them in M1; we will probably bring back support later.

However, those are small IoT-type devices where the GPU is aimed at speeding up inference tasks, so they aren’t that useful for training, especially compared to desktop GPUs.

Why so? The Jetson AGX Xavier offers 32 TOPS, or around 8 TFLOPS (I may be wrong), for its price. What’s the problem with supporting it?

Because it is a very niche product and we only have a limited amount of time to support things.

None of us owns such a board, and as far as I know you are the first one to ask about the AGX. We’ve supported the Nano before, but that too has seen very little use.

So we’d rather release a new version for most users sooner, rather than add yet another thing to the support workload.

According to the press release it has 11 TFLOPS in FP16. For regular GPUs that number is given for FP32. I guess it also isn’t calculated the same way; given how much emphasis they put on tensor cores in that press release, those 11 TFLOPS are most likely for the tensor cores.

For comparison with a desktop GPU that has tensor cores: the RTX 2060 (not the Super!) has 240 tensor cores (vs. the 64 of the AGX), which gives it 52 TFLOPS from its tensor cores.
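A back-of-the-envelope way to compare the two, using only the numbers above: scale the 2060’s tensor-core throughput by core count. This assumes identical per-core throughput and clocks, which is not true across chip generations, so treat it strictly as a sanity check, not a benchmark:

```java
// Proportional scaling of tensor-core TFLOPS by core count.
// If 240 cores give ~52 TFLOPS, 64 cores at the same per-core rate
// would give roughly 52 * 64 / 240 ≈ 13.9 TFLOPS. Real clocks and
// core generations differ, so this is only a rough upper bound.
public class TensorCoreEstimate {
    public static double scaledTflops(double refTflops, int refCores, int cores) {
        return refTflops * cores / refCores;
    }

    public static void main(String[] args) {
        System.out.printf("%.1f TFLOPS%n", scaledTflops(52.0, 240, 64));
    }
}
```

That this crude estimate (≈13.9) lands in the same ballpark as the advertised 11 TFLOPS supports the guess that the AGX number refers to its tensor cores at FP16.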

Is supporting the Jetson hard? I mean, maybe I can help with supporting it. I could write that kind of code if I understood how.

They offer good performance for neural network workloads, they don’t require buying a full PC, their price point is very different, and they allow using GPIOs.

In principle it isn’t too hard. The biggest issue is that we need support from the JavaCPP CUDA presets for arm64 CUDA 10.2 (which is the newest version available in NVIDIA’s JetPack, as far as I know).

But when the presets were released for CUDA 10.2, JetPack didn’t support CUDA 10.2 yet, so they weren’t released with support for it.

So there is a bit of a circular problem that needs to be resolved first, and we plan on doing that after the M1 release.

How can I help with JavaCPP?

@jijiji You will need to compile the presets from source at the branch tag where CUDA 10.2 was set. @saudet could offer some ideas as well. Ideally, NVIDIA would just update the Nano to CUDA 11, but I’m not sure about the status of what you can do with CUDA 11 on the Nano. From what I understand, it doesn’t work.

I’m going to buy a Nano and test it. It is almost 50 times faster than Pi computers, and it is small, lightweight, and energy-efficient. The problem is the limited RAM.

The NVIDIA CUDA support matrix shows that CUDA 11.3 requires driver version r465, but the latest version currently available is 466. Should it work with that? I cannot find version 465, but it looks like it supports versions 465 and later.
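The compatibility rule described here is simply "installed driver ≥ minimum driver for that CUDA version", so 466 satisfies the r465 requirement. A tiny sketch of that check (the class name is made up; look up the actual minimum for your CUDA version in NVIDIA’s release notes and your driver version with nvidia-smi):

```java
// CUDA driver compatibility is "at least the minimum for that toolkit
// version": newer drivers keep working with older toolkits.
public class DriverCheck {
    public static boolean satisfies(int installedDriver, int requiredMinimum) {
        return installedDriver >= requiredMinimum;
    }

    public static void main(String[] args) {
        System.out.println(satisfies(466, 465)); // true: 466 >= r465, fine for CUDA 11.3
        System.out.println(satisfies(460, 465)); // false: too old for CUDA 11.3
    }
}
```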

@jijiji I would actually buy a Jetson Nano first to understand what it is you would do on there. The Jetson Nano does not even work with an up-to-date CUDA version; it only has CUDA 10.2: JetPack SDK | NVIDIA Developer

Thank you for your answer. I’m going to test the Nano, but right now I have to test an NVIDIA GeForce RTX 3060. The problem is that cuDNN doesn’t work. I ran javacpp-presets/cuda/samples with MNISTCUDNN.java. The output is:

log with cuda-platform-redist

mvn compile exec:java produces the following (compilation itself passes):

Error opening file data/conv1.bin
java.lang.Exception: Stack trace
	at java.base/java.lang.Thread.dumpStack(Thread.java:1379)
	at MNISTCUDNN.FatalError(MNISTCUDNN.java:58)
	at MNISTCUDNN$Layer_t.readBinaryFile(MNISTCUDNN.java:122)
	at MNISTCUDNN$Layer_t.<init>(MNISTCUDNN.java:97)
	at MNISTCUDNN.main(MNISTCUDNN.java:485)
	at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:254)
	at java.base/java.lang.Thread.run(Thread.java:832)
Aborting...

It is trying to open a FileInputStream and the system cannot find the resource at that path.

log without (just for data)

Edit: It says “The specified procedure could not be found.” It looks like the file is found, but a particular procedure is not. This is weird, because the version is correct.

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/Administrator/.p2/pool/plugins/org.eclipse.m2e.maven.runtime.slf4j.simple_1.16.0.20200610-1735/jars/slf4j-simple-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [file:/C:/Users/Administrator/eclipse/java-2021-03/eclipse/configuration/org.eclipse.osgi/5/0/.cp/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
[INFO] Scanning for projects...
[INFO]
[INFO] --------------------< org.bytedeco.cuda:mnistcudnn >--------------------
[INFO] Building mnistcudnn 1.5.6-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ mnistcudnn ---
[WARNING] Using platform encoding (Cp1251 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] skip non existing resourceDirectory C:\Users\Administrator\git\javacpp-presets\cuda\samples\src\main\resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ mnistcudnn ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- exec-maven-plugin:3.0.0:java (default-cli) @ mnistcudnn ---
[WARNING] java.lang.UnsatisfiedLinkError: no jnicudnn in java.library.path: C:\Users\Administrator\.p2\pool\plugins\org.eclipse.justj.openjdk.hotspot.jre.full.win32.x86_64_15.0.2.v20210201-0955\jre\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:/Users/Administrator/.p2/pool/plugins/org.eclipse.justj.openjdk.hotspot.jre.full.win32.x86_64_15.0.2.v20210201-0955/jre/bin/server;C:/Users/Administrator/.p2/pool/plugins/org.eclipse.justj.openjdk.hotspot.jre.full.win32.x86_64_15.0.2.v20210201-0955/jre/bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\libnvvp;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\libnvvp;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\libnvvp;;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\libnvvp;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\Java\jdk-10\bin;C:\Program Files\Maven\apache-maven-3.8.1\bin;C:\Program Files\Git\cmd;C:\Program Files\NVIDIA Corporation\Nsight Compute 2021.1.1\;C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;C:\Users\Administrator\AppData\Local\Microsoft\WindowsApps;C:\Users\Administrator\eclipse\java-2021-03\eclipse;;.
	at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2447)
	at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:809)
	at java.base/java.lang.System.loadLibrary(System.java:1893)
	at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1734)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1344)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1157)
	at org.bytedeco.javacpp.Loader.load(Loader.java:1133)
	at org.bytedeco.cuda.global.cudnn.<clinit>(cudnn.java:16)
	at MNISTCUDNN$network_t.createHandles(MNISTCUDNN.java:150)
	at MNISTCUDNN$network_t.<init>(MNISTCUDNN.java:175)
	at MNISTCUDNN.main(MNISTCUDNN.java:477)
	at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:254)
	at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: java.lang.UnsatisfiedLinkError: C:\Users\Administrator\.javacpp\cache\cuda-11.3-8.2-1.5.6-SNAPSHOT-windows-x86_64.jar\org\bytedeco\cuda\windows-x86_64\jnicudnn.dll: Не найдена указанная процедура [“The specified procedure could not be found”]
	at java.base/jdk.internal.loader.NativeLibraries.load(Native Method)
	at java.base/jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:383)
	at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:227)
	at java.base/jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:169)
	at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2407)
	at java.base/java.lang.Runtime.load0(Runtime.java:747)
	at java.base/java.lang.System.load(System.java:1857)
	at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1684)
	... 9 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.218 s
[INFO] Finished at: 2021-05-27T05:33:42+05:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:3.0.0:java (default-cli) on project mnistcudnn: An exception occured while executing the Java class. no jnicudnn in java.library.path: C:\Users\Administrator\.p2\pool\plugins\org.eclipse.justj.openjdk.hotspot.jre.full.win32.x86_64_15.0.2.v20210201-0955\jre\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:/Users/Administrator/.p2/pool/plugins/org.eclipse.justj.openjdk.hotspot.jre.full.win32.x86_64_15.0.2.v20210201-0955/jre/bin/server;C:/Users/Administrator/.p2/pool/plugins/org.eclipse.justj.openjdk.hotspot.jre.full.win32.x86_64_15.0.2.v20210201-0955/jre/bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\libnvvp;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\libnvvp;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\libnvvp;;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\libnvvp;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\Java\jdk-10\bin;C:\Program Files\Maven\apache-maven-3.8.1\bin;C:\Program Files\Git\cmd;C:\Program Files\NVIDIA Corporation\Nsight Compute 2021.1.1\;C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;C:\Users\Administrator\AppData\Local\Microsoft\WindowsApps;C:\Users\Administrator\eclipse\java-2021-03\eclipse;;.
C:\Users\Administrator\.javacpp\cache\cuda-11.3-8.2-1.5.6-SNAPSHOT-windows-x86_64.jar\org\bytedeco\cuda\windows-x86_64\jnicudnn.dll: Не найдена указанная процедура -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

It is trying to read a file at the path data/conv1.bin, but there is no such file after mvn compile. I have no clue why the file is missing, but I might have a version mismatch.
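For the "Error opening file data/conv1.bin" part specifically: the sample opens that path relative to the JVM's working directory, so the data directory has to be present where Maven runs, not on the classpath. A quick hypothetical diagnostic (class name made up) that just shows where the file is being looked for:

```java
import java.io.File;

// Diagnostic sketch: relative paths like "data/conv1.bin" are resolved
// against the JVM working directory (user.dir), which under
// `mvn exec:java` is normally the module directory you launched from.
public class PathCheck {
    public static File resolve(String relative) {
        return new File(System.getProperty("user.dir"), relative);
    }

    public static void main(String[] args) {
        File f = resolve("data/conv1.bin");
        System.out.println("Looking for: " + f.getAbsolutePath());
        System.out.println("Exists: " + f.exists());
    }
}
```

If the printed path is wrong, running Maven from the samples directory that actually contains data/ should fix that part of the error.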

@jijiji I only replied to you about the Nano so you understand that there’s a bit more work involved and you can’t just hope to install the latest CUDA on there.

Regarding the CUDA presets, I’ll defer to @saudet, but beyond that I won’t comment.

For Deeplearning4j, we do not support CUDA 11.3 on snapshots and won’t until the presets are updated. We generally expect very specific versions of libraries to be present; if you don’t have those exact versions, linking will fail.

It should work otherwise, though. Sometimes local CUDA installs can interfere with using the redist artifacts. I’m not sure what OS you’re on, but I know that if you’re on Windows and have CUDA in your PATH, it can cause issues.

As for Linux, it could be a similar issue where you have LD_LIBRARY_PATH or something misconfigured. I know this for a fact because we’re building and testing artifacts just fine on both platforms.
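The PATH interference mentioned above is visible in the log: several CUDA toolkit versions (v10.2, v11.0, v11.2, v11.3) are on PATH at once, so the wrong DLLs can be picked up before the ones JavaCPP bundles. A small illustrative scan (not a real JavaCPP tool) that lists CUDA-related PATH entries so conflicts are easy to spot:

```java
import java.util.ArrayList;
import java.util.List;

// Scans a PATH-style string for entries mentioning CUDA so you can see
// how many toolkit versions are visible to the dynamic loader at once.
public class CudaPathScan {
    public static List<String> cudaEntries(String path, String separator) {
        List<String> hits = new ArrayList<>();
        for (String entry : path.split(separator)) {
            if (entry.toLowerCase().contains("cuda")) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // On the real machine you would pass System.getenv("PATH") and
        // File.pathSeparator; this sample string is for illustration.
        String sample = "C:\\Windows;"
                + "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.3\\bin;"
                + "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.2\\bin";
        System.out.println(cudaEntries(sample, ";"));
    }
}
```

More than one CUDA bin directory on PATH is a red flag for exactly the "procedure not found" style of mismatch seen in the log above.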

If you would like, 1.0.0-M1 is out on Maven Central. The docs are still a WIP, but I’m always happy to hear feedback from testing.

Thank you, I will test it. The OS is Windows 10. I am doing performance tests to find things to optimize.

There is a lack of guides; maybe writing guides could help. Here are some questions I can’t find full answers to on https://deeplearning4j.konduit.ai/:

  • Which backend should I choose?
  • How do I use the cuDNN library with DL4J? Which backend is required for that?
  • Which CUDA version should I install? CUDA 10.0, CUDA 10.2, or maybe CUDA 11.1, for example?
  • How do I choose a video card? How do I check video card compatibility?
  • How do I use Jetson modules? Should I use them?
  • How do I prolong a video card’s life? How do I avoid video card overheating?

Right now I’m trying to help you by reducing the number of questions/problems you have to solve, so you don’t spend so much time on that. If I could find exact explanations, I might ask a lot fewer questions, but currently it’s almost impossible to solve some things without help. I also see many people asking questions just because they don’t know the basics.

I could write some guides, and you could just review and edit them, then copy them over. That would solve some of this.

  1. For complete beginners, I agree with you, but most people looking to use DL frameworks generally know GPUs are faster by now.

  2. We show how to configure that. Did you look at this page? cuDNN - Deeplearning4j. It comes right up on Google.

  3. That’s not really a DL4J question… there are guides on this topic that are framework-agnostic.

  4. Jetson… that’s a separate thing. I agree it’s under-supported, but it’s also just not at the top of our list. We’ll flesh that support out, but we don’t really have the time or bandwidth to do it properly since NVIDIA decided not to update CUDA to 11.0 for the Nano. We only support 2 CUDA versions, and we rely on JavaCPP for that. The ROI isn’t really there.
     Right now, that’s asking us to maintain a one-off module for just the Jetson Nano.
     We’d like to do that, but I would prefer to just wait for NVIDIA to support CUDA 11 on there. It has to come at some point. For now, we’ve helped people work around it.

  5. That’s again not our problem. We don’t need to write docs for things that are fairly common knowledge in the industry; we can easily link to references.

You’re already attempting a lot of different projects. I haven’t seen one small thing from you yet. I appreciate that you’re trying, but I’d advise you again not to keep starting a bunch of projects, writing about them here, and then not actually showing up with code/docs.

Take one step at a time so you both don’t get burned out and actually finish something.

Did you look at this page? cuDNN - Deeplearning4j. It comes right up on Google.

Yes, I saw that. The problem is that there is a lack of information; there could be a bit more. As for other topics: things that aren’t covered by DL4J itself could still be explained there, for example video card temperature management.

I am busy 100% of the time, or almost. I get almost no rest, but I have to do multiple things, so I can’t work fast. I could upload some things to Git, and you could take a look at them.

Honestly, instead of taking on even more work, take some rest and focus on one thing at a time.

If you still want to contribute once you have some more breathing room, you will be just as welcome to do so as you are now.

Thank you for your concern, but I don’t need much rest.