Could not resolve dependencies for project test-SNAPSHOT: The following artifacts could not be resolved: org.nd4j:nd4j-cuda-10.2:jar:linux-arm64:1.0.0-beta7, org.bytedeco:cuda:jar:linux-arm64:10.2-7.6-1.5.3: Failure to find org.nd4j:nd4j-cuda-10.2:jar:linux-arm64:1.0.0-beta7 in https://jitpack.io was cached in the local repository, resolution will not be reattempted until the update interval of jitpack.io has elapsed or updates are forced
Basically, I am trying to use CUDA 10.2 on a Jetson Nano board, which comes preinstalled with CUDA 10.2.
Using 10.0 is OK (from the pom), but then it's not able to load the correct runtime.
Do you think it's possible to add the 10.2 version for ARM to the repo, or is there any other way to get it?
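For reference, a minimal way to confirm which backend and CUDA runtime actually get picked up at startup; this is only a sketch using the standard Nd4j factory API, and the class name is illustrative:

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class BackendCheck {
    public static void main(String[] args) {
        // The first Nd4j call triggers loading of the native backend; the startup
        // log lines show which backend (CPU or CUDA) and which runtime were found.
        INDArray x = Nd4j.ones(2, 2);
        System.out.println("Loaded backend: " + Nd4j.getBackend().getClass().getName());
        System.out.println(x);
    }
}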
At the time of the beta7 release, there was no CUDA 10.2 for the Nano yet, so that's why it isn't available.
As we've changed our CI since the last release, it will probably take a while before we have a new version that supports the newest JetPack version for the Jetson Nano.
Maybe @agibsonccc can comment on that, as he has been working on the new CI infrastructure.
I did try, but I got:
root@rama-jetson:/home/rama/Scaricati/deeplearning4j# ./change-cuda-versions.sh 10.2
Updating CUDA versions in pom.xml files to CUDA 10.2
sed: can't read : No such file or directory
sed: can't read : No such file or directory
sed: can't read : No such file or directory
(repeated forever)
Do you have any hints on that?
---- update ----
After checking pom.xml, it seems that the version did get updated, so maybe it's safe to ignore the warning. I'll let you know if the build goes fine; I think it's going to take a while.
@ramarro123 we can compile our C++ code base for the Jetson Nano, and we have made sure that this step works. However, the CUDA bindings we use from JavaCPP only support CUDA 10.0 on the Jetson Nano.
You can see those here: Central Repository: org/bytedeco/cuda/10.0-7.4-1.5
In order to make this work, you would have to compile those CUDA bindings for the latest version of CUDA.
There isn't enough ROI for us to do all of this right now given other priorities. If you would like to attempt this yourself, I'm happy to point you at the particular steps. Otherwise, there aren't a lot of alternatives right now.
I had time to play with the Jetson again, and unfortunately, after the logging that I posted above, the code suddenly crashes (it doesn't even run fit; it just creates a MultiLayerNetwork instance).
Java VM: OpenJDK 64-Bit Server VM (11.0.11+9-Ubuntu-0ubuntu2.18.04, mixed mode, tiered, compressed oops, g1 gc, linux-aarch64)
Problematic frame:
C [libnd4jcpu.so+0x496c8b8] samediff::acquiredThreads(unsigned int)+0x0
Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to /home/rama/java/test/core.10062)
An error report file with more information is saved as:
/home/rama/java/test/hs_err_pid10062.log
If you would like to submit a bug report, please visit:
@ramarro123 Nd4j does not use this library. We used to, but it hasn't been maintained, and the way it interfaces with native memory is at best subpar.
It especially does not work with GPUs. That is a completely CPU-based library.
Could you give me an idea whether something in our docs gave you the impression that we were related to this somehow? If so, I'd like to fix that. I appreciate you working with me here.
I think I made a mistake in pom.xml, as the modified version of the code didn't correctly report "cublas" and nvidia.
After modifying it, it started again, but I am still facing an issue with a core dump after 10 minutes of training.
I will come back to this when I find some more evidence.
Just a quick question: can I start several threads, each with its own MultiLayerNetwork? Is that allowed? I guess yes, as I create new objects for every thread and they don't share anything, but it's better to double check before starting other tests.
@ramarro123 you should generally assume that networks are not thread safe. The Jetson has so few resources, why are you multi-threading? Is there a need for that? The point is just to offload work to the GPU. Multiple networks in RAM will push the Nano to its limits (only 4 GB of unified host and device memory), and you have to account for the JVM process taking up space as well.
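As a side note on the memory point, here is a minimal sketch of how that budget splits between the JVM heap and ND4J's off-heap memory. The limits shown are the standard JavaCPP system properties, but the values are illustrative assumptions for a 4 GB Nano, not recommendations, and they must be set before ND4J is first loaded (usually as JVM arguments):

// Typically passed on the command line, for example:
//   java -Xmx512m \
//        -Dorg.bytedeco.javacpp.maxbytes=1G \
//        -Dorg.bytedeco.javacpp.maxphysicalbytes=3G \
//        -jar app.jar
public class MemoryBudgetCheck {
    public static void main(String[] args) {
        // On-heap budget of the JVM process itself (-Xmx).
        System.out.println("JVM heap max (bytes): " + Runtime.getRuntime().maxMemory());
        // Off-heap limits, if they were passed as -D flags.
        System.out.println("javacpp.maxbytes         = " + System.getProperty("org.bytedeco.javacpp.maxbytes"));
        System.out.println("javacpp.maxphysicalbytes = " + System.getProperty("org.bytedeco.javacpp.maxphysicalbytes"));
    }
}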
It's just to train 2 different networks at the same time (different params, different input, just 2 different problems with some common logic for getting the data from the source), but yeah, I can run one at a time and play it safe.
@ramarro123 yeah, just play it safe then. It's easier to deal with as a whole. We wrote ParallelWrapper for larger machines, in case people want to do parallel training of a single model (multiple minibatches in multiple threads, then average the changes).
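For larger machines, a rough sketch of what that looks like, assuming the ParallelWrapper builder API from the parallel-wrapper module; net and trainIter stand in for an already-configured network and data iterator, and the numbers are placeholders:

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.parallelism.ParallelWrapper;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class ParallelTrainingSketch {
    static void trainInParallel(MultiLayerNetwork net, DataSetIterator trainIter) {
        ParallelWrapper wrapper = new ParallelWrapper.Builder<>(net)
                .workers(2)             // copies of the model trained concurrently
                .prefetchBuffer(8)      // minibatches prefetched per worker
                .averagingFrequency(3)  // average parameters every 3 minibatches
                .build();
        wrapper.fit(trainIter);
    }
}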
Parallel training of 100 models is, in some cases, a better way to train them than training a single model. A model can get stuck in a local minimum that is not the global one, so training multiple models can reduce that chance or its effect.
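A minimal sketch of that idea (random restarts): train the same architecture several times with different seeds and keep the best run. buildConf(seed) is a hypothetical factory for your own architecture, and scoring by the final training loss is only a placeholder for a proper held-out evaluation:

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class RandomRestartSketch {
    static MultiLayerNetwork bestOf(int restarts, DataSetIterator trainIter) {
        MultiLayerNetwork best = null;
        double bestScore = Double.MAX_VALUE;
        for (int i = 0; i < restarts; i++) {
            MultiLayerNetwork net = new MultiLayerNetwork(buildConf(i));
            net.init();
            net.fit(trainIter, 5);        // a few epochs per restart
            double score = net.score();   // loss of the last minibatch; lower is better
            if (score < bestScore) {      // keep the restart that ends up lowest
                bestScore = score;
                best = net;
            }
        }
        return best;
    }

    // Hypothetical: return the same architecture, seeded differently per restart.
    static MultiLayerConfiguration buildConf(long seed) {
        throw new UnsupportedOperationException("plug in your NeuralNetConfiguration here, seed = " + seed);
    }
}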