Using ML models with Amazon EMR

I'm trying to set up machine learning models on Amazon EMR. When trying the MNIST example model in Scala, I got the following error:

[2021-04-10 21:36:23.144]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
java.lang.ClassNotFoundException: org.nd4j.linalg.learning.config.IUpdater
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)

The full code is here:

This happens when executing the jar file both locally and on EMR.

Edit: Implementing the interface IUpdater makes that error go away, and the following error happens instead.

Exception in thread "main" java.lang.NoClassDefFoundError: org/nd4j/linalg/api/ndarray/INDArray
at java.base/java.lang.Class.getDeclaredMethods0(Native Method)
at java.base/java.lang.Class.privateGetDeclaredMethods(Class.java:3325)
at java.base/java.lang.Class.getMethodsRecursive(Class.java:3466)
at java.base/java.lang.Class.getMethod0(Class.java:3452)

This looks like you simply haven’t packaged all dependencies into your jar file.

Have you checked the jar file to ensure that everything is there?
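
From a shell, jar tf yourapp.jar piped through grep is the quickest way to look. If you'd rather check from Scala, here's a minimal sketch (the ListJar name is just for illustration) that prints every nd4j entry in a jar:

import java.util.jar.JarFile
import scala.collection.JavaConverters._

object ListJar {
  def main(args: Array[String]): Unit = {
    // open the jar passed as the first argument and print matching entries
    val jar = new JarFile(args(0))
    jar.entries().asScala
      .map(_.getName)
      .filter(_.contains("nd4j"))
      .foreach(println)
    jar.close()
  }
}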

build.sbt:

name := "test3"

version := "0.1"

scalaVersion := "2.12.10"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.7",
  "org.apache.spark" %% "spark-sql" % "2.4.7",
  "org.apache.spark" %% "spark-mllib" % "2.4.7",
  "org.apache.spark" %% "spark-streaming" % "2.4.7",
  "org.twitter4j" % "twitter4j-core" % "4.0.4",
  "org.twitter4j" % "twitter4j-stream" % "4.0.4",
  "org.deeplearning4j" % "deeplearning4j-core" % "1.0.0-beta7",
  "org.deeplearning4j" % "deeplearning4j-ui" % "1.0.0-beta7",
  "org.nd4j" % "nd4j-native-platform" % "1.0.0-beta7",
  "org.scala-lang.modules" %% "scala-parser-combinators" % "1.0.4"
)

JDK: 15
IDE: IntelliJ IDEA, Language: Scala

I'm building the jar from build.sbt using package on the command line, then running it with spark-submit locally or on EMR (both lead to the same errors).

The compile problem with IUpdater vanished once I implemented it. However, both the IDE and the package command compile the program even when the interfaces IUpdater uses aren't all imported. Errors only appear once I use the spark-submit command, sometimes even when I do import the interfaces. So far I've had errors with GradientUpdater, INDArray, and ISchedule.

Unless sbt packages an uber jar, i.e. unless it adds all transitive dependencies into your jar, you will be missing something.

That’s why I told you to check your packaged jar file.
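
Note that plain sbt package only bundles your own compiled classes, never the dependencies. The usual route to an uber jar with sbt is the sbt-assembly plugin; a minimal sketch, assuming the 0.15.x line that works with sbt 1.x:

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.15.0")

With that in place, sbt assembly writes a fat jar containing all transitive dependencies under target/scala-2.12/, and that is the artifact to hand to spark-submit.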

My uber jar now contains the class INDArray, but I'm still getting "java.lang.NoClassDefFoundError: org/nd4j/linalg/api/ndarray/INDArray". Is this specifically because of a mismatch with my computer's CUDA 11, or is it some other issue? I built with sbt assembly using the standard plugin file, and I'm including "org.nd4j" % "nd4j-native-platform" % "1.0.0-beta7" in build.sbt.

@kmcphee A few things.

nd4j-native-platform is for CPU. Unless you're actually including nd4j-cuda somewhere, it won't affect anything during execution.
Are you trying to use GPUs with EMR? If so, you need to make sure the version of CUDA running on each node matches the CUDA version bundled with your uber jar.
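
For reference, pulling in the GPU backend would mean a dependency along these lines in build.sbt (a sketch only; the artifact has to match the CUDA version installed on the nodes, e.g. CUDA 10.2 for beta7):

"org.nd4j" % "nd4j-cuda-10.2-platform" % "1.0.0-beta7"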

Beyond that, could you provide a complete stack trace? We can't tell what your problem is from a one-line description. The only thing I can gather so far is that something else may be the underlying cause.

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/C:/Spark/spark-3.1.1-bin-hadoop2.7/jars/spark-unsafe_2.12-3.1.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
21/04/11 17:30:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: org/nd4j/linalg/learning/GradientUpdater
at java.base/java.lang.Class.getDeclaredMethods0(Native Method)
at java.base/java.lang.Class.privateGetDeclaredMethods(Class.java:3325)
at java.base/java.lang.Class.getMethodsRecursive(Class.java:3466)
at java.base/java.lang.Class.getMethod0(Class.java:3452)
at java.base/java.lang.Class.getMethod(Class.java:2199)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:42)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.nd4j.linalg.learning.GradientUpdater
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:435)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
… 13 more
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

This is the entire message. I’m not sure why the log4j tutorials I’ve followed aren’t getting rid of the warnings.

@kmcphee Ah, OK, a ClassNotFoundException actually is a valid issue. Sometimes a NoClassDefFoundError has something else as the underlying cause in the stack trace.
This tells me that your bundled jar still isn't being built properly: sbt assembly isn't including all the necessary classes. Have you verified that the final jar has everything in it? I know you mentioned one class being present.

Like @treo mentioned, your jar has issues. Please verify that all the dependencies you need are included.

The class GradientUpdater wasn't present this time. Now I get the following error, even though org.nd4j.linalg.factory.Nd4jBackend is present:

Exception in thread "main" java.lang.ExceptionInInitializerError
at org.deeplearning4j.nn.conf.NeuralNetConfiguration$Builder.seed(NeuralNetConfiguration.java:579)
at SimpleApp$.main(SimpleApp.scala:54)
at SimpleApp.main(SimpleApp.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException: Please ensure that you have an nd4j backend on your classpath. Please see: https://deeplearning4j.konduit.ai/nd4j/backend
at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5094)
at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:270)
… 15 more
Caused by: org.nd4j.linalg.factory.Nd4jBackend$NoAvailableBackendException: Please ensure that you have an nd4j backend on your classpath. Please see: https://deeplearning4j.konduit.ai/nd4j/backend
at org.nd4j.linalg.factory.Nd4jBackend.load(Nd4jBackend.java:221)
at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5091)

I'm guessing that this error is due to more missing interface files? Since I'm already building an uber jar, is there something else I need to add to get these packaged?

@kmcphee Yes, please ensure the backend (either nd4j-native or nd4j-cuda) is included in your uber jar. See:
https://deeplearning4j.konduit.ai/config/backends

You can use nd4j-native-platform (when you build, make sure to specify -Djavacpp.platform=linux-x86_64, or whatever platform you're building for, so you don't pull in binaries for a bunch of OSes you don't need). More on that here: https://github.com/bytedeco/javacpp-presets/wiki/Reducing-the-Number-of-Dependencies (DL4J uses JavaCPP for packaging native dependencies.)
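
On the sbt side, one equivalent of that Maven property is to depend on the plain nd4j-native backend plus only the native classifier for your target platform; a sketch, assuming the EMR nodes are Linux x86_64:

libraryDependencies ++= Seq(
  // JVM part of the backend plus the Linux natives only
  "org.nd4j" % "nd4j-native" % "1.0.0-beta7",
  "org.nd4j" % "nd4j-native" % "1.0.0-beta7" classifier "linux-x86_64"
)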

I'm already using "org.nd4j" % "nd4j-native-platform" % "1.0.0-beta7" as a dependency, though. Do I need to have Maven set up as well, or does the merge strategy below discard the native-platform library?

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
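
For reference, that is the stock example from the sbt-assembly docs. A slightly more conservative variant I've seen keeps ServiceLoader registrations under META-INF/services, in case any library discovers implementations that way:

assemblyMergeStrategy in assembly := {
  // concatenate ServiceLoader registration files instead of dropping them
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.concat
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}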

@kmcphee I don't think so; there's nothing under META-INF that it would need. I can see which code you're copying from:

Regarding sbt: most folks are able to work with it just fine without using Maven. Having two build systems in one project is a bad idea.

That being said, the happy path is generally maven + maven-shade.

@saudet do you know what issues can occur with javacpp native deps + sbt-assembly uber jars? Nothing here looks out of the ordinary.

The only thing I can think of here is that it's something specific to Spark + EMR.

To get more information about loading issues like that, make sure the "org.bytedeco.javacpp.logger.debug" system property is set to "true", or that "org.bytedeco.javacpp.logger" is set to "slf4j" with your logger set to its debug level.
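
For example, setting it programmatically at the very top of main(), before any ND4J or JavaCPP class gets loaded, is a simple way to make sure it takes effect; a sketch using the SimpleApp entry point from your stack trace:

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // must run before the first ND4J/JavaCPP class is touched,
    // otherwise the native loader has already done its work silently
    System.setProperty("org.bytedeco.javacpp.logger.debug", "true")
    // ... rest of the program
  }
}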

I set this logging property, but it seems to produce logs only in the IntelliJ IDE, where the program works. Changing to CUDA 10.0 and switching away from the native library didn't work, nor did importing the backend directly. Is there any way I can debug this further?

@kmcphee Could you try running on just CPU instead? Strip out CUDA and everything else. I'm not sure why you're importing the CPU backend while you keep talking about CUDA 10; CUDA isn't baked in by default and lives in a separate backend. First get something basic working on CPU. It sounds like you're trying a bunch of one-off hacks on your own, which makes it harder for us to support you and harder to debug in general.

Sorry for the confusion. I have been using nd4j-native-platform consistently. I only switched to 10.0 in the hope of getting a different result.

If you do not see any additional messages in your log when "org.bytedeco.javacpp.logger" is set to "slf4j", you will need to figure out how to make your logging framework work correctly with SLF4J and debug-level messages.