Allocation failed: [[DEVICE] allocation failed; Error code: [2]]

Hi,
I created a ComputationGraphConfiguration. When I set the following:
.setInputTypes(InputType.convolutionalFlat(640, 360, channels))
running net.init() fails with the error message in the title.
But when I change it to
.setInputTypes(InputType.convolutionalFlat(240, 240, channels))
net.init() runs fine.
Any help would be much appreciated!

The following is my ComputationGraphConfiguration:

new NeuralNetConfiguration.Builder()
    .seed(rngSeed)
    .l2(0.0005) // ridge regression value
    .updater(new Nesterovs(0.006, 0.9))
    .weightInit(WeightInit.XAVIER)
    .graphBuilder()
    .addInputs("input")
    .setInputTypes(InputType.convolutionalFlat(640, 360, channels)) // InputType.convolutional for normal image HEIGHT, WIDTH
    .addLayer("L1", new ConvolutionLayer.Builder(3, 3)
        .nIn(channels)
        .stride(1, 1)
        .nOut(50)
        .activation(Activation.RELU)
        .build(), "input")
    .addLayer("L2", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
        .kernelSize(2, 2)
        .stride(2, 2)
        .build(), "L1")
    .addLayer("L3", new ConvolutionLayer.Builder(3, 3)
        .stride(1, 1) // nIn need not be specified in later layers
        .nOut(50)
        .activation(Activation.RELU)
        .build(), "L2")
    .addLayer("L4", new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
        .kernelSize(2, 2)
        .stride(2, 2)
        .build(), "L3")
    .addLayer("L5", new DenseLayer.Builder()
        .activation(Activation.RELU)
        .nOut(500)
        .build(), "L4")
    .addLayer("out", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
        .nOut(N_OUTCOMES)
        .activation(Activation.SOFTMAX)
        .build(), "L5")
    .setOutputs("out")
    .build();

It's kind of hard to tell much from just your model, but that is generally a memory management problem. Make sure you have enough memory to run your model. See more here: Memory - Deeplearning4j
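For a rough sense of scale, here is a back-of-the-envelope sketch of how the dense layer's input (and hence its weight matrix) grows with the input size. This is my own illustration, not your code, and it assumes DL4J's default Truncate convolution mode (no padding) and float32 activations:

```java
// Rough estimate of the dense layer's input size and weight memory for this graph.
// Assumes Truncate convolution mode (no padding) and float32 (4 bytes per value).
public class MemorySketch {
    static long denseInputSize(long h, long w) {
        h -= 2; w -= 2;   // L1: 3x3 conv, stride 1 -> (h-2) x (w-2)
        h /= 2; w /= 2;   // L2: 2x2 max pool, stride 2
        h -= 2; w -= 2;   // L3: 3x3 conv, stride 1
        h /= 2; w /= 2;   // L4: 2x2 max pool, stride 2
        return h * w * 50; // 50 channels, flattened for the dense layer
    }

    public static void main(String[] args) {
        for (long[] in : new long[][]{{640, 360}, {240, 240}}) {
            long flat = denseInputSize(in[0], in[1]);
            long weightBytes = flat * 500 * 4; // dense layer: flat inputs -> 500 units
            System.out.printf("%dx%d -> dense input %,d, dense weights ~%,d MB%n",
                    in[0], in[1], flat, weightBytes / (1024 * 1024));
        }
    }
}
```

By this arithmetic, 640x360 feeds roughly 695,200 values into the 500-unit dense layer, i.e. a ~1.3 GB float32 weight matrix for that one layer, versus roughly 320 MB at 240x240. That alone can explain why the smaller input works and the larger one does not.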

If you can provide more context, I can give better advice.

Thank you very much.
If I run with the CPU backend it works fine, with no errors. The following is my POM dependency:

<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native</artifactId>
    <version>1.0.0-M2</version>
</dependency>

But when I change to the CUDA backend, as follows:

<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-11.2</artifactId>
    <version>1.0.0-M1.1</version>
</dependency>

I get the following messages:
Warning: Versions of org.bytedeco:javacpp:1.5.7 and org.bytedeco:cuda:11.2-8.1-1.5.5 do not match.
[main] INFO org.nd4j.linalg.factory.Nd4jBackend - Loaded [JCublasBackend] backend
[main] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 32
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 11]
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [8]; Memory: [8.0GB];
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [CUBLAS]
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - ND4J CUDA build version: 11.2.142
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - CUDA device 0: [NVIDIA GeForce MX350]; cc: [6.1]; Total memory: [2147352576]
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - Backend build information:
MSVC: 192930038
STD version: 201703L
CUDA: 11.2.142
DEFAULT_ENGINE: samediff::ENGINE_CUDA
HAVE_FLATBUFFERS

and then a runtime exception:

java.lang.RuntimeException: cudaMalloc failed; Bytes: [246402000]; Error code [2]; DEVICE [0]
at org.nd4j.jita.memory.CudaMemoryManager.allocate(CudaMemoryManager.java:83)
at org.nd4j.jita.workspace.CudaWorkspace.alloc(CudaWorkspace.java:233)
at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.&lt;init&gt;(BaseCudaDataBuffer.java:425)
at org.nd4j.linalg.jcublas.buffer.CudaFloatDataBuffer.&lt;init&gt;(CudaFloatDataBuffer.java:76)
at org.nd4j.linalg.jcublas.buffer.factory.CudaDataBufferFactory.create(CudaDataBufferFactory.java:419)
at org.nd4j.linalg.factory.Nd4j.createBuffer(Nd4j.java:1454)
at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.createUninitialized(JCublasNDArrayFactory.java:1538)
at org.nd4j.linalg.factory.Nd4j.createUninitialized(Nd4j.java:4333)
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.preOutput(ConvolutionLayer.java:446)
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.activate(ConvolutionLayer.java:509)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:110)
at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsInWS(ComputationGraph.java:2139)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1376)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1345)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1169)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1119)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1086)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1022)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1010)
at com.jufan.machinelearning.trainer.NavigationTrainer.train(NavigationTrainer.java:152)
at com.jufan.machinelearning.NavigationTrainerTest.testTrain(NavigationTrainerTest.java:24)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:93)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:40)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:529)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:756)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:452)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:210)

@Justin that is pretty self-explanatory: your GPU is out of memory. Adjust your batch size to make sure the model fits on the GPU.
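Per your log, the MX350 exposes only about 2 GB of device memory, and the single failed allocation was already ~246 MB, so there is very little headroom. A minimal sketch of the kind of change meant here; the names (recordReader, net, numEpochs) are placeholders, not from your code:

```java
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

// Hypothetical training setup: recordReader, net, N_OUTCOMES and numEpochs are
// assumed to exist elsewhere; the batch size is the only point of this sketch.
int batchSize = 4; // start small on a 2 GB GPU and raise it until memory runs out
DataSetIterator trainIter =
        new RecordReaderDataSetIterator(recordReader, batchSize, 1, N_OUTCOMES);
net.fit(trainIter, numEpochs);
```

If even batch size 1 fails at 640x360, the model itself is too large for this GPU (note the ~1.3 GB dense-layer weight matrix estimated earlier in the thread), and you would need a smaller input or more aggressive pooling. Separately, the "Versions ... do not match" warning suggests a cause worth checking: your nd4j-native (1.0.0-M2) and nd4j-cuda-11.2 (1.0.0-M1.1) artifacts are on different versions, which pulls in mismatched JavaCPP dependencies; keeping every ND4J artifact on one version avoids that.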

@agibsonccc Thank you very much.