Hi everyone, I have been trying to build a UNet model using DL4J. On the line computational_graph.fit() I get the following error.
(I guess it is a memory error, right? Is there any way to solve it, or is it limited by my laptop's configuration?) Let me know if any more details are required.
at org.nd4j.nativeblas.Nd4jCpu.mallocHost(Native Method)
at org.nd4j.linalg.cpu.nativecpu.CpuMemoryManager.allocate(CpuMemoryManager.java:48)
at org.nd4j.linalg.api.memory.abstracts.Nd4jWorkspace.alloc(Nd4jWorkspace.java:421)
at org.nd4j.linalg.api.memory.abstracts.Nd4jWorkspace.alloc(Nd4jWorkspace.java:320)
at org.nd4j.linalg.cpu.nativecpu.buffer.BaseCpuDataBuffer.<init>(BaseCpuDataBuffer.java:492)
at org.nd4j.linalg.cpu.nativecpu.buffer.FloatBuffer.<init>(FloatBuffer.java:68)
at org.nd4j.linalg.cpu.nativecpu.buffer.DefaultDataBufferFactory.create(DefaultDataBufferFactory.java:329)
at org.nd4j.linalg.factory.Nd4j.createBuffer(Nd4j.java:1467)
at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:324)
at org.nd4j.linalg.cpu.nativecpu.NDArray.<init>(NDArray.java:191)
at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.createUninitialized(CpuNDArrayFactory.java:226)
at org.nd4j.linalg.factory.Nd4j.createUninitialized(Nd4j.java:4364)
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.preOutput(ConvolutionLayer.java:442)
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.activate(ConvolutionLayer.java:505)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:110)
at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsInWS(ComputationGraph.java:2135)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1372)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1341)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1165)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1115)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1082)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1018)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1006)
at activeSegmentation.deepLearning.UNet1.run(UNet1.java:111)
at activeSegmentation.deepLearning.UNet1.main(UNet1.java:287)
18:26:15.587 [main] DEBUG oshi.util.platform.windows.WmiUtil - Query: SELECT Version,ProductType,BuildNumber,CSDVersion,SuiteMask FROM Win32_OperatingSystem
18:26:16.055 [main] DEBUG oshi.software.os.windows.WindowsOSVersionInfoEx - Initialized OSVersionInfoEx
18:26:17.393 [main] DEBUG oshi.hardware.common.AbstractCentralProcessor - Oracle MXBean detected.
18:26:17.422 [main] DEBUG oshi.util.platform.windows.WmiUtil - Connected to ROOT\CIMV2 WMI namespace
18:26:17.422 [main] DEBUG oshi.util.platform.windows.WmiUtil - Query: SELECT ProcessorID FROM Win32_Processor
18:26:17.461 [main] DEBUG oshi.util.platform.windows.WmiUtil - Connected to ROOT\CIMV2 WMI namespace
18:26:17.461 [main] DEBUG oshi.util.platform.windows.WmiUtil - Query: SELECT Name,PercentIdleTime,PercentPrivilegedTime,PercentUserTime,PercentInterruptTime,PercentDPCTime FROM Win32_PerfRawData_Counters_ProcessorInformation WHERE NOT Name LIKE "%_Total"
18:26:24.431 [main] DEBUG oshi.util.platform.windows.WmiUtil - Connected to ROOT\CIMV2 WMI namespace
18:26:24.431 [main] DEBUG oshi.util.platform.windows.WmiUtil - Query: SELECT PercentInterruptTime,PercentDPCTime FROM Win32_PerfRawData_Counters_ProcessorInformation WHERE Name="_Total"
18:26:24.446 [main] DEBUG oshi.hardware.platform.windows.WindowsCentralProcessor - Initialized Processor
18:26:24.687 [main] ERROR org.deeplearning4j.util.CrashReportingUtil - >>> Out of Memory Exception Detected. Memory crash dump written to: G:\ACTIVESEGMENTATION-testBranch\dl4j-memory-crash-dump-1641214573133_1.txt
18:26:24.688 [main] WARN org.deeplearning4j.util.CrashReportingUtil - Memory crash dump reporting can be disabled with CrashUtil.crashDumpsEnabled(false) or using system property -Dorg.deeplearning4j.crash.reporting.enabled=false
18:26:24.688 [main] WARN org.deeplearning4j.util.CrashReportingUtil - Memory crash dump reporting output location can be set with CrashUtil.crashDumpOutputDirectory(File) or using system property -Dorg.deeplearning4j.crash.reporting.directory=<path>
Exception in thread "main" java.lang.OutOfMemoryError: Cannot allocate new LongPointer(4): totalBytes = 560, physicalBytes = 7236M
at org.bytedeco.javacpp.LongPointer.<init>(LongPointer.java:88)
at org.bytedeco.javacpp.LongPointer.<init>(LongPointer.java:53)
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.createShapeInfo(NativeOpExecutioner.java:2016)
at org.nd4j.linalg.api.shape.Shape.createShapeInformation(Shape.java:3247)
at org.nd4j.linalg.api.ndarray.BaseShapeInfoProvider.createShapeInformation(BaseShapeInfoProvider.java:68)
at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:180)
at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:174)
at org.nd4j.linalg.cpu.nativecpu.NDArray.<init>(NDArray.java:78)
at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.create(CpuNDArrayFactory.java:409)
at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4033)
at org.nd4j.linalg.api.shape.Shape.newShapeNoCopy(Shape.java:2123)
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.preOutput(ConvolutionLayer.java:477)
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.activate(ConvolutionLayer.java:505)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:110)
at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsInWS(ComputationGraph.java:2135)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1372)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1341)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1165)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1115)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1082)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1018)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1006)
at activeSegmentation.deepLearning.UNet1.run(UNet1.java:111)
at activeSegmentation.deepLearning.UNet1.main(UNet1.java:287)
Caused by: java.lang.OutOfMemoryError: Physical memory usage is too high: physicalBytes (7236M) > maxPhysicalBytes (6104M)
at org.bytedeco.javacpp.Pointer.deallocator(Pointer.java:700)
at org.bytedeco.javacpp.Pointer.init(Pointer.java:126)
at org.bytedeco.javacpp.LongPointer.allocateArray(Native Method)
at org.bytedeco.javacpp.LongPointer.<init>(LongPointer.java:80)
... 26 more
@Purva-Chaudhari that doesn’t look like anything related to memory. A NoSuchMethodError is a standard problem that typically comes up when Java versions clash. We would still need more information on what actually causes the crash.
As I mentioned before, I think it’s related to your computer running out of memory. Generally, when you run out of memory, the first thing to do is limit your batch size. std::bad_alloc is a C++ error that means the kernel can’t allocate any more memory; given our use of C++ code under the hood, that would be my first guess.
So for now, try anything that will reduce your memory footprint, and ensure you have enough RAM to actually train your model. Monitor the output and your computer’s CPU/RAM usage first.
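To make that concrete, here is a minimal sketch of reducing the minibatch size before calling fit(). The names `graph`, `trainData`, and `fitWithSmallBatches` are hypothetical stand-ins for your own ComputationGraph and training data, not your actual code:

```java
import java.util.List;
import org.deeplearning4j.datasets.iterator.impl.ListDataSetIterator;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class SmallBatchTraining {
    // Sketch only: wrap an existing List<DataSet> in an iterator with a
    // small minibatch size, then fit. Start at 1 and only increase the
    // batch size while memory usage stays stable.
    static void fitWithSmallBatches(ComputationGraph graph,
                                    List<DataSet> trainData,
                                    int numEpochs) {
        int batchSize = 1; // the key knob for memory footprint
        DataSetIterator trainIter = new ListDataSetIterator<>(trainData, batchSize);
        graph.fit(trainIter, numEpochs);
    }
}
```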
@Purva-Chaudhari
The error you share here is entirely unrelated to the one you initially shared.
When sharing your error messages, please share them in full. If the error you show here happens in addition to the one you shared partially in your initial post, then please say so, and also add the full error that contains the hint that it is about memory allocation.
So, as @agibsonccc said, we can’t really help you much with this limited information. All we can do is speculate.
Sorry, I have updated the output above.
It looks like my memory is running out. I had a very small dataset (5 training images with ground truths and 5 validation images with ground truths), so my batch size was already small. I do see the “Out of Memory Exception Detected” message.
I would also double-check how big each image is. Convolutions are fairly memory-heavy to execute (in both the forward and backward passes), regardless of how small the images look on disk. Depending on the size of your images, you may need to shrink your batch size to 3 or lower. The sketch below shows why.
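A back-of-the-envelope calculation makes this concrete. The resolution and channel counts below are illustrative assumptions about a typical UNet, not measurements of your model:

```java
public class ConvMemoryEstimate {
    public static void main(String[] args) {
        long height = 512, width = 512; // assumed input resolution
        long channels = 64;             // assumed channels in the first UNet conv block
        long bytesPerFloat = 4;         // float32
        long batchSize = 5;

        // Memory for a single activation map of one conv layer, one image
        long perImage = height * width * channels * bytesPerFloat;
        System.out.printf("One activation map: %d MB%n", perImage >> 20);
        System.out.printf("Batch of %d: %d MB for a single layer%n",
                batchSize, (perImage * batchSize) >> 20);
        // A UNet keeps activations from every encoder level for the
        // backward pass, so multiply by the number of conv layers to see
        // why even a tiny dataset can exhaust ~6GB at full resolution.
    }
}
```

With these numbers, one layer's activations for a batch of 5 already cost roughly 320 MB, before counting the other layers, gradients, and the im2col buffers that convolutions allocate internally.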
The other aspect here is maxPhysicalBytes (6104M).
Despite having 12GB of memory, you are only allowing your JVM to use about 6GB. The link @agibsonccc pointed you to also explains how to increase that.
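For reference, these limits are controlled with standard JVM and JavaCPP flags. The values below are illustrative for a 12GB machine, not a recommendation; leave headroom for the OS, and note that maxphysicalbytes should cover both the heap and the off-heap allocation:

```
java -Xmx2g \
     -Dorg.bytedeco.javacpp.maxbytes=8G \
     -Dorg.bytedeco.javacpp.maxphysicalbytes=10G \
     -cp yourApp.jar activeSegmentation.deepLearning.UNet1
```

Most of ND4J's memory is off-heap, so a large -Xmx alone won't help; it is the JavaCPP properties that raise the limit you are hitting.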