Running out of GPU Memory Despite Setting Parameters

joel-a · July 22, 2021, 5:52pm

I’m encountering the following error over and over again:

Exception in thread "main" java.lang.OutOfMemoryError: Cannot allocate new FloatPointer(2048000): totalBytes = 109M, physicalBytes = 34827M
at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:88)
at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:53)
at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.set(BaseCudaDataBuffer.java:930)
at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.setData(BaseCudaDataBuffer.java:1077)
at org.nd4j.linalg.factory.Nd4j.createTypedBuffer(Nd4j.java:1599)
at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.create(JCublasNDArrayFactory.java:1463)
at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:3775)
at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:3493)
at edu.mit.ll.seamnet.ONNXEnhancementDNN.runMask(ONNXEnhancementDNN.java:171)
at edu.mit.ll.seamnet.ONNXEnhancementDNN.runInference(ONNXEnhancementDNN.java:125)
at edu.mit.ll.seamnet.SpeechEnhancement.enhance(SpeechEnhancement.java:88)
at edu.mit.ll.seamnet.SpeechEnhancement.enhance(SpeechEnhancement.java:130)
at edu.mit.ll.seamnet.SpeechEnhancement.runSpeedExp(SpeechEnhancement.java:365)
at edu.mit.ll.seamnet.SpeechEnhancement.main(SpeechEnhancement.java:415)
Caused by: java.lang.OutOfMemoryError: Physical memory usage is too high: physicalBytes (34827M) > maxPhysicalBytes (34816M)
at org.bytedeco.javacpp.Pointer.deallocator(Pointer.java:700)
at org.bytedeco.javacpp.Pointer.init(Pointer.java:126)
at org.bytedeco.javacpp.FloatPointer.allocateArray(Native Method)
at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:80)
… 13 more

java -Xms1G -Xms2G -Dorg.bytedeco.javacpp.cachedir=$TMPDIR -Dorg.bytedeco.javacpp.maxbytes=30G -Dorg.bytedeco.javacpp.maxphysicalbytes=34G

The GPU I’m using has 32 GB of memory, and the machine has 137 GB of RAM.
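As a sanity check on the numbers in the trace, the limit can be reproduced by hand. This is only a sketch: it assumes the `G` suffix is binary (1024-based), which matches the 34816M shown in the error, and the `LimitCheck` class and `toMebibytes` method are made-up names, not part of JavaCPP.

```java
// Hypothetical helper, not part of JavaCPP: converts a size string such as
// "34G" into mebibytes, assuming binary (1024-based) suffixes.
final class LimitCheck {
    static long toMebibytes(String size) {
        String s = size.trim().toUpperCase();
        long value = Long.parseLong(s.substring(0, s.length() - 1));
        switch (s.charAt(s.length() - 1)) {
            case 'M': return value;
            case 'G': return value * 1024L;
            case 'T': return value * 1024L * 1024L;
            default: throw new IllegalArgumentException("Unsupported suffix in: " + size);
        }
    }

    public static void main(String[] args) {
        // Value passed via -Dorg.bytedeco.javacpp.maxphysicalbytes=34G
        long maxPhysical = toMebibytes("34G");
        // physicalBytes reported in the stack trace, in M
        long physical = 34827;
        System.out.println("maxPhysicalBytes = " + maxPhysical + "M");        // prints 34816M
        System.out.println("over limit by = " + (physical - maxPhysical) + "M"); // prints 11M
    }
}
```

So the process is only about 11M over the configured limit when the allocation fails, which fits memory creeping up gradually rather than one oversized allocation.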

All I’m doing is calling the output method of three separate ComputationGraph objects over and over with different inputs. The output arrays are overwritten with each new output; I’m not collecting them. One unexpected thing I’m noticing is that GPU memory keeps increasing even though no new arrays are allocated from one iteration to the next. The input might occasionally be slightly bigger, but its size does not increase consistently, whereas the memory usage does.

Any help with this issue would be greatly appreciated.

Thank you!

Joel

joel-a · July 27, 2021, 2:39pm

@treo Any thoughts on this? I’m still seeing this behavior and can’t find a solution. Thanks!

agibsonccc · July 27, 2021, 2:48pm

@joel-a have you tried calling System.gc() occasionally? Sometimes when people run out of memory it’s just due to a race condition, with the garbage collector not collecting in time.

@saudet might have more here for you.
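Calling System.gc() occasionally could look roughly like the sketch below. It is an illustration only: `PeriodicGcLoop`, `run`, and the `runInference` callback are hypothetical names standing in for the real loop around the ComputationGraph output calls, and the interval of 100 iterations is an arbitrary choice.

```java
import java.util.function.IntConsumer;

// Sketch only: runInference is a placeholder for the real inference calls.
final class PeriodicGcLoop {
    // Runs the loop and returns how many times System.gc() was requested.
    static int run(int iterations, int gcEvery, IntConsumer runInference) {
        if (gcEvery <= 0) {
            throw new IllegalArgumentException("gcEvery must be positive");
        }
        int gcRequests = 0;
        for (int i = 1; i <= iterations; i++) {
            runInference.accept(i);
            if (i % gcEvery == 0) {
                // System.gc() is a request to the JVM, not a guarantee that a
                // collection has finished by the time it returns.
                System.gc();
                gcRequests++;
            }
        }
        return gcRequests;
    }

    public static void main(String[] args) {
        int requests = run(1000, 100, i -> { /* inference would go here */ });
        System.out.println("System.gc() requested " + requests + " times"); // prints 10 times
    }
}
```

A larger interval keeps the overhead of full collections down; a smaller one gives unreferenced buffers less time to pile up between collections.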

joel-a · July 27, 2021, 5:26pm

@agibsonccc @saudet Thank you for the advice; I will try it.

That said, I don’t understand why the memory is growing in the first place, to 40982M in my last test, and exceeding the maxPhysicalBytes value (set to 40960M in the same run). Shouldn’t the underlying library realize it is reaching its limit and garbage collect before trying to allocate more memory?

agibsonccc · July 28, 2021, 2:30am

@joel-a that’s not quite how it works. The JVM GC isn’t actually aware of the memory on the GPU. The way our off-heap memory management works is that you either trigger the GC yourself or set a GC frequency. You can see a similar thread here on this very topic: Dl4j cuda 11.2 running out of memory on evaluation on ubuntu 20.04 - #12 by ajmakoni

joel-a · July 29, 2021, 6:07am

@agibsonccc I followed your advice, and now I’m calling System.gc() after every iteration of the loop that runs inference on the DNNs, collects the results in INDArray objects, and performs matrix operations on those INDArrays. I see now that the GPU memory is stable, i.e. it’s not increasing without bound as it was before. However, the process memory keeps increasing constantly even though all my object references are reused in the main loop, i.e. I’m not accumulating results in memory. Any idea why this is happening and resulting in the OOM error detailed previously? What am I missing?

agibsonccc · July 29, 2021, 6:50am

@joel-a did you also try the thread I gave you with the gc auto frequency? Do you have a configuration we could look at maybe?

joel-a · July 29, 2021, 3:13pm

@agibsonccc I looked at the thread. I thought all the gc auto frequency did was call System.gc() with a certain period. Does it do something else I’m not aware of? I’m already calling System.gc() as soon as I finish with the arrays on every iteration. Regarding configuration, do you mean beyond the parameters I’m sending in the java -jar command? Please let me know how I get the configuration values you would like to see.

Thank you for your help!

agibsonccc · July 29, 2021, 10:12pm

@joel-a mainly something we could run to understand things like the size of the model, or something we could profile. If there’s a real issue, knowing the use case would help a ton. If cuDNN is involved, for example, there are extra small components like descriptors that get created.

These kinds of issues are subtle so really just dumping anything we could run would help a lot.
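One low-effort thing to include alongside a reproducer is a per-iteration memory log. The sketch below uses only the standard java.lang.management API; the `MemorySnapshot` class name is made up for illustration. It reports what the JVM itself tracks (heap, non-heap, and direct buffer pools), so native allocations made outside those pools would presumably not show up in these numbers. If the figures stay flat while the process keeps growing, that points at native memory rather than Java objects.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper: summarises the memory the JVM itself accounts for.
final class MemorySnapshot {
    private static final long MB = 1024L * 1024L;

    static String describe() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long heapUsed = memory.getHeapMemoryUsage().getUsed();
        long nonHeapUsed = memory.getNonHeapMemoryUsage().getUsed();

        // The "direct" pool covers java.nio direct ByteBuffers only.
        long directUsed = 0;
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                directUsed = pool.getMemoryUsed();
            }
        }
        return "heap=" + heapUsed / MB + "M"
                + " nonHeap=" + nonHeapUsed / MB + "M"
                + " directBuffers=" + directUsed / MB + "M";
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 3; i++) {
            // One inference iteration would run here.
            System.out.println("iteration " + i + ": " + describe());
        }
    }
}
```

Comparing these lines against the process size reported by the operating system over a few hundred iterations would show which side is growing.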
