Memory leak in ND4J if using the ZGC garbage collector

I know that ZGC is still experimental, but I recently tried it out and it caused what looks like a memory leak.
Here are the logs:

java.lang.OutOfMemoryError: Cannot allocate new LongPointer(1): totalBytes = 512, physicalBytes = 4146M
        at com.badlogic.gdx.backends.lwjgl3.Lwjgl3Application.<init>(
Caused by: java.lang.OutOfMemoryError: Cannot allocate new LongPointer(1): totalBytes = 512, physicalBytes = 4146M
        at org.bytedeco.javacpp.LongPointer.<init>(
        at org.bytedeco.javacpp.LongPointer.<init>(
        at org.nd4j.linalg.cpu.nativecpu.ops.CpuOpContext.setIArguments(
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(
        at org.nd4j.linalg.factory.Nd4j.exec(
        at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.concat(
        at org.nd4j.linalg.factory.Nd4j.concat(
        at org.nd4j.linalg.factory.BaseNDArrayFactory.vstack(
        at org.nd4j.linalg.factory.Nd4j.vstack(
        at com.badlogic.gdx.Game.render(
        at com.badlogic.gdx.backends.lwjgl3.Lwjgl3Window.update(
        at com.badlogic.gdx.backends.lwjgl3.Lwjgl3Application.loop(
        at com.badlogic.gdx.backends.lwjgl3.Lwjgl3Application.<init>(
        ... 2 more
Caused by: java.lang.OutOfMemoryError: Physical memory usage is too high: physicalBytes (4146M) > maxPhysicalBytes (4040M)
        at org.bytedeco.javacpp.Pointer.deallocator(
        at org.bytedeco.javacpp.Pointer.init(
        at org.bytedeco.javacpp.LongPointer.allocateArray(Native Method)
        at org.bytedeco.javacpp.LongPointer.<init>(
        ... 17 more

Specs: Windows 10, ND4J 1.0.0-M1


It’s not a memory leak. Using JavaCPP 1.5.6 should help with this; see "OutOfMemoryError: Physical memory usage is too high: physicalBytes (5300M) > maxPhysicalBytes (5300M)" (Issue #468, bytedeco/javacpp on GitHub).

Thanks for the reply! This error showed up after running the app for more than 3 minutes with ZGC. Without ZGC it works fine, with almost constant memory usage. The other thing I noticed was extremely high RAM usage in the profiler when ZGC was used. I will post screenshots soon.

Btw, how do I update JavaCPP? I only have ND4J as a dependency. Do I need to edit the ND4J pom files for this?


With ZGC:

In this case the heap shoots up to 2 GB, and the app eventually crashes.

Let me know if you need any heap dumps or other things!

Yes, in the pom.xml file like this:

It’s actually the nature of ZGC. It provides low-latency garbage collection, but at a price: it "eats" more RAM. It depends on many factors of course, but I’ve noticed in practice that when the heap is quite volatile and a lot of stuff needs to be collected very often, ZGC starts taking more memory for its own purposes in order to guarantee the low latency (it needs memory space to keep all of its “housekeeping” info).

There’s one thing though: the actual array memory allocations for DL4J occur in native memory, not on the heap. So what you see on your heap usage graphs is only the part used directly by your app (generated/loaded test/train data, service data, etc.). Those graphs won’t show you how much memory the whole application takes. You can find that out by calling org.bytedeco.javacpp.Pointer#physicalBytes().
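
To make the heap vs. native distinction concrete, here is a minimal sketch that prints the JVM heap usage next to the figures JavaCPP tracks. The class name MemoryReport and its helper methods are made up for illustration. It looks up Pointer.physicalBytes(), Pointer.maxPhysicalBytes() and Pointer.totalBytes() through reflection only so that the snippet also compiles and runs without JavaCPP on the classpath; inside an ND4J project you would most likely just call those static methods directly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.reflect.Method;

public class MemoryReport {

    /** Bytes currently used on the Java heap (what heap graphs in a profiler show). */
    public static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /**
     * Calls a static, no-argument, long-returning method on
     * org.bytedeco.javacpp.Pointer by name, e.g. "physicalBytes".
     * Returns -1 if JavaCPP is not on the classpath or the call fails.
     */
    public static long javacppStat(String methodName) {
        try {
            Class<?> pointer = Class.forName("org.bytedeco.javacpp.Pointer");
            Method method = pointer.getMethod(methodName);
            return (Long) method.invoke(null);
        } catch (ReflectiveOperationException | LinkageError e) {
            return -1L;
        }
    }

    /** Formats a byte count in MB, or "unavailable" for the -1 marker. */
    private static String mb(long bytes) {
        return bytes < 0 ? "unavailable" : (bytes / (1024 * 1024)) + " MB";
    }

    /** One-line summary of heap usage and JavaCPP's view of process memory. */
    public static String report() {
        return "heap used = " + mb(heapUsedBytes())
                + ", physicalBytes = " + mb(javacppStat("physicalBytes"))
                + ", maxPhysicalBytes = " + mb(javacppStat("maxPhysicalBytes"))
                + ", totalBytes = " + mb(javacppStat("totalBytes"));
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Logging report() periodically, for example once per second from the render loop, should show whether physicalBytes keeps climbing towards maxPhysicalBytes while the heap figure stays moderate, which is the situation the stack trace above describes.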
