Out of Memory Error

I am using RL4J QLearning and I constantly get an out-of-memory error. The leak doesn't seem to come from my own application: I've tried giving the process a ton of memory and experimenting with WorkspaceConfigurations, and by now I've run out of ideas. This is what my QLearning configuration looks like:

public static QLearning.QLConfiguration TETRIS_QL =
            new QLearning.QLConfiguration( 
                    123,    //Random seed
                    10000,    //Max step By epoch
                    50000000, //Max step
                    5000, //Max size of experience replay
                    32,     //size of batches
                    20,    //target update (hard)
                    10,     //num step noop warmup
                    0.1,   //reward scaling
                    0.99,   //gamma
                    1.0,    //td-error clipping
                    0.1f,   //min epsilon
                    1000,   //num step for eps greedy anneal
                    false    //double DQN
            );	
Exception in thread "main" java.lang.OutOfMemoryError: Cannot allocate new DoublePointer(14): totalBytes = 837K, physicalBytes = 8224M
	at org.bytedeco.javacpp.DoublePointer.<init>(DoublePointer.java:76)
	at org.nd4j.linalg.api.buffer.BaseDataBuffer.<init>(BaseDataBuffer.java:705)
	at org.nd4j.linalg.api.buffer.DoubleBuffer.<init>(DoubleBuffer.java:48)
	at org.nd4j.linalg.api.buffer.factory.DefaultDataBufferFactory.create(DefaultDataBufferFactory.java:288)
	at org.nd4j.linalg.factory.Nd4j.getDataBuffer(Nd4j.java:1492)
	at org.nd4j.linalg.factory.Nd4j.createTypedBuffer(Nd4j.java:1502)
	at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.create(CpuNDArrayFactory.java:344)
	at org.nd4j.linalg.factory.BaseNDArrayFactory.create(BaseNDArrayFactory.java:1350)
	at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:3547)
	at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:3391)
	at org.deeplearning4j.rl4j.util.LegacyMDPWrapper.getInput(LegacyMDPWrapper.java:106)
	at org.deeplearning4j.rl4j.util.LegacyMDPWrapper.step(LegacyMDPWrapper.java:68)
	at org.deeplearning4j.rl4j.learning.sync.qlearning.discrete.QLearningDiscrete.trainStep(QLearningDiscrete.java:146)
	at org.deeplearning4j.rl4j.learning.sync.qlearning.QLearning.trainEpoch(QLearning.java:124)
	at org.deeplearning4j.rl4j.learning.sync.SyncLearning.train(SyncLearning.java:96)
	at tetris.TetrisWindow.train(TetrisWindow.java:174)
	at tetris.TetrisWindow.init(TetrisWindow.java:101)
	at tetris.TetrisWindow.run(TetrisWindow.java:60)
	at tetris.TetrisWindow.<init>(TetrisWindow.java:56)
	at tetris.TetrisWindow.main(TetrisWindow.java:52)
Caused by: java.lang.OutOfMemoryError: Physical memory usage is too high: physicalBytes (8224M) > maxPhysicalBytes (8192M)
	at org.bytedeco.javacpp.Pointer.deallocator(Pointer.java:659)
	at org.bytedeco.javacpp.Pointer.init(Pointer.java:127)
	at org.bytedeco.javacpp.DoublePointer.allocateArray(Native Method)
	at org.bytedeco.javacpp.DoublePointer.<init>(DoublePointer.java:68)
	... 19 more
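
For completeness, the limit being hit here is JavaCPP's off-heap tracking (maxPhysicalBytes), not the JVM heap. "A ton of memory" means I have been launching with variations of the following flags (the values shown are just examples of what I tried):

java -Xmx2G -Dorg.bytedeco.javacpp.maxbytes=6G -Dorg.bytedeco.javacpp.maxphysicalbytes=8G ...

No matter how high I set the limits, physical memory usage keeps growing during training until it hits them again.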

What does the network you've configured for it look like? The QLearning configuration on its own isn't enough to tell us anything.

Thanks for the quick reply! It's probably just an error I made, but I can't seem to find it; I'm pretty new to DL. Here is everything I use for my Q-learning neural network. I am using version 1.0.0-beta6 and OpenJDK 12.0.2.

public class TetrisEnvironment implements MDP<TetrisState, Integer, DiscreteSpace> {
	
	private TetrisWindow window;
	private DeferredRenderer renderer;
	
	private DiscreteSpace discreteSpace;
	private ObservationSpace<TetrisState> observationSpace;
	
	private TetrisGame tetrisGame;
	private List<Integer> actions = new ArrayList<Integer>();
	private boolean pause;
	
	public TetrisEnvironment(TetrisWindow window, DeferredRenderer renderer, boolean pause) {
		this.tetrisGame = new TetrisGame(window, renderer);
		this.window = window;
		this.renderer = renderer;
		this.pause = pause;
		
		if (renderer != null) {
			tetrisGame.initGUI(renderer.getGUIRootComponent());
		}
		
		// the different amount of outputs
		this.discreteSpace = new DiscreteSpace(4);
		this.observationSpace = new ArrayObservationSpace<TetrisState>(new int[] { 14 });
	}

	@Override
	public ObservationSpace<TetrisState> getObservationSpace() {
		return observationSpace;
	}

	@Override
	public DiscreteSpace getActionSpace() {
		return discreteSpace;
	}

	@Override
	public TetrisState reset() {
		tetrisGame.reset();
		actions.clear();
		
		// log native memory usage at the start of each episode
		System.out.println("free Memory: " + Pointer.formatBytes(Pointer.availablePhysicalBytes()) + "/"
				+ Pointer.formatBytes(Pointer.totalPhysicalBytes()));
		
		return new TetrisState(tetrisGame);
	}

	@Override
	public void close() {
	}

	@Override
	public StepReply<TetrisState> step(Integer action) {
		double reward = tetrisGame.makeAction(action);
		actions.add(action);
		StepReply<TetrisState> reply = new StepReply<TetrisState>(new TetrisState(tetrisGame), reward, tetrisGame.lost, null);
		
		if (renderer != null) {
			renderer.render();
			window.getWindow().swapBuffers();
			window.getWindow().update();
			window.getWindow().pollEvents();
			
			if (pause) {
				try {
					Thread.sleep(50);
				} catch (InterruptedException e) {
					e.printStackTrace();
				}
			}
		}
		
		return reply;
	}

	@Override
	public boolean isDone() {
		return tetrisGame.lost;
	}

	@Override
	public MDP<TetrisState, Integer, DiscreteSpace> newInstance() {
		return new TetrisEnvironment(window, renderer, pause);
	}
	
	public List<Integer> getActions() {
		return actions;
	}

}
public class TetrisState implements Encodable {
	
	private double[] data;
	
	public TetrisState(TetrisGame tetris) {
		int index = 0;
		
		for (int i = 0; i < tetris.tiles.length; i++) {
			if (tetris.tiles[i] == tetris.currentTile) {
				index = i;
			}
		}

		data = new double[4 + tetris.playfield.length];
		data[0] = index / (double) tetris.tiles.length;
		data[1] = tetris.tileVariation / 4.0;
		data[2] = tetris.tilePosX / (double) tetris.playfield.length;
		data[3] = tetris.tilePosY / (double) tetris.playfield[0].length;
		
		for (int x = 0; x < tetris.playfield.length; x++) {
			int height = tetris.playfield[x].length;
			
			// find the row index of the topmost occupied cell in this column
			// (assuming playfield holds int cell values where 0 means empty)
			for (int y = 0; y < tetris.playfield[x].length; y++) {
				if (tetris.playfield[x][y] != 0) {
					height = y;
					break;
				}
			}
			
			data[4 + x] = height / (double) tetris.playfield[0].length;
		}
	}

	@Override
	public double[] toArray() {
		return data;
	}
	
}

And this is how I train the network:

public static QLearning.QLConfiguration TETRIS_QL =
            new QLearning.QLConfiguration( 
                    123,    //Random seed
                    10000,    //Max step By epoch
                    50000000, //Max step
                    5000, //Max size of experience replay
                    32,     //size of batches
                    20,    //target update (hard)
                    10,     //num step noop warmup
                    0.1,   //reward scaling
                    0.99,   //gamma
                    1.0,    //td-error clipping
                    0.1f,   //min epsilon
                    1000,   //num step for eps greedy anneal
                    true    //double DQN
            );

	public static DQNFactoryStdDense.Configuration TETRIS_NET =
	        DQNFactoryStdDense.Configuration.builder()
	            .updater(new Adam(0.001)).numHiddenNodes(32).numLayer(9).build();

	private void train() {
		try {
			// define the mdp (the custom Tetris environment, pause disabled)
			TetrisEnvironment mdp = new TetrisEnvironment(this, renderer, false);
			// define the training
			QLearningDiscreteDense<TetrisState> dql = new QLearningDiscreteDense<TetrisState>(mdp, TETRIS_NET, TETRIS_QL);

			// train
			dql.train();

			// get the final policy
			DQNPolicy<TetrisState> pol = dql.getPolicy();

			// serialize and save (serialization showcase, but not required)
			pol.save("pol1");

			// close the mdp
			mdp.close();
		} catch (IOException e) {
			e.printStackTrace();
		}
	}
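
For completeness, this is roughly how I load the saved policy back to watch it play afterwards (playSaved is just a helper I added; as far as I understand, DQNPolicy.load and play are the standard RL4J calls for this):

	private void playSaved() {
		try {
			// rebuild the environment with the pause enabled so the episode runs at a watchable speed
			TetrisEnvironment mdp = new TetrisEnvironment(this, renderer, true);

			// load the policy saved by train() and let it play one episode
			DQNPolicy<TetrisState> pol = DQNPolicy.load("pol1");
			double reward = pol.play(mdp);
			System.out.println("episode reward: " + reward);

			mdp.close();
		} catch (IOException e) {
			e.printStackTrace();
		}
	}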

How large is your playing field?

Other than that, I don’t see anything that looks like an obvious reason why it would use that much memory.

But you should try using the snapshot builds, as there has been quite a lot of work on RL4J since the last release, and maybe you are running into something that has already been fixed.
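
If you are on Maven, using snapshots roughly means adding the Sonatype snapshot repository and switching your DL4J/RL4J dependencies to the current snapshot version (1.0.0-SNAPSHOT at the time of writing), something along these lines:

<repositories>
    <repository>
        <id>sonatype-snapshots</id>
        <url>https://oss.sonatype.org/content/repositories/snapshots</url>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>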

It is 10, so the array's size is 14. I will give it a try and let you know if it worked. :)

It seems like the snapshot has fixed the issue, so I guess it was a memory leak of some kind beforehand. Thanks for the quick help! :)

Never mind, it only reduced the memory leak; it's not gone. It still throws the out-of-memory error after 1 hour of training.

Does it also happen with ALE and Cartpole? Or only with your custom environment?

I tested it with a completely empty environment and the leak was gone, so apparently I messed up some stuff in my OpenGL renderer: it allocated native memory that I forgot to free. My bad. Sorry for wasting your time, and thanks for the fast help anyway, much appreciated!
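
For reference, the "completely empty" environment I used to isolate it was essentially the following (a stripped-down sketch with made-up class names, using the same RL4J imports as the Tetris environment above; constant all-zero observations and a fixed episode length):

public class EmptyState implements Encodable {

	// same observation shape as TetrisState, but always all zeros
	@Override
	public double[] toArray() {
		return new double[14];
	}
}

public class EmptyEnvironment implements MDP<EmptyState, Integer, DiscreteSpace> {

	private final DiscreteSpace actionSpace = new DiscreteSpace(4);
	private final ObservationSpace<EmptyState> observationSpace =
			new ArrayObservationSpace<EmptyState>(new int[] { 14 });
	private int steps;

	@Override
	public ObservationSpace<EmptyState> getObservationSpace() {
		return observationSpace;
	}

	@Override
	public DiscreteSpace getActionSpace() {
		return actionSpace;
	}

	@Override
	public EmptyState reset() {
		steps = 0;
		return new EmptyState();
	}

	@Override
	public void close() {
	}

	@Override
	public StepReply<EmptyState> step(Integer action) {
		// no game logic and no rendering: constant reward and observation
		steps++;
		return new StepReply<EmptyState>(new EmptyState(), 0.0, isDone(), null);
	}

	@Override
	public boolean isDone() {
		return steps >= 1000;
	}

	@Override
	public MDP<EmptyState, Integer, DiscreteSpace> newInstance() {
		return new EmptyEnvironment();
	}
}

With this environment the memory usage stayed flat, which pointed me back at my own renderer code.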