List<INDArray> featureList = new ArrayList<>();
When I add INDArray instances to featureList, they are cached in GPU RAM. How can I cache them in HOST RAM instead?
@SidneyLann everything is already bidirectionally allocated/cached automatically, in an async manner. The only thing you can really control is when the data is synced. If you need to make sure the data is synced, just call:
Nd4j.getExecutioner().commit();
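For context, here is a minimal sketch of where that call fits (assuming ND4J is on the classpath; the array contents are arbitrary):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

INDArray arr = Nd4j.createFromArray(1.0f, 2.0f, 3.0f);
arr.addi(1.0);                  // the op may execute asynchronously on the device
Nd4j.getExecutioner().commit(); // block until all queued ops have completed
// the host-side buffer of arr is now guaranteed to be in sync
```

After commit() returns, reading the array from the Java side reflects all operations issued before the call.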
List<INDArray> featureList = new ArrayList<>();
INDArray output = sameDiff.output(…);
featureList.add(output);
This output INDArray is NOT released from GPU RAM, and after a while I hit the error: Allocation failed: [[DEVICE] allocation failed; Error code: [2]]. How can I release it from GPU RAM and cache it only in HOST RAM? Thanks.
Ok, so what you actually want to do is release the memory from the gpu. Could you elaborate on your end-to-end use case a bit?
Also, what’s the actual root problem? You’re running out of gpu memory during training?
Let’s try to discuss the actual problem you’re having rather than what you think a good workaround might be. We might be able to come up with a few different things you can try.
My work steps:
- Use a BERT model via SameDiff to get an output (the minibatch size can only be 1)
- Concat 32 outputs from step 1 to train my model
So I need to cache the outputs from step 1, but only in HOST RAM, not in GPU RAM. How can I do that? Thanks.
If I comment out the line featureList.add(output);, the output is not cached in HOST RAM or GPU RAM, and I do NOT hit the error: Allocation failed: [[DEVICE] allocation failed; Error code: [2]].
@SidneyLann if all you want to do is save memory, then I would suggest calling System.gc() each epoch if you are pushing the limits of the gpu.
That will mostly ensure GC happens quickly.
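As a sketch, the per-epoch call could look like this (the epoch count and the training body are placeholders for your own code):

```java
int nEpochs = 3;   // placeholder: your real epoch count
int completed = 0;
for (int epoch = 0; epoch < nEpochs; epoch++) {
    // ... fit the model / run your training step for one epoch here ...
    System.gc(); // hint the JVM to reclaim unreferenced INDArrays promptly
    completed++;
}
```

System.gc() is only a hint to the JVM, but in practice it helps release off-heap buffers backing garbage INDArrays sooner.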
Beyond that, I would look into running things within a configured workspace if you want to do that. See our docs for more about memory management: https://deeplearning4j.konduit.ai/config/config-memory
The bottom section should be relevant for you.
basicConfig = WorkspaceConfiguration.builder()
        .policyAllocation(AllocationPolicy.STRICT)
        .policyLearning(LearningPolicy.FIRST_LOOP)
        .policyMirroring(MirroringPolicy.HOST_ONLY)
        .policySpill(SpillPolicy.EXTERNAL)
        .build();
...
    outIdxsArr = Nd4j.createFromArray(outIdxs);
    outMaskArr = Nd4j.createFromArray(outMask);
    outSegmentIdArr = Nd4j.createFromArray(outSegment);
    placeholders.put("Placeholder", outIdxsArr);
    placeholders.put("Placeholder_2", outSegmentIdArr);
    placeholders.put("Placeholder_1", outMaskArr);
    output = output1(placeholders);
    featureTmp.add(output);
}
try (MemoryWorkspace ws = Nd4j.getWorkspaceManager().getAndActivateWorkspace(basicConfig, "OTHER_ID")) {
    features.add(Nd4j.vstack(featureTmp).detach());
    labels.add(Nd4j.createFromArray(outLabel).detach());
} catch (Exception e) {
    e.printStackTrace();
}
Are the INDArrays in features and labels allocated ONLY in CPU RAM? How can I allocate INDArrays ONLY on the JVM heap, not off-heap?
I have 32g of CPU RAM and 11g of GPU RAM, but off-heap only uses 11g and the JVM heap uses < 3g here; I want to use more CPU RAM to cache INDArrays.
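One knob that may help here (this is a sketch, so double-check against the memory docs linked above): ND4J's off-heap limit is controlled by the JavaCPP system properties `org.bytedeco.javacpp.maxbytes` and `org.bytedeco.javacpp.maxphysicalbytes`, so it can be raised independently of the JVM heap. The jar name and main class below are hypothetical placeholders:

```shell
# Hypothetical launch: ~3g JVM heap, up to 20g off-heap for INDArray buffers
java -Xmx3g \
     -Dorg.bytedeco.javacpp.maxbytes=20G \
     -Dorg.bytedeco.javacpp.maxphysicalbytes=24G \
     -cp myapp.jar com.example.TrainingJob
```

With a higher maxbytes, host-side (off-heap) buffers can use more of your 32g of CPU RAM instead of being capped near the heap size.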