PCA on GPU - Insufficient memory

sogawa-sps · August 19, 2020, 9:07pm

Hi All,

I tried to perform PCA factorization on a relatively small data set using the GPU but ran into an error. Code sample:

INDArray data = Nd4j.rand(1000,31000);
PCA.pca_factor(data, 50, false);

Output:

[main] INFO org.nd4j.linalg.factory.Nd4jBackend - Loaded [JCublasBackend] backend
[main] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 32
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 10]
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [4]; Memory: [7.1GB];
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [CUBLAS]
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - ND4J CUDA build version: 10.2.89
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - CUDA device 0: [GeForce GTX 1060 6GB]; cc: [6.1]; Total memory: [6442450944]
Exception in thread "main" java.lang.RuntimeException: Allocation failed: [[DEVICE] allocation failed; Error code: [2]]
	at org.nd4j.nativeblas.OpaqueDataBuffer.allocateDataBuffer(OpaqueDataBuffer.java:79)
	at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.initPointers(BaseCudaDataBuffer.java:389)
	...

And while it’s not completely clear from the error message, it looks like insufficient memory is the cause.
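
A rough size estimate is consistent with a memory problem. The sketch below is only back-of-the-envelope arithmetic in plain Java. It assumes that the factorization materializes a dense square matrix over the 31000 columns, which the output above does not confirm, and that elements are 4-byte floats or 8-byte doubles. The class and method names are hypothetical and not part of ND4J.

```java
public class PcaMemoryEstimate {

    // Bytes needed for a dense rows x cols matrix with the given element size.
    static long denseBytes(long rows, long cols, int bytesPerElement) {
        return rows * cols * bytesPerElement;
    }

    public static void main(String[] args) {
        long rows = 1000;
        long cols = 31000;
        // Total device memory as reported in the log above.
        long deviceTotal = 6442450944L;

        // The input itself is small: 124,000,000 bytes as float.
        long input = denseBytes(rows, cols, 4);

        // ASSUMPTION: an intermediate cols x cols matrix is allocated on the device.
        long squareFloat = denseBytes(cols, cols, 4);   // 3,844,000,000 bytes
        long squareDouble = denseBytes(cols, cols, 8);  // 7,688,000,000 bytes

        System.out.println("input (float):        " + input + " bytes");
        System.out.println("cols x cols (float):  " + squareFloat + " bytes");
        System.out.println("cols x cols (double): " + squareDouble + " bytes");
        System.out.println("device total:         " + deviceTotal + " bytes");
    }
}
```

Under that assumption, a single float matrix of that shape would take more than half of the reported device memory before any solver workspace is counted, and a double one would not fit at all.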

[Image: 2020-08-19-ND4J-PCA]

The CPU backend works just fine.

Could somebody please advise whether there is a way to run PCA on the GPU in this case? Does it make sense to use that shared memory, and if so, how? How should such cases be tackled in general?

agibsonccc · August 21, 2020, 4:28am

Hi, that doesn’t seem right. Do you have a test case we can see?

sogawa-sps · August 21, 2020, 11:53pm

Hi!

It reproduces on any INDArray that is big enough, like the one in the code snippet from the post above (ready-to-run sample: GitHub - sogawa-sps/testnd4j). In my real-world app I have a sparse matrix (density less than 0.001), but the issue reproduces on a dense, randomly filled matrix in the same way.
