Hi All,
I tried to perform PCA factorization on a relatively small data set using the GPU but ran into an error. Code sample:
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dimensionalityreduction.PCA;
import org.nd4j.linalg.factory.Nd4j;

INDArray data = Nd4j.rand(1000, 31000);  // 1000 samples, 31000 features
PCA.pca_factor(data, 50, false);         // reduce to 50 components, no normalization
Output:
[main] INFO org.nd4j.linalg.factory.Nd4jBackend - Loaded [JCublasBackend] backend
[main] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 32
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 10]
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [4]; Memory: [7.1GB];
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [CUBLAS]
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - ND4J CUDA build version: 10.2.89
[main] INFO org.nd4j.linalg.jcublas.JCublasBackend - CUDA device 0: [GeForce GTX 1060 6GB]; cc: [6.1]; Total memory: [6442450944]
Exception in thread "main" java.lang.RuntimeException: Allocation failed: [[DEVICE] allocation failed; Error code: [2]]
at org.nd4j.nativeblas.OpaqueDataBuffer.allocateDataBuffer(OpaqueDataBuffer.java:79)
at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.initPointers(BaseCudaDataBuffer.java:389)
...
While it's not completely clear from the error message, memory looks like the cause: CUDA error code 2 is cudaErrorMemoryAllocation, i.e. the device ran out of memory.
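For scale, here is a rough back-of-envelope estimate. I'm assuming float32 storage and that pca_factor internally forms a cols x cols covariance (Gram) matrix — that internal detail is my assumption, not something I've verified in the ND4J source:

```java
public class PcaMemoryEstimate {
    public static void main(String[] args) {
        long rows = 1000L, cols = 31000L, bytesPerElement = 4L; // float32

        long dataBytes = rows * cols * bytesPerElement; // the input matrix
        long covBytes  = cols * cols * bytesPerElement; // hypothetical cols x cols covariance

        System.out.printf("data: %.2f MB, covariance: %.2f GB%n",
                dataBytes / 1e6, covBytes / 1e9);
        // data: 124.00 MB, covariance: 3.84 GB
    }
}
```

So even though the input itself is only ~124 MB, a single 31000 x 31000 intermediate is already ~3.84 GB (or ~7.7 GB in float64), which together with workspaces and other buffers could plausibly exhaust the 6 GB card.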
The CPU backend works just fine.
Could somebody please advise whether there is a way to run PCA on the GPU in this case? Does it make sense to use shared memory here, and if so, how? And how should such cases be tackled in general?