These are my Gradle dependencies:
implementation 'org.deeplearning4j:deeplearning4j-core:1.0.0-beta7'
implementation 'org.deeplearning4j:deeplearning4j-ui:1.0.0-beta7'
implementation 'org.deeplearning4j:deeplearning4j-zoo:1.0.0-beta7'
implementation 'ch.qos.logback:logback-classic:1.2.3'
implementation 'org.projectlombok:lombok:1.18.12'
// GPU
implementation group: 'org.nd4j', name: 'nd4j-cuda-10.2-platform', version: '1.0.0-beta7'
implementation group: 'org.deeplearning4j', name: 'deeplearning4j-cuda-10.2', version: '1.0.0-beta7'
17:25:52.497 [main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [CUBLAS]
17:25:52.516 [main] INFO org.nd4j.linalg.jcublas.JCublasBackend - ND4J CUDA build version: 10.2.89
17:25:52.517 [main] INFO org.nd4j.linalg.jcublas.JCublasBackend - CUDA device 0: [GeForce RTX 2070]; cc: [7.5]; Total memory: [8589934592]
17:25:53.187 [main] INFO org.deeplearning4j.nn.graph.ComputationGraph - Starting ComputationGraph with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
Caused by: org.nd4j.linalg.exception.ND4JIllegalStateException: Op [histogram] X argument uses leaked workspace pointer from workspace [WS_LAYER_WORKING_MEM]: Workspace the array was defined in is no longer open.
All open workspaces:
Caused by: org.nd4j.linalg.exception.ND4JIllegalStateException: Op [histogram] X argument uses leaked workspace pointer from workspace [WS_LAYER_WORKING_MEM]: Workspace the array was defined in is no longer open.
Then I tried adding:
.trainingWorkspaceMode(WorkspaceMode.NONE)
.inferenceWorkspaceMode(WorkspaceMode.NONE)
and it didn't crash, but all the values in the evaluation were NaN.
I wanted to try HALF because the model is meant to run on an Android device, and I wanted to check whether it gives any improvement in execution speed.
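As background for the HALF experiment, here is a minimal plain-Java sketch of how values behave in IEEE 754 half precision. It is not DL4J code and assumes nothing about how ND4J implements HALF internally; the class and method names (HalfPrecisionDemo, floatToHalf, halfToFloat, roundTrip) are hypothetical. Half precision cannot represent anything above 65504, so larger values become Infinity, and Infinity - Infinity is NaN. Whether that is what produces the NaNs in the evaluation above is only a guess.

```java
public class HalfPrecisionDemo {

    // Convert a float to IEEE 754 binary16 bits, rounding to nearest even.
    static short floatToHalf(float f) {
        int bits = Float.floatToIntBits(f);
        int sign = (bits >>> 16) & 0x8000;
        int exp = (bits >>> 23) & 0xFF;
        int mant = bits & 0x7FFFFF;
        if (exp == 0xFF) {                       // Infinity or NaN
            return (short) (sign | 0x7C00 | (mant != 0 ? 0x200 : 0));
        }
        int e = exp - 127 + 15;                  // re-bias the exponent for half
        if (e >= 0x1F) {
            return (short) (sign | 0x7C00);      // too large: overflow to Infinity
        }
        if (e <= 0) {                            // subnormal half, or zero
            if (e < -10) {
                return (short) sign;             // too small: flush to signed zero
            }
            mant |= 0x800000;                    // restore the implicit leading 1
            int shift = 14 - e;
            int half = mant >> shift;
            int rem = mant & ((1 << shift) - 1);
            int halfway = 1 << (shift - 1);
            if (rem > halfway || (rem == halfway && (half & 1) == 1)) {
                half++;
            }
            return (short) (sign | half);
        }
        int half = (e << 10) | (mant >> 13);
        int rem = mant & 0x1FFF;
        if (rem > 0x1000 || (rem == 0x1000 && (half & 1) == 1)) {
            half++;                              // a carry may roll over into Infinity
        }
        return (short) (sign | half);
    }

    // Convert IEEE 754 binary16 bits back to a float (always exact).
    static float halfToFloat(short h) {
        int sign = (h & 0x8000) << 16;
        int exp = (h >>> 10) & 0x1F;
        int mant = h & 0x3FF;
        if (exp == 0x1F) {                       // Infinity or NaN
            return Float.intBitsToFloat(sign | 0x7F800000 | (mant << 13));
        }
        if (exp == 0) {                          // subnormal: mant * 2^-24
            float v = mant * (1.0f / (1 << 24));
            return sign != 0 ? -v : v;
        }
        return Float.intBitsToFloat(sign | ((exp - 15 + 127) << 23) | (mant << 13));
    }

    // What a float value looks like after being stored as half precision.
    static float roundTrip(float f) {
        return halfToFloat(floatToHalf(f));
    }

    public static void main(String[] args) {
        float big = roundTrip(70000f);
        System.out.println(big);                 // Infinity (above the half maximum of 65504)
        System.out.println(big - big);           // NaN
        System.out.println(roundTrip(2049f));    // 2048.0 (only 11 significant bits)
        System.out.println(roundTrip(1e-8f));    // 0.0 (underflow)
    }
}
```

If something like this is the cause, large activations, losses, or gradients would be the first place to look, since they are the values most likely to exceed the half-precision range.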
“I would recommend optimizing it for inference instead after training”. I am not sure how to do this. Do you have any links or examples so I can learn more about it?