RecurrentAttentionLayer error on GPU

When using RecurrentAttentionLayer on the GPU, I get this exception:

Exception in thread "main" java.lang.RuntimeException: Op [multi_head_dot_product_attention_bp] execution failed
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.exec(CudaExecutioner.java:2316)
at org.nd4j.linalg.factory.Nd4j.exec(Nd4j.java:6599)
at org.nd4j.autodiff.samediff.internal.InferenceSession.doExec(InferenceSession.java:483)
at org.nd4j.autodiff.samediff.internal.InferenceSession.getOutputs(InferenceSession.java:217)
at org.nd4j.autodiff.samediff.internal.InferenceSession.getOutputs(InferenceSession.java:67)
at org.nd4j.autodiff.samediff.internal.AbstractSession.output(AbstractSession.java:380)
at org.nd4j.autodiff.samediff.SameDiff.directExecHelper(SameDiff.java:2601)
at org.nd4j.autodiff.samediff.SameDiff.batchOutputHelper(SameDiff.java:2569)
at org.nd4j.autodiff.samediff.SameDiff.calculateGradientsAndOutputs(SameDiff.java:4049)
at org.nd4j.autodiff.samediff.SameDiff.calculateGradients(SameDiff.java:4010)
at org.deeplearning4j.nn.layers.samediff.SameDiffLayer.backpropGradient(SameDiffLayer.java:169)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doBackward(LayerVertex.java:149)
at org.deeplearning4j.nn.graph.ComputationGraph.calcBackpropGradients(ComputationGraph.java:2713)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1382)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1342)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:170)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1166)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1116)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1026)

Caused by: java.lang.RuntimeException: [DEVICE] allocation failed; Error code: [2]
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.exec(CudaExecutioner.java:2500)
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.exec(CudaExecutioner.java:2306)
... 22 more

The layer is added with code like this:

graphBuilder = graphBuilder.addLayer("attention1",
        new RecurrentAttentionLayer.Builder()
                .activation(Activation.SOFTSIGN)
                .updater(new Adam(learningRateSchedule))
                .nHeads(5)
                .headSize(40)
                .nIn(sentenceNOut)
                .nOut(sentenceNOut)
                .build(),
        lastLayerName);
lastLayerName = "attention1";

It runs OK without the RecurrentAttentionLayer(s).

The DL4J version is beta-6.
The environment is the same as in "No CUDA devices were found".

This looks like you don't have enough memory again. Attention uses quite a lot of memory, and I've struggled even with the 11 GB of VRAM that I have.
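(CUDA error code 2 is cudaErrorMemoryAllocation, i.e. the device ran out of memory.) For a rough sense of scale, here's a back-of-envelope calculation; every concrete number in it is an illustrative assumption, not something taken from your post:

long miniBatch = 32;        // assumed minibatch size
long nHeads = 5;            // from the Builder in your snippet
long seqLen = 1000;         // assumed time-series length
long bytesPerFloat = 4;     // FP32

// Over a full sequence, the attention weights alone amount to roughly
// miniBatch * nHeads * seqLen * seqLen values, before counting the
// query/key/value projections and the activations kept for backprop.
long scoreBytes = miniBatch * nHeads * seqLen * seqLen * bytesPerFloat;
System.out.printf("~%.2f GB for attention scores alone%n", scoreBytes / 1e9);

That quadratic term in the sequence length is why a modest-looking layer can exhaust several GB of VRAM during backprop.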

Yeah, the attention ops need some polishing to reduce the memory footprint they use… it's a known issue.
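In the meantime, the usual levers are a smaller minibatch, fewer or smaller heads, and asking ND4J to garbage-collect cached off-heap buffers more eagerly. A minimal sketch; the 10-second window below is just an example value, not a tuned recommendation:

import org.nd4j.linalg.factory.Nd4j;

// Run ND4J's periodic GC so cached buffers are released more eagerly.
// This trades some throughput for a smaller resident memory footprint.
Nd4j.getMemoryManager().setAutoGcWindow(10000);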

It's indeed short of memory.
When I use only one attention layer instead of the two I had previously, it runs.
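For reference, the working version is just the snippet from my first post, added once instead of twice:

graphBuilder = graphBuilder.addLayer("attention1",
        new RecurrentAttentionLayer.Builder()
                .activation(Activation.SOFTSIGN)
                .updater(new Adam(learningRateSchedule))
                .nHeads(5)
                .headSize(40)
                .nIn(sentenceNOut)
                .nOut(sentenceNOut)
                .build(),
        lastLayerName);
lastLayerName = "attention1";
// No second attention layer is added after this point.

If memory had still been tight, a smaller headSize (e.g. 20 instead of 40) would have been the next thing to try; that value is a guess, not something I benchmarked.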