RuntimeException in Bidirectional training

Bidirectional bidirectional = new Bidirectional.Builder().mode(Bidirectional.Mode.ADD)
        .rnnLayer(new LSTM.Builder().nOut(250).l2(0.003495283883324279).dropOut(0.6816483486423628).activation(Activation.TANH)
            .updater(new Adam(new StepSchedule(ScheduleType.EPOCH, learningRate, decayRate, step))).build())
        .build();

Can Bidirectional be used this way? A RuntimeException is thrown during training:

[main] WARN org.deeplearning4j.nn.layers.recurrent.LSTMHelpers - MKL/CuDNN execution failed - falling back on built-in implementation
java.lang.RuntimeException: cuDNN status = 8: CUDNN_STATUS_EXECUTION_FAILED
at org.deeplearning4j.cuda.BaseCudnnHelper.checkCudnn(BaseCudnnHelper.java:48)
at org.deeplearning4j.cuda.recurrent.CudnnLSTMHelper.activate(CudnnLSTMHelper.java:469)
at org.deeplearning4j.nn.layers.recurrent.LSTMHelpers.activateHelper(LSTMHelpers.java:205)
at org.deeplearning4j.nn.layers.recurrent.LSTM.activateHelper(LSTM.java:177)
at org.deeplearning4j.nn.layers.recurrent.LSTM.activate(LSTM.java:147)
at org.deeplearning4j.nn.layers.recurrent.BidirectionalLayer.activate(BidirectionalLayer.java:201)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:111)
at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsInWS(ComputationGraph.java:2136)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1373)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1342)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:170)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1166)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1116)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1083)

Training completes successfully on the CPU, but fails on the GPU.
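
One workaround I can try, assuming the LSTM builder exposes helperAllowFallback(boolean) in this DL4J version (that is an assumption on my side, not something I have verified), is to let the layer fall back to the built-in implementation instead of failing:

import org.deeplearning4j.nn.conf.layers.LSTM;
import org.nd4j.linalg.activations.Activation;

// Assumption: helperAllowFallback(true) makes DL4J use the built-in LSTM implementation
// whenever the cuDNN helper fails, instead of propagating the cuDNN exception.
// Availability of this builder method depends on the DL4J version in use.
LSTM lstm = new LSTM.Builder()
        .nOut(250)
        .activation(Activation.TANH)
        .helperAllowFallback(true)
        .build();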

@SidneyLann you likely could without cuDNN. Could you give me something end to end (e.g. not just the snippet) that I could run to verify this behavior? Thanks!

// Bidirectional LSTM wrapper; forward and backward activations are combined with ADD
Bidirectional bidirectional = new Bidirectional.Builder().mode(Bidirectional.Mode.ADD)
        .rnnLayer(new LSTM.Builder().nOut(250).l2(0.003495283883324279).dropOut(0.6816483486423628)
            .activation(Activation.TANH)
            .updater(new Adam(new StepSchedule(ScheduleType.EPOCH, learningRate, decayRate, step)))
            .build())
        .build();

// Single hidden (bidirectional LSTM) layer feeding an RNN output layer with weighted MCXENT loss
ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345).weightInit(WeightInit.XAVIER).graphBuilder()
        .setInputTypes(InputType.recurrent(NerUtil.MAX_SENTENCE_LENGTH * 6))
        .addInputs("input")
        .addLayer("lstm", bidirectional, "input")
        .addLayer("output",
            new RnnOutputLayer.Builder(new LossMCXENT(Nd4j.create(NerUtil.labelWeight1d())))
                .nOut(NerUtil.getLabelNum()).l2(0.006332749296897214).dropOut(0.45063086836326066)
                .activation(Activation.SOFTMAX)
                .updater(new Adam(new StepSchedule(ScheduleType.EPOCH, learningRate, decayRate, step)))
                .build(),
            "lstm")
        .setOutputs("output").backpropType(BackpropType.Standard).build();

net = new ComputationGraph(conf);
net.init();

I use BERT output as my input; the model has just one hidden layer and does an NER job.
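
In case it helps reproduce this end to end, here is a minimal sketch of a single training step with random placeholder data; the sizes are assumptions on my side (the real values come from NerUtil and the BERT features), and nIn must match the size passed to InputType.recurrent(...):

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// Placeholder sizes; the real values come from NerUtil and the BERT pipeline
int miniBatch = 4;
int nIn = 128 * 6;      // must equal the size passed to InputType.recurrent(...)
int nOut = 9;           // stands in for NerUtil.getLabelNum()
int timeSteps = 64;     // any sequence length works for a smoke test

// Random features shaped [miniBatch, nIn, timeSteps], the layout RNN layers expect
INDArray features = Nd4j.rand(new int[]{miniBatch, nIn, timeSteps});

// One-hot labels shaped [miniBatch, nOut, timeSteps] for the RnnOutputLayer
INDArray labels = Nd4j.zeros(new int[]{miniBatch, nOut, timeSteps});
java.util.Random rng = new java.util.Random(12345);
for (int i = 0; i < miniBatch; i++) {
    for (int t = 0; t < timeSteps; t++) {
        labels.putScalar(new int[]{i, rng.nextInt(nOut), t}, 1.0);
    }
}

// A single fit call runs the Bidirectional LSTM forward pass, which is where the cuDNN helper is used on GPU
net.fit(new INDArray[]{features}, new INDArray[]{labels});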