RuntimeException in Bidirectional training

Bidirectional bidirectional = new Bidirectional.Builder().mode(Bidirectional.Mode.ADD)
        .rnnLayer(new LSTM.Builder().nOut(250).l2(0.003495283883324279).dropOut(0.6816483486423628).activation(Activation.TANH)
            .updater(new Adam(new StepSchedule(ScheduleType.EPOCH, learningRate, decayRate, step))).build())
        .build();

Can Bidirectional be used this way? A RuntimeException is thrown during training:

[main] WARN org.deeplearning4j.nn.layers.recurrent.LSTMHelpers - MKL/CuDNN execution failed - falling back on built-in implementation
java.lang.RuntimeException: cuDNN status = 8: CUDNN_STATUS_EXECUTION_FAILED
at org.deeplearning4j.cuda.BaseCudnnHelper.checkCudnn(BaseCudnnHelper.java:48)
at org.deeplearning4j.cuda.recurrent.CudnnLSTMHelper.activate(CudnnLSTMHelper.java:469)
at org.deeplearning4j.nn.layers.recurrent.LSTMHelpers.activateHelper(LSTMHelpers.java:205)
at org.deeplearning4j.nn.layers.recurrent.LSTM.activateHelper(LSTM.java:177)
at org.deeplearning4j.nn.layers.recurrent.LSTM.activate(LSTM.java:147)
at org.deeplearning4j.nn.layers.recurrent.BidirectionalLayer.activate(BidirectionalLayer.java:201)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:111)
at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsInWS(ComputationGraph.java:2136)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1373)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1342)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:170)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1166)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1116)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1083)

Training completes successfully on the CPU, but fails on the GPU.
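
One workaround I can try, assuming the LSTM builder exposes helperAllowFallback(boolean) in this DL4J version (that is an assumption on my side, not something I have verified), is to let the layer fall back to the built-in implementation instead of failing:

import org.deeplearning4j.nn.conf.layers.LSTM;
import org.nd4j.linalg.activations.Activation;

// Assumption: helperAllowFallback(true) makes DL4J use the built-in LSTM implementation
// whenever the cuDNN helper fails, instead of propagating the cuDNN exception.
// Availability of this builder method depends on the DL4J version in use.
LSTM lstm = new LSTM.Builder()
        .nOut(250)
        .activation(Activation.TANH)
        .helperAllowFallback(true)
        .build();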

@SidneyLann you likely could without cuDNN. Could you give me something end to end (e.g. not just the snippet) that I could run to verify this behavior? Thanks!

// Bidirectional LSTM wrapper; forward and backward activations are combined with ADD
Bidirectional bidirectional = new Bidirectional.Builder().mode(Bidirectional.Mode.ADD)
        .rnnLayer(new LSTM.Builder().nOut(250).l2(0.003495283883324279).dropOut(0.6816483486423628)
            .activation(Activation.TANH)
            .updater(new Adam(new StepSchedule(ScheduleType.EPOCH, learningRate, decayRate, step)))
            .build())
        .build();

// Single hidden (bidirectional LSTM) layer feeding an RNN output layer with weighted MCXENT loss
ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345).weightInit(WeightInit.XAVIER).graphBuilder()
        .setInputTypes(InputType.recurrent(NerUtil.MAX_SENTENCE_LENGTH * 6))
        .addInputs("input")
        .addLayer("lstm", bidirectional, "input")
        .addLayer("output",
            new RnnOutputLayer.Builder(new LossMCXENT(Nd4j.create(NerUtil.labelWeight1d())))
                .nOut(NerUtil.getLabelNum()).l2(0.006332749296897214).dropOut(0.45063086836326066)
                .activation(Activation.SOFTMAX)
                .updater(new Adam(new StepSchedule(ScheduleType.EPOCH, learningRate, decayRate, step)))
                .build(),
            "lstm")
        .setOutputs("output").backpropType(BackpropType.Standard).build();

net = new ComputationGraph(conf);
net.init();

I use BERT output as my input; the model has just one hidden layer and does an NER job.
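
In case it helps reproduce this end to end, here is a minimal sketch of a single training step with random placeholder data; the sizes are assumptions on my side (the real values come from NerUtil and the BERT features), and nIn must match the size passed to InputType.recurrent(...):

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// Placeholder sizes; the real values come from NerUtil and the BERT pipeline
int miniBatch = 4;
int nIn = 128 * 6;      // must equal the size passed to InputType.recurrent(...)
int nOut = 9;           // stands in for NerUtil.getLabelNum()
int timeSteps = 64;     // any sequence length works for a smoke test

// Random features shaped [miniBatch, nIn, timeSteps], the layout RNN layers expect
INDArray features = Nd4j.rand(new int[]{miniBatch, nIn, timeSteps});

// One-hot labels shaped [miniBatch, nOut, timeSteps] for the RnnOutputLayer
INDArray labels = Nd4j.zeros(new int[]{miniBatch, nOut, timeSteps});
java.util.Random rng = new java.util.Random(12345);
for (int i = 0; i < miniBatch; i++) {
    for (int t = 0; t < timeSteps; t++) {
        labels.putScalar(new int[]{i, rng.nextInt(nOut), t}, 1.0);
    }
}

// A single fit call runs the Bidirectional LSTM forward pass, which is where the cuDNN helper is used on GPU
net.fit(new INDArray[]{features}, new INDArray[]{labels});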