When I use a BatchNormalization layer inside FrozenLayerWithBackprop, I get a NullPointerException during backprop. The same exception occurs with both the CPU and CUDA backends. I think it has something to do with the mean/var cache.
This is how I create the layer:
listBuilder.layer(new FrozenLayerWithBackprop(new BatchNormalization.Builder().eps(0.001).useLogStd(false).build()));
And here is the exception:
WARNING: CuDNN BatchNormalization backprop execution failed - falling back on built-in implementation
java.lang.NullPointerException
at org.deeplearning4j.nn.layers.mkldnn.MKLDNNBatchNormHelper.backpropGradient(MKLDNNBatchNormHelper.java:101)
at org.deeplearning4j.nn.layers.normalization.BatchNormalization.backpropGradient(BatchNormalization.java:164)
at org.deeplearning4j.nn.layers.FrozenLayerWithBackprop.backpropGradient(FrozenLayerWithBackprop.java:62)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.calcBackpropGradients(MultiLayerNetwork.java:1946)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2761)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2704)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:170)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fitHelper(MultiLayerNetwork.java:2305)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:2263)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:2326)
at eu.extech.deep_learning.computer_vision.BatchNormTest.main(BatchNormTest.java:38)
Exception in thread "main" java.lang.NullPointerException
at org.deeplearning4j.nn.layers.normalization.BatchNormalization.backpropGradient(BatchNormalization.java:296)
at org.deeplearning4j.nn.layers.FrozenLayerWithBackprop.backpropGradient(FrozenLayerWithBackprop.java:62)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.calcBackpropGradients(MultiLayerNetwork.java:1946)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2761)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2704)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:170)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fitHelper(MultiLayerNetwork.java:2305)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:2263)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:2326)
at eu.extech.deep_learning.computer_vision.BatchNormTest.main(BatchNormTest.java:38)
What do you think is going on? Thank you in advance.