Simple CNN predicts NaNs

zsvizi · November 2, 2020, 9:27am

Hi All,
I am new to DL4J. I tried to implement a simple CNN like this:

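// Imports for DL4J 0.9.1 (the version mentioned below)
import org.deeplearning4j.nn.api.OptimizationAlgorithm
import org.deeplearning4j.nn.conf.{ConvolutionMode, NeuralNetConfiguration, Updater}
import org.deeplearning4j.nn.conf.inputs.InputType
import org.deeplearning4j.nn.conf.layers.{ConvolutionLayer, OutputLayer}
import org.deeplearning4j.nn.weights.WeightInit
import org.nd4j.linalg.activations.Activation
import org.nd4j.linalg.lossfunctions.LossFunctions

// seed, numFeatures, nOut and windowSize are defined elsewhere
val conf = new NeuralNetConfiguration.Builder()
    .seed(seed)
    .updater(Updater.NESTEROVS)
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
    .weightInit(WeightInit.XAVIER)
    .list
    .layer(0, new ConvolutionLayer.Builder()
      .name("c1")
      .kernelSize(5, numFeatures)
      .convolutionMode(ConvolutionMode.Truncate)
      .activation(Activation.RELU)
      .nIn(1)
      .nOut(32)
      .build)
    .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
      .nOut(nOut)
      .dropOut(0.5)
      .activation(Activation.IDENTITY)
      .build)
    .setInputType(InputType.convolutional(windowSize, numFeatures, 1))
    .build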

My first minibatch has shape [batchSize, 1, windowSize, numFeatures], so it is compatible with the CNN. The problem is that inference returns NaN values, so the network does not learn. I want to use this network for time series prediction (trying a CNN first, then moving to an RNN later).
Do you have any idea what causes the problem? The network gives sensible values after initialization, but after the first iteration it outputs only NaN values.
I have DL4J version 0.9.1 in my dependencies.
Thank you in advance for your help!
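
As a side note on the shapes involved: with ConvolutionMode.Truncate, and assuming a stride of 1 and no padding (neither is set in the configuration above), each output dimension is (in - kernel) / stride + 1. A minimal sketch of that arithmetic in plain Scala follows. `ConvShape`, `truncateOut`, and the example values windowSize = 20 and numFeatures = 4 are hypothetical, chosen only for illustration.

```scala
object ConvShape {
  /** Output size along one dimension for a truncating ("valid") convolution. */
  def truncateOut(in: Int, kernel: Int, stride: Int = 1, padding: Int = 0): Int = {
    require(in + 2 * padding >= kernel, "kernel is larger than the padded input")
    (in - kernel + 2 * padding) / stride + 1 // integer division truncates
  }
}

// Hypothetical example values, not taken from the original post
val windowSize  = 20
val numFeatures = 4

val outHeight = ConvShape.truncateOut(windowSize, 5)            // time axis
val outWidth  = ConvShape.truncateOut(numFeatures, numFeatures) // feature axis
val denseIn   = 32 * outHeight * outWidth                       // inputs seen by the output layer
```

With these example values the convolution output would have shape [batchSize, 32, 16, 1], so the output layer would see 512 inputs per example.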

agibsonccc · November 2, 2020, 10:32am

@zsvizi first of all, make sure you update your version to 1.0.0-beta7.
That’s the version the current examples use: GitHub - deeplearning4j/deeplearning4j-examples: Deeplearning4j Examples (DL4J, DL4J Spark, DataVec)

Also, make sure to read this first: https://deeplearning4j.konduit.ai/tuning-and-training/troubleshooting-training

After you do that, let us know what you find.

zsvizi · November 2, 2020, 10:36am

The problem is that the project I am working on is maintained by the DevOps team of the company I work for, and, you know, upgrading a dependency is far from trivial. Given these constraints, can you suggest an approach to troubleshooting?

agibsonccc · November 2, 2020, 11:10am

@zsvizi we can try, but we have our own constraints as well. That tuning guide should still be relevant for you. If you need more support, please feel free to DM me and we’ll see what we can do for you.
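
Two checks that are commonly worth doing first when a network starts outputting NaNs, whatever the library version, are whether the input itself contains NaN or infinite values and whether the features are on very different scales. The following is a minimal sketch in plain Scala with no DL4J dependency. `NanChecks`, `hasNonFinite`, and `standardize` are hypothetical helper names, and it assumes the features are available as rows of an `Array[Array[Double]]`.

```scala
object NanChecks {
  /** True if any value is NaN or +/- Infinity. */
  def hasNonFinite(data: Array[Array[Double]]): Boolean =
    data.exists(_.exists(v => java.lang.Double.isNaN(v) || java.lang.Double.isInfinite(v)))

  /** Rescale every column to zero mean and unit (population) standard deviation. */
  def standardize(data: Array[Array[Double]]): Array[Array[Double]] = {
    require(data.nonEmpty, "need at least one row")
    val n = data.length.toDouble
    val cols = data.head.length
    val means = Array.tabulate(cols)(j => data.map(_(j)).sum / n)
    val stds = Array.tabulate(cols) { j =>
      val variance = data.map(r => math.pow(r(j) - means(j), 2)).sum / n
      val s = math.sqrt(variance)
      if (s == 0.0) 1.0 else s // constant column: avoid dividing by zero
    }
    data.map(row => Array.tabulate(cols)(j => (row(j) - means(j)) / stds(j)))
  }
}

// Hypothetical toy data: two features on very different scales
val raw = Array(Array(1.0, 100.0), Array(3.0, 300.0))
val clean = !NanChecks.hasNonFinite(raw)
val scaled = NanChecks.standardize(raw)
```

If the raw data passes the first check and the network still diverges on standardized inputs, the cause is more likely in the training setup (for example the learning rate) than in the data.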
