Cannot update a model using a Gradient if computeGradientAndScore() hasn't been called on it

val model = new MultiLayerNetwork(generateConf())
model.init()

// Clone the main model and run one forward/backward pass on the clone
val temp = model.clone()

val x = Nd4j.create(Array(0.5, 0.5), Array(1, 2))
val y = Nd4j.create(Array(2.0), Array(1, 1))
temp.feedForward(x, true)
temp.setLabels(y)
temp.computeGradientAndScore()

val grad = temp.gradient()

println(grad)

// Try to apply the clone's gradient to the main model
println(model.params())
model.update(grad) // NullPointerException thrown here
println(model.params())

def generateConf(): MultiLayerConfiguration = {
val numNodes = List(2, 3, 1)

new NeuralNetConfiguration.Builder()
  .seed(42)
  .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
  .updater(new Sgd(0.001)) // learning rate (alpha)
  .list()
  // input of the raw features
  .layer(0, new DenseLayer.Builder().nIn(numNodes.head).nOut(numNodes(1))
    .weightInit(WeightInit.XAVIER)
    .activation(Activation.TANH)
    .build())
  .layer(1, new OutputLayer.Builder(LossFunction.MSE)
    .weightInit(WeightInit.XAVIER)
    .activation(Activation.IDENTITY)
    .nIn(numNodes(1)).nOut(numNodes(2)).build())
  .build()
}

An error occurs on the model.update(grad) line, and the output is shown below:

DefaultGradient{gradients={
  1_W=[[-1.8363],
       [-2.7604],
       [ 2.5442]],
  1_b=[[-5.4909]],
  0_b=[[ 4.3251,  0.0419, -4.1294]],
  0_W=[[ 2.1625,  0.0210, -2.0647],
       [ 2.1625,  0.0210, -2.0647]]}}
[[ 0.6854,  0.0102,  0.1997,  0.9062, -0.8818, -0.1214,  0,  0,  0, -0.8869, -0.0102,  0.9577,  0]]
Exception in thread "main" java.lang.NullPointerException
    at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.update(MultiLayerNetwork.java:3064)
    at Testing$.main(Testing.scala:34)
    at Testing.main(Testing.scala)

Essentially, the gradient field of model is null. Is there a way to initialize the gradient to some default value, like zero for each layer? I could theoretically set the input to zeros and the label to zero and then call computeGradientAndScore(), but that seems roundabout.

Any suggestions would be great. I'm trying to have cloned copies of the main model gather the gradients and then pass them back to the main model for updating.
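
For context, here is a sketch of the step I'm ultimately after, assuming the flattened gradient() and params() views share the same length and ordering; the manual SGD step and the lr value are my own, not a built-in DL4J path:

// Sketch (my assumption, not a DL4J API): apply the clone's
// flattened gradient to the main model as a plain SGD step
val lr = 0.001 // should match the Sgd learning rate in generateConf()
val flatGrad = temp.gradient().gradient() // full gradient as one flat vector
model.setParams(model.params().sub(flatGrad.mul(lr))) // w <- w - lr * g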

@sachag678 try calling: deeplearning4j/MultiLayerNetwork.java at df0d5083c33d49f3cbe6663d3c5102cf983a63fc · KonduitAI/deeplearning4j · GitHub

Unfortunately that doesn’t work. I took a look at the function and it doesn’t seem to set the gradient variable to anything.

My proposed method of setting the input to zero and the label to zero and then calling computeGradientAndScore() works, but it's super hacky.

@sachag678
Here’s a workaround after I downloaded your code and played with it:

    val model = new MultiLayerNetwork(generateConf())
    model.init()

    // Grab the private "gradient" field on MultiLayerNetwork via reflection
    val field = classOf[MultiLayerNetwork].getDeclaredField("gradient")
    field.setAccessible(true)

    // Build a zero-valued gradient with one entry per parameter,
    // matching each parameter's shape and ordering
    val gradient = new DefaultGradient()
    model.paramTable().entrySet.forEach(entry => {
      gradient.setGradientFor(entry.getKey, Nd4j.zeros(entry.getValue.shape(), entry.getValue.ordering()))
    })

    // Inject the zero gradient into the network
    ReflectionUtils.setField(field, model, gradient)
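
With the zero gradient injected, the update call from your original snippet should go through instead of hitting the NPE. A quick check (grad here being the gradient computed on your cloned model):

    // Quick check: the internal gradient field is now non-null,
    // so update() can accumulate into it
    model.update(grad)
    println(model.params())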

Imports

import org.deeplearning4j.nn.api.OptimizationAlgorithm
import org.deeplearning4j.nn.conf.layers.{DenseLayer, OutputLayer}
import org.deeplearning4j.nn.conf.{MultiLayerConfiguration, NeuralNetConfiguration}
import org.deeplearning4j.nn.gradient.DefaultGradient
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork
import org.deeplearning4j.nn.weights.WeightInit
import org.nd4j.common.io.ReflectionUtils
import org.nd4j.linalg.activations.Activation
import org.nd4j.linalg.factory.Nd4j
import org.nd4j.linalg.learning.config.Sgd
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction

This is in Scala.

  val model = new MultiLayerNetwork(Util.generateConf(0.00001, inputSize))
  model.init()

  // Dummy forward/backward pass on an all-zero input and label, just to
  // populate the network's internal gradient field
  val tempx = Nd4j.create(Array.fill(inputSize)(0.0), Array(1, inputSize))
  val tempy = Nd4j.create(Array(0.0), Array(1, 1))
  model.feedForward(tempx, true)
  model.setLabels(tempy)
  model.computeGradientAndScore()

The above is my current solution. It's a bit hackier than yours, but it works, and I like yours too. I guess there is no use case here for having this functionality built in. I may open a feature request on GitHub. Thanks for your help!
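
For anyone who lands here later, the dummy-pass trick wraps up into a small helper; initGradient and its parameters are my own naming, not anything from the library:

  // Hypothetical helper wrapping the dummy-pass workaround: one zero-valued
  // forward/backward pass populates the internal gradient field
  def initGradient(net: MultiLayerNetwork, nIn: Int, nOut: Int): Unit = {
    net.feedForward(Nd4j.zeros(1, nIn), true)
    net.setLabels(Nd4j.zeros(1, nOut))
    net.computeGradientAndScore()
  }

  // e.g. initGradient(model, inputSize, 1) before calling model.update(...)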