How to use the result of the calculateGradients API to update the network?

TempKonduitUser1 · November 20, 2022, 7:22pm

Say I have code like the following:
Pair<Gradient, INDArray> pair = trainableModel.calculateGradients(state0BatchProcessed, targets, null, masks);
How can I then use the resulting pair to update the model? Please provide code.

Thanks in advance!

agibsonccc · November 21, 2022, 4:39am

@TempKonduitUser1
If you look at the Gradient object in that pair, you just have to do (pseudocode here):

gradient = pair.getFirst()   // the pair holds one Gradient plus an INDArray
net.update(gradient)

Both ComputationGraph and MultiLayerNetwork have an update(Gradient) method you can use there.

TempKonduitUser1 · November 27, 2022, 12:22pm

@agibsonccc
If I use your code to apply the gradient, do I still need to write more code such as:

INDArray params = net.params(true);
params.addi(gradient);

to update the parameters of the model? It seems this is not needed, because the net's update function already does it for you. Is my guess right?
Also, do I still need to zero the gradients before every training step?

agibsonccc · November 28, 2022, 3:42am

@TempKonduitUser1 use model.update(Gradient)
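
On the follow-up about a manual params.addi(gradient): here is a minimal plain-Java sketch, not DL4J code, of why an extra addi after an in-place update would apply the same step twice. It assumes, as the question suggests, that update(Gradient) already changes the parameters in place. ParamVector, descentStep and the learning rate of 0.5 are hypothetical names and values made up for illustration. Whether a real library adds or subtracts the values, or scales them by a learning rate, should be checked in its own documentation.

```java
import java.util.Arrays;

public class UpdateSketch {

    /** Hypothetical stand-in for a network's flattened parameters; not a DL4J class. */
    static final class ParamVector {
        private final double[] values;

        ParamVector(double... values) {
            this.values = values.clone();
        }

        /** In-place step: params += step. Models an update that already modifies the parameters. */
        void update(double[] step) {
            if (step.length != values.length) {
                throw new IllegalArgumentException("Expected step of length " + values.length);
            }
            for (int i = 0; i < values.length; i++) {
                values[i] += step[i];
            }
        }

        /** Returns a copy so callers cannot change the parameters by accident. */
        double[] snapshot() {
            return values.clone();
        }
    }

    /** Turns a raw gradient into a descent step: step = -learningRate * gradient. */
    static double[] descentStep(double[] gradient, double learningRate) {
        double[] step = new double[gradient.length];
        for (int i = 0; i < gradient.length; i++) {
            step[i] = -learningRate * gradient[i];
        }
        return step;
    }

    public static void main(String[] args) {
        ParamVector params = new ParamVector(1.0, 2.0);
        double[] step = descentStep(new double[] {0.5, -1.0}, 0.5);

        // One call to update is the whole parameter change.
        params.update(step);
        System.out.println(Arrays.toString(params.snapshot())); // [0.75, 2.5]

        // Applying the same step again (like a redundant manual addi) doubles it.
        params.update(step);
        System.out.println(Arrays.toString(params.snapshot())); // [0.5, 3.0]
    }
}
```

If the update call in the library you use behaves like ParamVector.update here, a second manual add on the parameter array would move the weights twice per training step.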
