Suppose I compute gradients with code like the following:
Pair<Gradient, INDArray> pair = trainableModel.calculateGradients(state0BatchProcessed, targets, null, masks);
How can I then use the resulting pair to update the model? Please provide code.
Thanks in advance!
@TempKonduitUser1
The pair holds the Gradient object as its first element, so you just have to take it out and pass it to the network (pseudocode here):
gradient = pair.getFirst()
net.update(gradient)
You’ll see that both ComputationGraph and MultiLayerNetwork have an update(Gradient) method you can use for this.
@agibsonccc
After calling update with the gradient as you suggest, do I still need to write more code such as:
INDArray params = net.params(true);
params.addi(gradient);
to update the parameters of the model? It looks like this isn't needed, because the network's update method already applies the gradient to the parameters for you. Is that right?
Also, do I still need to zero the gradients before every training step?
@TempKonduitUser1 use model.update(Gradient)
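To make the double-application concern from the question concrete, here is a minimal, library-free sketch in plain Java. It is not DL4J code: `GradientStepSketch`, `addInPlace`, `appliedOnce` and `appliedTwice` are hypothetical names, and plain `double[]` arrays stand in for the flattened parameter and gradient arrays. It assumes, as the question suggests, that update(Gradient) already modifies the parameters in place. Whether the library adds or subtracts the gradient, and whether it applies a learning rate, is not modeled here and is worth checking against the source for your version.

```java
import java.util.Arrays;

public class GradientStepSketch {

    /** In-place element-wise add, similar in spirit to params.addi(gradient). */
    public static void addInPlace(double[] params, double[] step) {
        if (params.length != step.length) {
            throw new IllegalArgumentException("params and step must have the same length");
        }
        for (int i = 0; i < params.length; i++) {
            params[i] += step[i];
        }
    }

    /** Applies the step once, standing in for a single update call. */
    public static double[] appliedOnce(double[] params, double[] step) {
        double[] copy = params.clone(); // leave the caller's array untouched
        addInPlace(copy, step);
        return copy;
    }

    /** Applies the step twice: an update call followed by a manual add. */
    public static double[] appliedTwice(double[] params, double[] step) {
        double[] copy = appliedOnce(params, step);
        addInPlace(copy, step); // the redundant second application
        return copy;
    }

    public static void main(String[] args) {
        double[] params = {1.0, 2.0, 3.0};
        double[] step = {0.5, -0.5, 0.25};
        System.out.println(Arrays.toString(appliedOnce(params, step)));  // [1.5, 1.5, 3.25]
        System.out.println(Arrays.toString(appliedTwice(params, step))); // [2.0, 1.0, 3.5]
    }
}
```

Under that assumption, adding the gradient by hand after the update call moves the parameters twice as far as intended, so only one of the two should be used.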