Confusing code in the BatchNormalization class

dgao · March 21, 2021, 2:57am

I found some code in the BatchNormalization class that confuses me.
1: In the method backpropGradient:
//dL/dmu
val effectiveBatchSize = input.size(0) * input.size(hIdx) * input.size(wIdx);
INDArray dxmu1 = dxhat.sum(nonChDims).divi(std).negi();
INDArray dxmu2 = xMu.sum(nonChDims).muli(-2.0 / effectiveBatchSize).muli(dLdVar);
INDArray dLdmu = dxmu1.addi(dxmu2);

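Why is dxmu2 calculated? As far as I can tell it is zero: if xMu is the input minus the batch mean, then its sum over nonChDims is zero.
2: In the method backpropGradient, the UseLogStd comments: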
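To illustrate point 1, here is a minimal sketch in plain Java (no ND4J) for a single channel. It assumes that xMu is the input minus the batch mean of that same input. The class and method names are hypothetical and are not part of DL4J.

```java
// Hypothetical stand-alone check, not DL4J code.
// Assumption: xMu[i] = x[i] - mean(x), with the mean taken over the same batch.
class DeviationSumCheck {

    /** Returns sum_i (x[i] - mean(x)); mathematically zero, tiny in floating point. */
    public static double sumOfDeviations(double[] x) {
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= x.length;

        double sum = 0.0;
        for (double v : x) {
            sum += v - mean;
        }
        return sum;
    }

    /** Mirrors xMu.sum(...).muli(-2.0 / effectiveBatchSize).muli(dLdVar) for one channel. */
    public static double dxmu2(double[] x, double dLdVar) {
        return sumOfDeviations(x) * (-2.0 / x.length) * dLdVar;
    }

    public static void main(String[] args) {
        double[] x = {0.5, -1.25, 3.0, 2.75, -0.1};
        System.out.println("sum(x - mu) = " + sumOfDeviations(x));
        System.out.println("dxmu2       = " + dxmu2(x, 0.7));
    }
}
```

Under that assumption both printed values should be zero up to floating-point rounding, whatever value dLdVar has.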
//Use log10(std) parameterization. This is more numerically stable for FP16 and better for distributed training
//First: we have log10(var[i]) from last iteration, hence can calculate var[i] and stdev[i]
//Need to calculate log10{std[i]) - log10(std[i+1]) as the “update”
//Note, var[i+1] = d*var[i] + (1-d)*batchVar

Should the comment
//Need to calculate log10{std[i]) - log10(std[i+1]) as the “update”
instead read
//Need to calculate log{std[i]) - log(std[i+1])?
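To make point 2 concrete, here is a minimal scalar sketch of the update described in those comments, computed in both bases. The class and method names are hypothetical, the numbers are made up, and this is not the DL4J implementation. The only relation it relies on is log10(a) - log10(b) = (ln(a) - ln(b)) / ln(10).

```java
// Hypothetical scalar sketch, not DL4J code.
class LogStdUpdateSketch {

    /** var[i+1] = d * var[i] + (1 - d) * batchVar, as in the quoted comment. */
    public static double nextVar(double d, double var, double batchVar) {
        return d * var + (1.0 - d) * batchVar;
    }

    /** The "update" in base 10: log10(std[i]) - log10(std[i+1]). */
    public static double updateLog10(double stdI, double stdNext) {
        return Math.log10(stdI) - Math.log10(stdNext);
    }

    /** The same difference using the natural logarithm. */
    public static double updateLn(double stdI, double stdNext) {
        return Math.log(stdI) - Math.log(stdNext);
    }

    public static void main(String[] args) {
        double d = 0.9;          // decay (illustrative value)
        double var = 4.0;        // var[i] (illustrative value)
        double batchVar = 1.0;   // variance of the current batch (illustrative value)

        double stdI = Math.sqrt(var);
        double stdNext = Math.sqrt(nextVar(d, var, batchVar));

        System.out.println("log10 update = " + updateLog10(stdI, stdNext));
        System.out.println("ln update    = " + updateLn(stdI, stdNext));
        System.out.println("ratio        = " + updateLn(stdI, stdNext) / updateLog10(stdI, stdNext));
    }
}
```

The two updates differ only by the constant factor ln(10), so my question is which base the stored parameter is actually meant to use.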

Thanks to anyone who can help!
