Confusing code in the BatchNormalization class

I found some code in the BatchNormalization class that confuses me.
1: In the method backpropGradient:
//dL/dmu
val effectiveBatchSize = input.size(0) * input.size(hIdx) * input.size(wIdx);
INDArray dxmu1 = dxhat.sum(nonChDims).divi(std).negi();
INDArray dxmu2 = xMu.sum(nonChDims).muli(-2.0 / effectiveBatchSize).muli(dLdVar);
INDArray dLdmu = dxmu1.addi(dxmu2);

Why is dxmu2 calculated at all? Isn't it always zero?
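For context, this is how I understand the standard batch norm backward pass for dL/dmu (the notation below is my own; please correct me if the formula is wrong):

$$\frac{\partial L}{\partial \mu} \;=\; \underbrace{-\frac{1}{\sqrt{\sigma^2+\epsilon}}\sum_{i=1}^{m}\frac{\partial L}{\partial \hat{x}_i}}_{\texttt{dxmu1}} \;+\; \underbrace{\frac{\partial L}{\partial \sigma^2}\cdot\frac{-2}{m}\sum_{i=1}^{m}\left(x_i-\mu\right)}_{\texttt{dxmu2}}$$

where m is the effective batch size. Since mu is the mean of the current mini-batch taken over exactly those dimensions, sum_i (x_i - mu) = 0, which is why the dxmu2 term looks like it should always evaluate to zero.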
2: In the method backpropGradient, the useLogStd comments:
//Use log10(std) parameterization. This is more numerically stable for FP16 and better for distributed training
//First: we have log10(var[i]) from last iteration, hence can calculate var[i] and stdev[i]
//Need to calculate log10{std[i]) - log10(std[i+1]) as the “update”
//Note, var[i+1] = d*var[i] + (1-d)*batchVar

Should the comment //Need to calculate log10{std[i]) - log10(std[i+1]) as the "update" instead read //Need to calculate log(std[i]) - log(std[i+1]), i.e. natural log rather than log10?
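To make sure I'm reading the comments correctly, here is a minimal scalar sketch of the arithmetic I think they describe, done in base 10 throughout. All names and values below are hypothetical and mine, not from the DL4J source:

// Hypothetical scalar sketch of the log10(std) moving-average update, as I read the comments.
public class LogStdUpdateSketch {
    public static void main(String[] args) {
        double decay = 0.9;                 // "d" in the comment
        double storedLog10Std = 0.05;       // log10(std[i]) carried over from the last iteration
        double batchVar = 1.3;              // variance of the current mini-batch

        double stdI = Math.pow(10.0, storedLog10Std);            // std[i] recovered from log10(std[i])
        double varI = stdI * stdI;                                // var[i] = std[i]^2
        double varI1 = decay * varI + (1.0 - decay) * batchVar;   // var[i+1] = d*var[i] + (1-d)*batchVar
        double newLog10Std = Math.log10(Math.sqrt(varI1));        // log10(std[i+1])

        // The "update" as the comment states it, in base 10 throughout:
        double update = storedLog10Std - newLog10Std;
        System.out.println("log10-space update: " + update);
    }
}

If that reading is right, the base only matters for consistency between how the parameter is stored and how the update is computed, so I'd just like to confirm which base the comment intends.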

Thanks to anyone who can help!