Attribute normalization

Yes, normalizing the data is necessary. For most activation functions, the sensitive region lies between -1 and 1; everything beyond that is saturated, which makes training very hard because there is almost no gradient to work with.
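A quick way to see the saturation problem, as a standalone sketch using tanh as the activation: the gradient 1 - tanh(x)² is close to 1 inside the sensitive region and collapses to nearly nothing once the input moves outside it.

```java
public class TanhSaturation {
    // Derivative of tanh: d/dx tanh(x) = 1 - tanh(x)^2
    static double tanhGrad(double x) {
        double t = Math.tanh(x);
        return 1 - t * t;
    }

    public static void main(String[] args) {
        System.out.println(tanhGrad(0));   // sensitive region: gradient is 1
        System.out.println(tanhGrad(5));   // saturated: gradient ~ 1.8e-4
        System.out.println(tanhGrad(50));  // fully saturated: gradient ~ 0
    }
}
```

With unnormalized inputs (say, pressures in the thousands), pre-activations land deep in the saturated region, and weight updates become vanishingly small.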

As for the specific normalization you want to use, that may depend on the data (see also Quickstart with Deeplearning4J – dubs·tech). With NormalizerStandardize, the statistics are computed for each column, and each column is then normalized so that it has zero mean and unit variance (μ=0, σ=1).
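Conceptually, the per-column standardization works like the following plain-Java sketch (this is an illustration of the math, not DL4J's actual implementation; real code would also guard against other edge cases):

```java
import java.util.Arrays;

public class Standardize {
    // Standardize each column to zero mean and unit variance: x' = (x - mean) / std.
    // A constant column has std == 0; mapping it to 0 avoids division by zero
    // and effectively drops the feature.
    static double[][] standardizeColumns(double[][] data) {
        int rows = data.length, cols = data[0].length;
        double[][] out = new double[rows][cols];
        for (int c = 0; c < cols; c++) {
            double mean = 0;
            for (int r = 0; r < rows; r++) mean += data[r][c];
            mean /= rows;
            double var = 0;
            for (int r = 0; r < rows; r++) {
                double d = data[r][c] - mean;
                var += d * d;
            }
            double std = Math.sqrt(var / rows);
            for (int r = 0; r < rows; r++) {
                out[r][c] = std == 0 ? 0.0 : (data[r][c] - mean) / std;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Second column is constant, like a pressure sensor that never changes:
        double[][] data = { {1, 5}, {2, 5}, {3, 5} };
        System.out.println(Arrays.deepToString(standardizeColumns(data)));
    }
}
```

Note that the constant second column comes out as all zeros, which is exactly the "dropped feature" side effect discussed for the pressure column.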

That has an interesting side effect when a column always has the same value (as may be the case in your pressure column). Because the mean is shifted to zero and the column has no variance, that feature becomes a constant zero input and is effectively dropped (see Methods for dropping out inputs (features) during post-fit Evaluation? - #8 by treo for the math behind zero features).

When you are running a classification, it leaves the labels alone. If you were running a regression, you would also have to set fitLabels(true) on the normalizer.
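The distinction can be sketched like this (again an illustration of the behavior, not DL4J's code; method names here are hypothetical): classification labels are class indices or one-hot vectors and must stay untouched, while regression targets are continuous values on an arbitrary scale and benefit from the same standardization as the features.

```java
import java.util.Arrays;

public class LabelHandling {
    // Standardize an array given precomputed statistics.
    static double[] standardize(double[] x, double mean, double std) {
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) out[i] = (x[i] - mean) / std;
        return out;
    }

    // Labels pass through unchanged unless fitLabels is set (the regression case).
    static double[] transformLabels(double[] labels, double mean, double std, boolean fitLabels) {
        return fitLabels ? standardize(labels, mean, std) : labels.clone();
    }

    public static void main(String[] args) {
        double[] classLabels = {0, 1, 1, 0};          // classification: left alone
        double[] regTargets  = {100.0, 200.0, 300.0}; // regression: standardized
        System.out.println(Arrays.toString(transformLabels(classLabels, 0.5, 0.5, false)));
        System.out.println(Arrays.toString(transformLabels(regTargets, 200.0, 100.0, true)));
    }
}
```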