Hello,
I maintain an extensive application built on DL4J, and I have a few comments on the API that could make it easier to integrate. Overall, I am very happy with the performance and with the general approach to solving ML use cases on time series.
1. Label Normalization Setup Is Unintuitive
The requirement to call fitLabel(true) before fit() is not obvious and is easy to miss. Forgetting it leads to a NullPointerException at runtime.
If fitLabel(true) is called after fit(), the label stats are never initialized, yet no warning is given.
2. Dummy Dataset Construction Is Error-Prone
Users must manually construct a dummy dataset with min/max values for both features and labels, matching the exact shapes expected by the model.
Shape mismatches or missing label data silently result in null stats and runtime errors later.
3. Lack of Direct Min/Max Setters
There is no API to directly set min/max values for features and labels, which would be much more convenient for production workflows with large datasets.
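To make the request concrete, here is a rough sketch of the kind of setter I have in mind. It is plain Java and not DL4J code: the class name MinMaxScalerSketch and the methods setFeatureRange and setLabelRange are hypothetical, double[] stands in for INDArray, and scaling to [0, 1] is an assumption chosen for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch, not part of DL4J or ND4J. Plain double[] stands in for INDArray. */
final class MinMaxScalerSketch {
    private double[] featureMin, featureMax, labelMin, labelMax;

    /** Sets per-column feature bounds directly, without fitting a dummy dataset. */
    MinMaxScalerSketch setFeatureRange(double[] min, double[] max) {
        checkRange("feature", min, max);
        featureMin = min.clone();
        featureMax = max.clone();
        return this;
    }

    /** Sets per-column label bounds directly. */
    MinMaxScalerSketch setLabelRange(double[] min, double[] max) {
        checkRange("label", min, max);
        labelMin = min.clone();
        labelMax = max.clone();
        return this;
    }

    double[] transformFeatures(double[] row) {
        return scale("feature", row, featureMin, featureMax);
    }

    double[] transformLabels(double[] row) {
        return scale("label", row, labelMin, labelMax);
    }

    // Fails fast on mismatched or degenerate bounds instead of leaving null stats behind.
    private static void checkRange(String kind, double[] min, double[] max) {
        if (min == null || max == null) {
            throw new IllegalArgumentException(kind + " min and max must both be non-null");
        }
        if (min.length != max.length) {
            throw new IllegalArgumentException(kind + " min has " + min.length
                    + " columns but max has " + max.length);
        }
        for (int i = 0; i < min.length; i++) {
            if (!(min[i] < max[i])) {
                throw new IllegalArgumentException(kind + " column " + i
                        + ": min must be less than max");
            }
        }
    }

    // Maps each column linearly from [min, max] to [0, 1].
    private static double[] scale(String kind, double[] row, double[] min, double[] max) {
        if (min == null) {
            throw new IllegalStateException(kind + " range has not been set");
        }
        if (row.length != min.length) {
            throw new IllegalArgumentException(kind + " row has " + row.length
                    + " columns, expected " + min.length);
        }
        double[] out = new double[row.length];
        for (int i = 0; i < row.length; i++) {
            out[i] = (row[i] - min[i]) / (max[i] - min[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        MinMaxScalerSketch scaler = new MinMaxScalerSketch()
                .setFeatureRange(new double[] {0.0, -10.0}, new double[] {100.0, 10.0})
                .setLabelRange(new double[] {0.0}, new double[] {50.0});
        System.out.println(Arrays.toString(scaler.transformFeatures(new double[] {25.0, 0.0})));
        System.out.println(Arrays.toString(scaler.transformLabels(new double[] {10.0})));
    }
}
```

The point of the sketch is that known bounds go in with one call each, and a shape mismatch is rejected at the call site rather than surfacing later as null stats.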
4. Documentation Could Be Improved
The documentation should clearly state the order of method calls and provide examples for initializing the normalizer with custom min/max values for both features and labels.
5. Error Messages Could Be More Helpful
When label stats are not initialized, the error message could suggest checking the order of fitLabel(true) and fit() calls, or the shape of the dummy dataset.
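As an illustration of the kind of message I mean, the sketch below models only the call-order state, not the actual normalizer. Everything in it is hypothetical: the class LabelStatsGuardSketch, the boolean argument to fit(), and the method requireLabelStats() are stand-ins I made up, and the real internals of DL4J may look quite different.

```java
/** Hypothetical sketch, not DL4J code. It tracks call order only, not real statistics. */
final class LabelStatsGuardSketch {
    private boolean fitLabel;       // current value of the fitLabel flag
    private boolean fitted;         // whether fit() has run at least once
    private boolean fitLabelAtFit;  // value of the flag when fit() last ran
    private boolean labelsAtFit;    // whether the fitted dataset carried labels

    void fitLabel(boolean enabled) {
        fitLabel = enabled;
    }

    /** The boolean stands in for "the dataset passed to fit() contained label data". */
    void fit(boolean datasetHasLabels) {
        fitted = true;
        fitLabelAtFit = fitLabel;
        labelsAtFit = datasetHasLabels;
    }

    /** Stand-in for any operation that needs label stats, such as reverting labels. */
    void requireLabelStats() {
        if (fitted && fitLabelAtFit && labelsAtFit) {
            return;
        }
        String hint;
        if (!fitted) {
            hint = "fit() has not been called yet.";
        } else if (!fitLabelAtFit) {
            hint = "fitLabel(true) was not in effect when fit() ran; "
                    + "call fitLabel(true) before fit().";
        } else {
            hint = "the dataset passed to fit() had no label data; "
                    + "check that labels are present and have the expected shape.";
        }
        throw new IllegalStateException("Label stats are not initialized: " + hint);
    }

    public static void main(String[] args) {
        LabelStatsGuardSketch guard = new LabelStatsGuardSketch();
        guard.fit(true);       // fit() first ...
        guard.fitLabel(true);  // ... then fitLabel(true), which is the easy mistake
        try {
            guard.requireLabelStats();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the wrong call order, this produces an IllegalStateException that names the likely cause instead of a bare NullPointerException.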