Suggestions Thank you

Dear,
I have an extensive application using DL4J, and I have a few comments on the API that could improve its integration. I am actually very happy with the performance and the general approach to solving ML use cases on time series.

1. Label Normalization Setup Is Unintuitive

The requirement to call fitLabel(true) before fit() is not obvious and is easy to miss, leading to runtime errors (NullPointerException).
If fitLabel(true) is called after fit(), label stats are not initialized, but no warning is given.
2. Dummy Dataset Construction Is Error-Prone

Users must manually construct a dummy dataset with min/max values for both features and labels, matching the exact shapes expected by the model.
Shape mismatches or missing label data silently result in null stats and runtime errors later.
3. Lack of Direct Min/Max Setters

There is no API to directly set min/max values for features and labels, which would be much more convenient for production workflows with large datasets.
4. Documentation Could Be Improved

The documentation should clearly state the order of method calls and provide examples for initializing the normalizer with custom min/max values for both features and labels.
5. Error Messages Could Be More Helpful

When label stats are not initialized, the error message could suggest checking the order of fitLabel(true) and fit() calls, or the shape of the dummy dataset.
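
Points 3 and 5 could be sketched roughly as follows. This is plain Java, not DL4J code: the class `MinMaxScalerSketch` and its methods `setFeatureRange`, `setLabelRange`, `transformFeatures`, and `revertLabels` are hypothetical names chosen for illustration only. It assumes one min/max pair per column and scaling to [0, 1].

```java
import java.util.Arrays;

// Hypothetical sketch, NOT part of DL4J: a min-max scaler whose ranges are
// set directly and which fails early with an actionable message.
final class MinMaxScalerSketch {
    private double[] featureMin, featureMax; // null until setFeatureRange is called
    private double[] labelMin, labelMax;     // null until setLabelRange is called

    // Direct setter: no dummy dataset is needed to initialize the stats.
    public void setFeatureRange(double[] min, double[] max) {
        validate("feature", min, max);
        featureMin = min.clone();
        featureMax = max.clone();
    }

    public void setLabelRange(double[] min, double[] max) {
        validate("label", min, max);
        labelMin = min.clone();
        labelMax = max.clone();
    }

    // Scales each feature column to [0, 1].
    public double[] transformFeatures(double[] row) {
        requireStats("feature", "setFeatureRange", featureMin, row.length);
        double[] out = new double[row.length];
        for (int i = 0; i < row.length; i++) {
            out[i] = (row[i] - featureMin[i]) / (featureMax[i] - featureMin[i]);
        }
        return out;
    }

    // Maps normalized labels back to their original range.
    public double[] revertLabels(double[] row) {
        requireStats("label", "setLabelRange", labelMin, row.length);
        double[] out = new double[row.length];
        for (int i = 0; i < row.length; i++) {
            out[i] = row[i] * (labelMax[i] - labelMin[i]) + labelMin[i];
        }
        return out;
    }

    // Rejects mismatched shapes and empty ranges when the stats are set,
    // instead of leaving them null and failing later.
    private static void validate(String kind, double[] min, double[] max) {
        if (min.length != max.length) {
            throw new IllegalArgumentException(kind + " min has " + min.length
                    + " columns but max has " + max.length);
        }
        for (int i = 0; i < min.length; i++) {
            if (!(max[i] > min[i])) {
                throw new IllegalArgumentException(kind + " column " + i
                        + " needs max > min, got min=" + min[i] + ", max=" + max[i]);
            }
        }
    }

    // Replaces a bare NullPointerException with a message that names the fix.
    private static void requireStats(String kind, String setter, double[] stats, int width) {
        if (stats == null) {
            throw new IllegalStateException(kind + " stats are not initialized: call "
                    + setter + "(min, max) before transforming or reverting");
        }
        if (stats.length != width) {
            throw new IllegalArgumentException(kind + " stats have " + stats.length
                    + " columns but the input has " + width);
        }
    }

    public static void main(String[] args) {
        MinMaxScalerSketch scaler = new MinMaxScalerSketch();
        scaler.setFeatureRange(new double[] {0.0, 10.0}, new double[] {100.0, 20.0});
        System.out.println(Arrays.toString(scaler.transformFeatures(new double[] {25.0, 15.0})));
        try {
            scaler.revertLabels(new double[] {0.5}); // label range was never set
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The design choice here is to validate shapes at the moment the ranges are set and to name the missing call in the exception, so a forgotten label setup surfaces immediately rather than as a null reference later.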

@MintakaB thanks for the suggestions! Happy to address them in the next release. I’ve mainly been focused on internals as of late: the C++ library needed a lot of modernization, and that’s not something users really see. Please feel free to file an issue and I’ll mark it for improvement!

One suggestion I have: if you’re not seeing something, please do take a look at the tests in the main repo. It’s less than ideal, of course.