M2 release is out

M2 is out. This release contains quite a few changes. Details here:

Examples are also updated for M2:

Of note is the removal of the rl4j-examples and the addition of the onnx-import-examples.
More context on rl4j can be found here: rl4j - frequent syntax errors · Issue #9594 · eclipse/deeplearning4j · GitHub

We’ll be working on a way to handle support for modules like this, should the community want to help in a similar fashion. More info can also be found in the release notes.

Blockers on the release:

  1. We had a number of problems with infrastructure that I hope are under control now. GitHub Actions being what it is, we had issues with quite a few of the builds, which made releases unpredictable. Much of this is due to the complexity of our builds: we have to provide builds for a wide variety of platforms, including 2 different versions of CUDA on Windows and Linux, which take around 4 to 5 hours each to build.
  2. Sonatype upgrade. We migrated from the old oss.sonatype.org to s01.oss.sonatype.org.
  3. Test consolidation. DL4J is a large code base with a lot of technical debt that we are still cleaning up.
    To make components easier to test, we consolidated all of the tests into a platform-tests module that lets us run tests in a controlled manner. This prevents random crashes and allows us to test different versions of DL4J from the same module. It also lets us control where the code runs (CPU or GPU), which makes testing easier.
    Random crashes in our tests largely came from the JavaCPP and ND4J deallocators. We alleviated this by turning off deallocation for tests.
    For CPU-based tests, we also found that the jemalloc preload can help.

Hopefully, with these issues mostly under control now, shipping will be easier.

Updates on roadmap:

  1. We’ll continue adding new model import and focusing on expanding our pretrained model zoo. You can find more about the pretrained model zoo here:
    GitHub - KonduitAI/omnihub-zoo: Contains samediff and deeplearning4j pretrained models

  2. Our model import will also support proprietary PyTorch ops from ONNX, allowing conversion and execution of these models in Java.

  3. We’ll add missing Keras import features such as the NLP preprocessing.

  4. GraalVM and easier building of modular models and pipelines will be a big focus. For those who don’t know, our internals allow you to build smaller binaries with just the operations you need, reducing model and binary size for mobile and embedded use cases. Combining this with GraalVM makes the framework a great way to deploy and train models, both in bigger enterprise environments and on devices with smaller footprints.

  5. Continuing to simplify the framework: we’ll continue to focus on interop and on being easier to build on top of. Many modules have been removed over the last months to reduce confusion about what is and isn’t maintained, and to reflect what people actually use the framework for. Being small, modular, and easy to build on will let us focus on doing just what a DL framework needs to do rather than trying to be everything. We hope to encourage users to build on top of us with 3rd party modules, using just the parts they need.
