How can I turn a network into stub code?

@huzpsb great! Then let me know if you need help with the next steps. Remember that you need to run both of those build steps first and then extract the associated jars I mentioned.

The end result will be three nd4j-native dependencies in your build.gradle: one without a classifier and two with android-arm and android-arm64 as classifiers.
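For reference, a build.gradle sketch of what those three dependencies look like (the version shown is illustrative; substitute the release you actually built):

```groovy
dependencies {
    // Version is hypothetical -- use the nd4j release you built against
    implementation "org.nd4j:nd4j-native:1.0.0-M2.1"
    implementation "org.nd4j:nd4j-native:1.0.0-M2.1:android-arm"
    implementation "org.nd4j:nd4j-native:1.0.0-M2.1:android-arm64"
}
```

The third coordinate segment after the version is the classifier, which selects the jar containing the native libraries for that platform.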

Also remember that if you run in Android Studio you should check what your emulator image architecture is; it can be x86_64 or ARM. You’ll need the matching classifier bundled with your APK in order for the app to load the right native library. I didn’t provide build instructions for Intel (although we do have the associated files for it).
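As a sketch (this is standard Android Gradle plugin configuration, not something specific to this thread), you can restrict the APK to the ABIs you actually ship native libraries for, so a mismatched architecture fails at build time rather than at runtime:

```groovy
android {
    defaultConfig {
        ndk {
            // Only package the architectures you have nd4j classifiers for
            abiFilters "armeabi-v7a", "arm64-v8a"
        }
    }
}
```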

I only provided ARM since that’s what you will generally use in production. Make sure you test your app on a real phone to avoid issues.

I’m evaluating the build product, but I’ve encountered an error:

Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.<init>(NativeOpExecutioner.java:79)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.lang.Class.newInstance(Class.java:442)
        at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5129)
        at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5044)
        at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:269)
        at org.deeplearning4j.util.ModelSerializer.restoreMultiLayerNetworkHelper(ModelSerializer.java:282)
        at org.deeplearning4j.util.ModelSerializer.restoreMultiLayerNetwork(ModelSerializer.java:237)
        at org.deeplearning4j.util.ModelSerializer.restoreMultiLayerNetwork(ModelSerializer.java:221)
        at org.deeplearning4j.util.ModelSerializer.restoreMultiLayerNetwork(ModelSerializer.java:207)
        at cf.huzpsb.machinelearning.AIServer.init(AIServer.java:222)
        at cf.huzpsb.machinelearning.R.click(?:~?)
Caused by: java.lang.RuntimeException: ND4J is probably missing dependencies. For more information, please refer to: https://deeplearning4j.konduit.ai/nd4j/backend
        at org.nd4j.nativeblas.NativeOpsHolder.<init>(NativeOpsHolder.java:116)
        at org.nd4j.nativeblas.NativeOpsHolder.<clinit>(NativeOpsHolder.java:37)
        ... 15 more
Caused by: java.lang.UnsatisfiedLinkError: no jniopenblas_nolapack in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
        at java.lang.Runtime.loadLibrary0(Runtime.java:871)
        at java.lang.System.loadLibrary(System.java:1124)
        at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1800)
        at org.bytedeco.javacpp.Loader.load(Loader.java:1402)
        at org.bytedeco.javacpp.Loader.load(Loader.java:1214)
        at org.bytedeco.javacpp.Loader.load(Loader.java:1190)
        at org.bytedeco.openblas.global.openblas_nolapack.<clinit>(openblas_nolapack.java:12)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.bytedeco.javacpp.Loader.load(Loader.java:1269)
        at org.bytedeco.javacpp.Loader.load(Loader.java:1214)
        at org.bytedeco.javacpp.Loader.load(Loader.java:1190)
        at org.nd4j.linalg.cpu.nativecpu.bindings.Nd4jCpu.<clinit>(Nd4jCpu.java:14)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.nd4j.common.config.ND4JClassLoading.loadClassByName(ND4JClassLoading.java:62)
        at org.nd4j.common.config.ND4JClassLoading.loadClassByName(ND4JClassLoading.java:56)
        at org.nd4j.nativeblas.NativeOpsHolder.<init>(NativeOpsHolder.java:88)
        ... 16 more

I’ve just replaced the native libraries with my newly-built ones.


Also, maybe off-topic, but does TVM work with DL4J models?

@huzpsb ah, you also need to include openblas. Please use:

The same setup as nd4j-native: include one dependency with no classifier, plus android-arm64 and android-arm.
Normally we tell users to just include nd4j-native-platform, which makes this easier. When you manually include jars, though, you sometimes need extra dependencies.
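A sketch of the corresponding build.gradle entries, assuming the javacpp-presets coordinates for openblas (the version is illustrative; match it to the javacpp version your nd4j build uses):

```groovy
dependencies {
    // Versions are hypothetical -- align them with your nd4j build
    implementation "org.bytedeco:openblas:0.3.21-1.5.8"
    implementation "org.bytedeco:openblas:0.3.21-1.5.8:android-arm"
    implementation "org.bytedeco:openblas:0.3.21-1.5.8:android-arm64"
}
```

This is what resolves the `no jniopenblas_nolapack in java.library.path` error above: the classifier jars carry the native openblas libraries that javacpp tries to load.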

Edit: That might affect your total dependency size. I would suggest checking the jar sizes as well.

Sorry to interrupt, but openblas with its dependencies weighs more than I can accept :confused:
Is there an alternative way?

@huzpsb did you see the size of the nd4j-native libraries? The jar file with the native libraries that I downloaded just now is only around 1MB. That plus android-arm should still only be around 8MB at most.

Technically, with openblas we do use its matrix multiply, but we also have our own. I would have to look into how to exclude openblas; that might take a day or so to investigate.

Edit: Some initial research says I could probably set something like this up if we remove the openblas.class declaration here:

I think it can be made optional for binaries that won’t need every op.

That sounds promising.
But are there any build tools available?
I really can’t understand natives on my own…

@huzpsb oh no, like before, I’d do it for you. As long as you can use Docker I can set something up and try it. Just make sure you know how to include dependencies with classifiers, as I mentioned before.

Well then, I’ll try it after I’ve finished my recompiler. :wink:

—To anybody who reads this thread later—

I’ve finished trying the Docker way. The result is that, with the required dependencies, even if you compile the javacpp-openblas-nd4j stack yourself, there is no way to get it smaller than 15MB for a single platform, and it’s about 5MB more for each additional platform.

Thanks to agibsonccc for his time and guidance, but this way will NOT work out.

The best way to solve this problem is to convert the network into a computation graph, then import it into Torch. While Torch Lite can get it to ~5MB, TVM can do it within 1MB.
It should be noted that the performance loss is obvious, and only dense layers can be converted this way. Though it works if you only want to do simple tasks like mine, it isn’t suitable for OCR, NLP, etc. In those cases, consider a client/server setup or simply choose another framework.
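To illustrate why only dense layers survive this kind of conversion, here is a minimal, self-contained sketch (plain Java with made-up weights; this is not DL4J, Torch, or TVM code) of what a hand-exported dense layer reduces to: just a matrix multiply, a bias add, and an activation. Conv or recurrent layers have no such trivial form, which is why they don’t convert cleanly.

```java
// Hypothetical hand-converted dense layer: weights exported from a trained
// network, forward pass implemented directly with no framework dependency.
public class DenseStub {
    // Made-up exported weights: 2 inputs -> 2 outputs
    static final double[][] W = {{0.5, -0.25}, {0.75, 1.0}};
    static final double[] B = {0.125, -0.125};

    static double[] forward(double[] x) {
        double[] out = new double[B.length];
        for (int j = 0; j < out.length; j++) {
            double sum = B[j];
            for (int i = 0; i < x.length; i++) {
                sum += x[i] * W[i][j]; // matrix multiply
            }
            out[j] = Math.max(0.0, sum); // ReLU activation
        }
        return out;
    }

    public static void main(String[] args) {
        double[] y = forward(new double[]{1.0, 2.0});
        System.out.println(y[0] + " " + y[1]); // prints "2.125 1.625"
    }
}
```

A stack of these (one `forward` per layer, feeding each output into the next) is all a dense-only network needs at inference time, which is why the binary can get so small.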

It took me a month to arrive at this answer to my own question. Let me leave this as the last post of this thread, in the hope that it will be helpful for someone who reads it later.

@huzpsb thanks for the update. For later users, we’ll work on reducing the openblas binary size to get things down further.

In order to reduce the size even more, we would have to do it in a pure C++ way. Most of the tooling is there to do that.

I did take a look at removing the openblas support, but that would unfortunately take a bit of time. It’s good to know the use case and what’s possible for other people. To anyone else who reads this: please feel free to ping me if you are looking for this, and we can see whether an update would work for you.

It should be possible since your network is fairly simple but the overall solution is very limited for very specific networks. A more general solution (which is unfortunately what the library would have to implement) with all the proper op coverage would take a bit of work.

We have a project we’re working on to address this: GitHub - KonduitAI/kompile: Kompile generates optimized machine learning pipelines usable from python, which I believe can help mitigate a lot of these issues.
Note that it works for other things too, including Python script execution, various models, OpenCV, etc.

The main thing it does is help produce binaries based on just the models people have. Anyone willing to give it a shot, feel free to ping me.

—To anybody who reads this thread later—
And semi-off-topic:
LIBSVM is incredibly small and accurate. Though it’s not so fast, it’s definitely better than the mess I’d previously made. Try it!

@huzpsb that’s great that you found a solution! Apologies for how things turned out. I’ll post a better solution when everything is up. We did manage to eliminate the openblas dependency with an independent backend. You can see how we eliminated it here: deeplearning4j/nd4j/nd4j-backends/nd4j-backend-impls/nd4j-minimizer at master · deeplearning4j/deeplearning4j · GitHub

It’s still not ideal though. This needs to be combined with our build tool in order to be a bit easier to use. Your use case here is definitely important and I hope we solve that better in the future.

For your alternative, I would suggest linking to the library you used. Was it LIBSVM’s Java bindings?

I’m so touched that you are still working on this!
The LIBSVM I just mentioned is linked here. As you can see, it’s a library rewritten in Java with no external dependencies, and it takes ~50KB together with the pretrained model (I used the one-class SVM algorithm, if you’d like to know).

I said it’s semi-off-topic because, to be exact, it’s a kind of machine learning but not a kind of deep learning.

Also, I’m looking forward to your new toolchain! Please keep me updated when it can be used in production~ Thank you!