Using the model learnt by the DenseNet example to classify new input

Hello,

I’ve run the DenseNet example from here: GitHub - eclipse/deeplearning4j-examples: Deeplearning4j Examples (DL4J, DL4J Spark, DataVec) on my own set of data. I let it run for a night on GPU and ended up with some checkpoint_XX_ComputationGraph.zip files in the model directory. I know I can load a ComputationGraph from these files, but are they enough to classify a directory full of completely new images? If yes, is there code somewhere for this? If not, how would I do it?

I assume you’re referencing:

You should look at the underlying dataset it was trained on to understand how to change the model. The example has everything you need, but you’ll have to adapt the model to fit the dataset you want to learn on. In this case you’ll want to look at the animals dataset that’s extracted to the local folder:

Well, yes. But how do I use the trained data from the “model” directory to classify completely new bears, deer, ducks and turtles stored in another directory? I’ve already trained on the data and have the ComputationGraph dumps!

Ssuukk,

Please check this PR (Fixed DenseNet model by lavajaw · Pull Request #1006 · eclipse/deeplearning4j-examples · GitHub). I created that DenseNet example a long time ago and recently noticed some mistakes in the model architecture.
So if you want to use the correct implementation, please check that PR.

I believe this PR will be approved in the future.

Have a nice day

@lavajaw thanks!

@ssuukk sorry, didn’t see this. You can classify with it as-is, out of the box. For any model, you have to look at what dataset it was trained on and what labels it already has. If those training examples were close to your problem, it would just work out of the box. Otherwise you would need to use the transfer learning API.
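
As a rough sketch of what the transfer learning route could look like for a loaded ComputationGraph: the vertex names “fc” and “output”, the nIn of 512, and the hyperparameters below are placeholders, not values from the actual DenseNet example, so substitute the names from your own graph.

import java.io.File;

import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.transferlearning.FineTuneConfiguration;
import org.deeplearning4j.nn.transferlearning.TransferLearning;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class AdaptCheckpoint {
  public static ComputationGraph adapt(File checkpoint) throws Exception {
    ComputationGraph pretrained = ComputationGraph.load(checkpoint, true);
    return new TransferLearning.GraphBuilder(pretrained)
        .fineTuneConfiguration(new FineTuneConfiguration.Builder()
            .updater(new Adam(1e-4))
            .seed(42)
            .build())
        .setFeatureExtractor("fc")              // freeze everything up to and including "fc" (name assumed)
        .removeVertexKeepConnections("output")  // drop the old classifier head (name assumed)
        .addLayer("output",
            new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .nIn(512)                       // width of the layer feeding the head (assumed)
                .nOut(4)                        // your new number of classes
                .activation(Activation.SOFTMAX)
                .build(),
            "fc")
        .build();
  }
}

You would then train the adapted graph on your own DataSetIterator the same way the example trains the original one.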

Heee, heee - I know the theory, but I was asking about exact code / implementation examples. Btw, it would be so much easier to navigate if all the classes had documentation available in the IDE, like Android classes have…

@ssuukk could you clarify what you mean? We publish the javadoc on Maven Central, and that allows IntelliJ to download it. Android Studio is based on IntelliJ. The same concepts should apply.

Well, when I do coding in Android, I can press “CTRL+Q” on any class/method/field and usually get nicely formatted docs for the highlighted thing. For DL4J I get close to nothing…

@ssuukk again, that’s not right. There’s likely a difference in default settings somewhere. Make sure IntelliJ downloads the javadoc from Maven:

Beyond that, it’s a case-by-case basis.

Aaaah, yes, that helped! Thanks. So what about a code example that classifies data using a trained model?

I have just reached this step myself. So far I have this code:

import java.io.File;
import java.io.IOException;

import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// Fields implied by the snippet: the lazily loaded model and two reusable input buffers.
private static ComputationGraph model;
private static INDArray inboard;
private static INDArray inglob;

public static INDArray compute(float[][][] b, float[] g) {
  if (model == null) {
    try {
      model = ComputationGraph.load(new File("data/model11830.zip"), true);
    } catch (IOException e) {
      throw new Error(e);
    }
    model.init();
    // The model was trained with batch size 256, so the buffers are allocated at that size.
    inboard = Nd4j.zeros(new int[]{256, 20, 9, 9}, DataType.FLOAT);
    inglob = Nd4j.zeros(new int[]{256, 11}, DataType.FLOAT);
  }
  // Only slice 0 carries a real example; the other 255 slices stay zero.
  inboard.putSlice(0, Nd4j.createFromArray(b));
  inglob.putSlice(0, Nd4j.createFromArray(g));
  INDArray ind = model.output(false, false, inboard, inglob)[0];
  return ind.reshape(256, 8, 9, 9);
}

It works, but it’s prohibitively slow, and my guess is that it’s the repeated sending of the 256x20x9x9 input to the GPU, and I need a way to change the batch size (currently 256) on the model now that I’m done training.

EDIT: Another question I have, what does init() do exactly? Do I need to call it if I’m loading a saved model?

To elaborate, the only thing in the model that forces me to explicitly specify the batch size is the ReshapeVertex:

      .addVertex("g2", new ReshapeVertex(256, 96, 1, 1), "g1")

Is there a way to make it dynamic?
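
One workaround I might try, assuming the transfer learning API lets me swap a vertex in an already-built graph, would be to rebuild it with a smaller batch size in the ReshapeVertex (4 below is just the batch size I’m targeting for now):

import org.deeplearning4j.nn.conf.graph.ReshapeVertex;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.transferlearning.TransferLearning;

// model: the ComputationGraph loaded from the saved zip, as in the snippet above.
ComputationGraph smallBatch = new TransferLearning.GraphBuilder(model)
    .removeVertexKeepConnections("g2")                      // the reshape vertex from my config
    .addVertex("g2", new ReshapeVertex(4, 96, 1, 1), "g1")  // same shape, batch size of 4
    .build();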

@Xom you generally would load images directly using the NativeImageLoader for inference. Don’t use floats. Use something like:

NativeImageLoader loader = new NativeImageLoader();
INDArray content = loader.asMatrix(new File("path/to/your/image"));
INDArray output = model.outputSingle(content);

(outputSingle above is the ComputationGraph API, not MultiLayerNetwork ^)

You can search for it throughout the examples: Search · NativeImageLoader · GitHub
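
Applied to the original question (a directory full of new images), a rough end-to-end sketch could look like the following; the 224x224x3 input size, the 0-1 scaling, and the label order are assumptions you would replace with whatever the model was actually trained on:

import java.io.File;
import java.util.Arrays;
import java.util.List;

import org.datavec.image.loader.NativeImageLoader;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;
import org.nd4j.linalg.factory.Nd4j;

public class ClassifyDirectory {
  public static void main(String[] args) throws Exception {
    ComputationGraph model =
        ComputationGraph.load(new File("model/checkpoint_XX_ComputationGraph.zip"), true);

    // Must match the input size and normalization the network was trained with (assumed here).
    NativeImageLoader loader = new NativeImageLoader(224, 224, 3);
    ImagePreProcessingScaler scaler = new ImagePreProcessingScaler(0, 1);
    List<String> labels = Arrays.asList("bear", "deer", "duck", "turtle");

    for (File image : new File("path/to/new/images").listFiles()) {
      INDArray input = loader.asMatrix(image);   // shape [1, channels, height, width]
      scaler.transform(input);
      INDArray probabilities = model.outputSingle(input);
      int predicted = Nd4j.argMax(probabilities, 1).getInt(0);
      System.out.println(image.getName() + " -> " + labels.get(predicted));
    }
  }
}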

The float[20][9][9] is a representation of a chess-like game state, not an image, and is coming from an ongoing Monte Carlo Tree Search, not from disk. My quoted code takes one such game state and feeds it to the neural net alongside 255 dummy inputs, because the net had been configured to expect batches of 256 examples. It was only a proof of concept, just to have something to run, and of course it was terribly slow. The next thing I did was modify the net to expect batches of 4 instead of 256, and parallelize the MCTS to send 4 game states instead of 1.

Next I will parallelize further. Is there a way to figure out what batch size has the highest throughput in an already-trained model other than trial and error?

Also, the only thing preventing the net from handling any batch size is the ReshapeVertex for which I have to specify the batch size as the first dimension of the shape specification. Is there a way to make it more flexible?

@Xom in that case I would recommend not using float arrays at all. Use ND4J throughout.
There’s overhead whenever data is copied between on-heap Java arrays and the off-heap buffers ND4J computes on.

You would see similar overhead doing that with NumPy and Python arrays.
I’m happy to take a look at your broader code base to make that happen if you want.
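
As a rough sketch of what that could look like with your shapes (the batch size and feature dimensions are taken from your snippet, everything else is assumed): preallocate the batch buffers once and have the search write feature values straight into them instead of building float arrays first.

import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class BatchBuffers {
  static final int BATCH = 4;   // the smaller batch size you mentioned

  // Preallocated off-heap buffers, reused for every forward pass instead of
  // creating new arrays and copying them on each call.
  static final INDArray inboard = Nd4j.zeros(DataType.FLOAT, BATCH, 20, 9, 9);
  static final INDArray inglob  = Nd4j.zeros(DataType.FLOAT, BATCH, 11);

  // The search writes each encoded feature straight into its slot of the batch.
  static void setBoardFeature(int slot, int plane, int row, int col, float value) {
    inboard.putScalar(new int[]{slot, plane, row, col}, value);
  }

  static void setGlobalFeature(int slot, int index, float value) {
    inglob.putScalar(new int[]{slot, index}, value);
  }
}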