Most efficient way to do inference?

temerick · March 12, 2020, 8:06pm

Are there any recommendations for best practices for using dl4j for large scale inference jobs? I see a handful of examples for training, but only a trivial example for inference.

I have a function that basically generates an iterator over image chips that I want to apply inference to, but I don’t know what the best practices are for turning that into something that my ComputationGraph can process.

Thanks!

treo · March 13, 2020, 7:42am

The most efficient way to run inference always depends on how your data arrives. If you already have all of it on your local machine, it is pretty easy: just run it with the largest batch size that your hardware can handle. That usually results in optimal utilization on single-socket systems. Note that if you have hyper-threading, it may not use all available logical cores, as doing so might actually result in slower execution.

temerick · March 13, 2020, 12:46pm

So currently, my data is being loaded into a single Java queue via multiple threads, and we run inference with something along the lines of y = classifierModel.output(false, x).

I see the other signatures for output, and I’m wondering if some of these options allow me to do things like put my data directly into pinned memory, or move the data to the gpu asynchronously.

treo · March 13, 2020, 12:58pm

So you already have preloading implemented, and have a queue of single inputs, right?
Take as many of those as your hardware can handle, and stack them on top of each other. So you get an input that is like a mini-batch during training. Let your model work on that.
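A minimal sketch of that batching step, using only the Java standard library. The class name QueueBatcher, the method nextBatch, and the flattened float[] representation of an input are assumptions made for illustration; they are not part of DL4J or of the original poster's code. With DL4J, the resulting batch would still have to be converted into an INDArray before being passed to the model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical helper: groups single inputs from a queue into mini-batches. */
final class QueueBatcher {

    /**
     * Blocks until at least one input is available, then takes up to
     * maxBatchSize inputs that are already queued and stacks them row by row.
     */
    static float[][] nextBatch(BlockingQueue<float[]> queue, int maxBatchSize) {
        List<float[]> items = new ArrayList<>(maxBatchSize);
        try {
            items.add(queue.take()); // wait for the first input
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for input", e);
        }
        // Take whatever else is ready right now, without waiting for more.
        queue.drainTo(items, maxBatchSize - 1);
        // Resulting shape: [batchSize][valuesPerInput]
        return items.toArray(new float[0][]);
    }

    public static void main(String[] args) {
        // Stand-in for the queue that the loader threads fill.
        BlockingQueue<float[]> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++) {
            queue.add(new float[] {i, i + 0.5f});
        }
        while (!queue.isEmpty()) {
            float[][] batch = nextBatch(queue, 4);
            // Assumption: this is where the batch would be turned into an
            // INDArray x and handed to classifierModel.output(false, x).
            System.out.println("batch of " + batch.length);
        }
    }
}
```

The maximum batch size (4 here) is arbitrary; in practice it would be tuned to whatever the hardware can handle. Draining only what is already queued keeps latency low when the loaders fall behind, at the cost of occasionally running a smaller batch.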

As you can see in DL4J Classification Speed - #16 by ethiel, @torstenbm was able to achieve over 200k classifications per second by batching his inputs.

If you meant the output(DataSetIterator) signature: This more or less just saves you the loop of iterating manually through your iterator at the moment.

But you can still get something close to what you were talking about with Workspaces.
Take a look at https://deeplearning4j.konduit.ai/config/config-memory/config-workspaces#iterators and https://deeplearning4j.konduit.ai/config/config-memory and the examples for them: https://github.com/eclipse/deeplearning4j-examples/blob/master/nd4j-examples/src/main/java/org/nd4j/examples/Nd4jEx15_Workspaces.java

temerick · March 13, 2020, 1:14pm

I’ll take a look. Thanks!
