Image Inputs for ImageRecordReader

Hey everyone! I’m a student working on a final project for my undergrad machine learning course, and I’m trying to use DL4J for CNN image classification on this dataset:

https://www.kaggle.com/gpiosenka/100-bird-species#BIRDS-150.txt.

I’ve gotten the network to run well on the CIFAR-10 dataset, but I’m having a little trouble switching over to this new bird species one, since there’s no built-in DataSetIterator for it like there was for CIFAR-10. How does the image directory need to be laid out to work with the ImageRecordReader? The dataset is currently saved in the following layout:

downloads\180-bird-species\train\(BIRD_NAME)\001.jpg
or
downloads\180-bird-species\test\(BIRD_NAME)\001.jpg,
where each of the train, test, and validation folders contains one subfolder per species, 180 species in total.

Thanks for your help!

Take a look at this section: https://deeplearning4j.konduit.ai/datavec/overview#reading-records-iterating-over-data

And then take a look at this part of the mnist classifier example:
https://github.com/eclipse/deeplearning4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/convolution/mnist/MnistClassifier.java#L94-L112

The first link should tell you how it is supposed to be used generally, and the second link shows you how it is used to load MNIST train and test data that is effectively organized exactly like your bird species data.

The only real difference is that you’ve got 3 channels instead of 1 and that your width and height values are different.
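In case it helps to sanity-check the folder layout before pointing the ImageRecordReader at it, here is a rough, plain-Java sketch (no DL4J involved). It mimics what ParentPathLabelGenerator does in the MNIST example, which is to take the name of an image's parent directory as its label, and counts the images found per label. The class name LayoutCheck, its method names, and the extension list are made up for illustration; the extension list in particular is an assumption and not the same thing as NativeImageLoader.ALLOWED_FORMATS.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper, not part of DL4J: checks a <split>/<LABEL>/<image> layout.
class LayoutCheck {

    // Assumed set of image extensions, for illustration only.
    static final Set<String> IMAGE_EXTENSIONS =
            new HashSet<>(Arrays.asList("jpg", "jpeg", "png", "bmp"));

    // The label is the name of the directory that directly contains the image.
    static String labelFor(Path imageFile) {
        return imageFile.getParent().getFileName().toString();
    }

    // True if the file name ends in one of the assumed image extensions.
    static boolean isImage(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        int dot = name.lastIndexOf('.');
        return dot >= 0 && IMAGE_EXTENSIONS.contains(name.substring(dot + 1));
    }

    // Walks one split folder (e.g. train) and counts images per label.
    static Map<String, Integer> countImagesPerLabel(Path splitRoot) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (Stream<Path> files = Files.walk(splitRoot)) {
            files.filter(Files::isRegularFile)
                 .filter(LayoutCheck::isImage)
                 .forEach(p -> counts.merge(labelFor(p), 1, Integer::sum));
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0 || !Files.isDirectory(Paths.get(args[0]))) {
            System.out.println("usage: java LayoutCheck.java <path to train or test folder>");
            return;
        }
        Map<String, Integer> counts = countImagesPerLabel(Paths.get(args[0]));
        System.out.println(counts.size() + " labels found");
        counts.forEach((label, n) -> System.out.println(label + ": " + n));
    }
}
```

If the number of labels printed for the train and test folders matches the number of classes you pass to the RecordReaderDataSetIterator, the layout should be in the shape the MNIST example expects.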

This worked perfectly, thank you!

EDIT: It seems that the ImageRecordReader is ending up in an indeterminate state:

    Exception in thread "main" java.lang.IllegalStateException: Indeterminant state: record must not be null, or a file iterator must exist
        at org.datavec.image.recordreader.BaseImageRecordReader.hasNext(BaseImageRecordReader.java:279)
        at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.hasNext(RecordReaderDataSetIterator.java:446)
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.doEvaluationHelper(MultiLayerNetwork.java:3392)
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.doEvaluation(MultiLayerNetwork.java:3384)
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.evaluate(MultiLayerNetwork.java:3578)
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.evaluate(MultiLayerNetwork.java:3488)
        at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.evaluate(MultiLayerNetwork.java:3319)
        at org.example.App.<init>(App.java:152)
        at org.example.App.main(App.java:170)

Is there anything I can do to fix this? According to the UIServer, training ran for about 1500 iterations before this happened.

Please share the code you are using to define your test data set iterator.

According to the stack trace, it happens when you call .evaluate, which uses your test data iterator.

Configuring the testing data:

    testData = new File("D:\\cam29\\Downloads\\100-bird-species\\180\\test");
    FileSplit testSplit = new FileSplit(testData, NativeImageLoader.ALLOWED_FORMATS, new Random());
    ImageRecordReader rrTest = new ImageRecordReader(224,224,3, labelMaker);
    rr.initialize(testSplit);
    DataSetIterator testIterator = new RecordReaderDataSetIterator(rrTest, 64, 1, 180);
    testIterator.setPreProcessor(imageScaler);

And calling for the evaluation after network.fit(trainIterator, numEpochs):

    Evaluation evaluation = network.evaluate(testIterator);

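That is your problem: the line rr.initialize(testSplit) initializes rr instead of rrTest, so rrTest is never initialized before the iterator uses it. You should have used rrTest there, i.e. rrTest.initialize(testSplit).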

Thank you, can’t believe I missed that!