Can't feed DataSet into model trained on MNIST

Code:

    MultiLayerNetwork loadedNet = MultiLayerNetwork.load(model, false);

    ImageRecordReader imageRecordReader = new ImageRecordReader(28, 28, new PatternPathLabelGenerator("."));
    File digitsDir = new File("/Users/xxxx/Downloads/RecognizeNumbers/src/main/resources/digits/");
    imageRecordReader.initialize(new FileSplit(digitsDir, NativeImageLoader.ALLOWED_FORMATS, new Random()));
    String[] labels = digitsDir.list();
    imageRecordReader.setLabels(List.of(labels));

    RecordReaderDataSetIterator customData = new RecordReaderDataSetIterator.Builder(imageRecordReader, 32)
            .preProcessor(new ImagePreProcessingScaler())
            .classification(1, numOutputLablels)
            .build();
    //(imageRecordReader, 10,1, numOutputLablels)
    while (customData.hasNext()) {
        DataSet nextOne = customData.next();
        INDArray output = loadedNet.output(nextOne.getFeatures());
        INDArray label = nextOne.getLabels();
        // Output and labels are [batch, classes], so take the argmax along dimension 1.
        INDArray preOutput = Nd4j.argMax(output, 1);
        INDArray realLabel = Nd4j.argMax(label, 1);
        StringBuilder peLabel = new StringBuilder();
        StringBuilder reLabel = new StringBuilder();
        // One predicted digit and one real digit per example in the batch.
        for (int dataIndex = 0; dataIndex < preOutput.length(); dataIndex++) {
            peLabel.append(preOutput.getInt(dataIndex));
            reLabel.append(realLabel.getInt(dataIndex));
        }
        System.out.println("prediction: " + peLabel + " real file title: " + reLabel);
    }

Error:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0
	at org.datavec.api.io.labels.PatternPathLabelGenerator.getLabelForPath(PatternPathLabelGenerator.java:53)
	at org.datavec.api.io.labels.PatternPathLabelGenerator.getLabelForPath(PatternPathLabelGenerator.java:58)
	at org.datavec.image.recordreader.BaseImageRecordReader.initialize(BaseImageRecordReader.java:150)
	at Main.main(Main.java:107)

Line 107:
imageRecordReader.initialize(new FileSplit(digitsDir, NativeImageLoader.ALLOWED_FORMATS, new Random()));

All file names are in the following format:
(number).png
The number ranges from 0 to 9, just like the class labels. The program has full access to this directory and all files within it. What am I doing wrong, and should I also be using .predict() with the network instead?
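A note on the likely cause of the first exception, as a sketch rather than a confirmed diagnosis: the stack trace is consistent with PatternPathLabelGenerator splitting the file's base name on the given pattern as a regular expression and taking the first piece. That internal behavior is an assumption here. What is certain is how Java's String.split treats ".": it is a wildcard that matches every character, so nothing but empty pieces remain and the result is an empty array. The class and method names below are hypothetical and only illustrate the split behavior.

```java
import java.util.Arrays;

/** Shows how String.split treats "." as a regex wildcard. Names are hypothetical. */
final class SplitLabelDemo {

    /** Splits a base file name (extension already removed) on the given regex. */
    static String[] splitBaseName(String baseName, String regex) {
        return baseName.split(regex);
    }

    public static void main(String[] args) {
        // "." matches every character, so only empty pieces remain and split drops them.
        System.out.println(Arrays.toString(splitBaseName("3", ".")));
        // An escaped dot is a literal dot; "3" contains none, so the whole name survives.
        System.out.println(Arrays.toString(splitBaseName("3", "\\.")));
        // A hypothetical name such as "3_a" split on "_" yields the digit at index 0.
        System.out.println(Arrays.toString(splitBaseName("3_a", "_")));
    }
}
```

If the assumption holds, indexing the empty array at position 0 would produce exactly "Index 0 out of bounds for length 0", and a pattern that is a literal separator (or an escaped dot) would avoid it.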

UPDATE: By changing label generators, I managed to make progress in the form of another error:

Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidInputException: Input that is not a matrix; expected matrix (rank 2), got rank 4 array with shape [10, 1, 28, 28]. Missing preprocessor or wrong input type? (layer name: layer0, layer index: 0, layer type: DenseLayer)
	at org.deeplearning4j.nn.layers.BaseLayer.preOutputWithPreNorm(BaseLayer.java:313)
	at org.deeplearning4j.nn.layers.BaseLayer.preOutput(BaseLayer.java:296)
	at org.deeplearning4j.nn.layers.BaseLayer.activate(BaseLayer.java:344)
	at org.deeplearning4j.nn.layers.AbstractLayer.activate(AbstractLayer.java:262)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.outputOfLayerDetached(MultiLayerNetwork.java:1349)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2467)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2430)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2421)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2408)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.predict(MultiLayerNetwork.java:2270)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.predict(MultiLayerNetwork.java:2286)
	at Main.main(Main.java:126)

I need to investigate the input format of the MNIST data set. It now comes down to converting the images into the array format that the network was originally trained on.

@quan-thecreator Sorry for the late reply (holidays, etc.). MNIST can be fed in two modes: either as a flat 1 x 784 vector or as a convolutional 4D input. The error comes down to which of the two your network expects.

For the flat input you typically use dense layers as the first layers. For the second approach you need a properly created conv layer with the right inputs. That is typically auto-configured for you as long as you set the input type. You aren’t showing your network; if you can do that, I can help you better. Thanks!


Is there an example in the docs or on GitHub? I’m not using convolution yet, so the former method (1 x 784 flat) should work.

Thank you!
Sorry for the late reply, school swept me off my feet.

@quan-thecreator Your error indicates an issue with the way you set up the folders. We have tons of MNIST examples, but it’s hard to tell what your structure is. Are you trying to use the folder layout we have in the examples? E.g., here: deeplearning4j-examples/LeNetMNISTReLu.java at 686db99fee3d4825ee70663e1a15aa8d6216f2c2 · deeplearning4j/deeplearning4j-examples · GitHub

The image record reader reads in everything as a standard image data set. It returns a 4D array in [number of examples (batch size), channels, height, width] format.
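To make the shape mismatch concrete, here is a minimal plain-Java sketch of what flattening a [batch, channels, height, width] batch into [batch, 784] rows means. It uses ordinary arrays instead of INDArray so it runs without any dependencies, and the class and method names are hypothetical. It assumes row-major ordering and that the loaded network was trained on flat 784-value rows with the same pixel order and scaling, which the thread does not confirm.

```java
/** Flattens image batches from [batch][channels][height][width] to [batch][channels*height*width]. */
final class FlattenDemo {

    static float[][] flatten(float[][][][] images) {
        int batch = images.length;
        if (batch == 0) {
            return new float[0][0]; // nothing to flatten
        }
        int channels = images[0].length;
        int height = images[0][0].length;
        int width = images[0][0][0].length;
        float[][] flat = new float[batch][channels * height * width];
        for (int b = 0; b < batch; b++) {
            for (int c = 0; c < channels; c++) {
                for (int h = 0; h < height; h++) {
                    for (int w = 0; w < width; w++) {
                        // Row-major position of pixel (c, h, w) inside the flat row.
                        flat[b][(c * height + h) * width + w] = images[b][c][h][w];
                    }
                }
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        // Same shape as in the error message: [10, 1, 28, 28].
        float[][][][] batch = new float[10][1][28][28];
        batch[0][0][1][2] = 1.0f;
        float[][] flat = flatten(batch);
        System.out.println(flat.length + " x " + flat[0].length); // prints "10 x 784"
    }
}
```

With ND4J arrays the same idea would be a reshape of the features, along the lines of features.reshape(features.size(0), 784), before passing them to output() or predict().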

You have a big problem here: it looks like you’re attempting to use an MNIST hello world, even though MNIST itself actually isn’t 2D and comes in a very strange format. Using dense layers + flatten is also WAY off any sort of real-world computer vision.

Our record reader there mainly reflects the actual common usage for people building actual computer vision datasets solving actual problems.

MNIST is just a grayscale dataset. While dense layers work for that, it’s not really what you’re supposed to use once you move to a real problem. Dense layers are just the easiest to understand when starting out.

I would just recommend using our standard MNIST iterator if that’s what you want. You can find that here: deeplearning4j-examples/LeNetMNIST.java at 686db99fee3d4825ee70663e1a15aa8d6216f2c2 · deeplearning4j/deeplearning4j-examples · GitHub