DenseLayer (index=9, name=ffn0) nIn=0, nOut=3072; nIn and nOut must be > 0

Help, I have an error in my code.

public class Main {

private static final Logger log = LoggerFactory.getLogger(Main.class);

private static final int batchSize = 15;

public static void main(String[] args) throws Exception {
    long startTime = System.currentTimeMillis();

    File file = new File("C:\\Users\\whail\\IdeaProjects\\DLRecognizer\\deeplearning\\src\\main\\resources");

    File modelDir = new File(file.getPath() + "/model");

    // create directory
    if (!modelDir.exists()) { //noinspection ResultOfMethodCallIgnored
        modelDir.mkdirs();
    }

    //create model
    ComputationGraph model = createModel();

    //construct the iterator
    MultiDataSetIterator trainMulIterator = new MultiRecordDataSetIterator(batchSize, "6");
    MultiDataSetIterator testMulIterator = new MultiRecordDataSetIterator(batchSize, "test");

    //fit
    model.setListeners(new ScoreIterationListener(10), new EvaluativeListener(testMulIterator, 1, InvocationType.EPOCH_END));

    log.info("Epoch started");
    model.fit(trainMulIterator, 4);

    //save
    model.save(new File(modelDir.getPath() + "/model.zip"), true);
    long endTime = System.currentTimeMillis();

    System.out.println("=============run time===================== " + (endTime - startTime));

    System.out.println("=====eval model=====test==================");
    modelPredict(model, testMulIterator);
}

private static ComputationGraph createModel() {

    ComputationGraphConfiguration config = new NeuralNetConfiguration.Builder()
            .seed(123)
            .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer)
            .l2(1e-3)
            .updater(new Adam(1e-3))
            .weightInit( WeightInit.XAVIER_UNIFORM)
            .graphBuilder()
            .addInputs("trainFeatures")
            .setInputTypes(InputType.convolutional(50, 130, 3))
            .setOutputs("out1", "out2", "out3", "out4", "out5", "out6")
            .addLayer("cnn1",  new ConvolutionLayer.Builder(new int[]{5, 5}, new int[]{1, 1}, new int[]{0, 0})
                    .nIn(1).nOut(48).activation( Activation.RELU).build(), "trainFeatures")
            .addLayer("maxpool1",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{0, 0})
                    .build(), "cnn1")
            .addLayer("cnn2",  new ConvolutionLayer.Builder(new int[]{5, 5}, new int[]{1, 1}, new int[]{0, 0})
                    .nOut(64).activation( Activation.RELU).build(), "maxpool1")
            .addLayer("maxpool2",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,1}, new int[]{2, 1}, new int[]{0, 0})
                    .build(), "cnn2")
            .addLayer("cnn3",  new ConvolutionLayer.Builder(new int[]{3, 3}, new int[]{1, 1}, new int[]{0, 0})
                    .nOut(128).activation( Activation.RELU).build(), "maxpool2")
            .addLayer("maxpool3",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{0, 0})
                    .build(), "cnn3")
            .addLayer("cnn4",  new ConvolutionLayer.Builder(new int[]{4, 4}, new int[]{1, 1}, new int[]{0, 0})
                    .nOut(256).activation( Activation.RELU).build(), "maxpool3")
            .addLayer("maxpool4",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{0, 0})
                    .build(), "cnn4")
            .addLayer("ffn0",  new DenseLayer.Builder().nOut(3072)
                    .build(), "maxpool4")
            .addLayer("ffn1",  new DenseLayer.Builder().nOut(3072)
                    .build(), "ffn0")
            .addLayer("out1", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out2", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out3", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out4", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out5", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out6", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .build();

    // Construct and initialize model
    ComputationGraph model = new ComputationGraph(config);
    model.init();

    return model;
}

private static void modelPredict(ComputationGraph model, MultiDataSetIterator iterator) {
    int sumCount = 0;
    int correctCount = 0;

    while (iterator.hasNext()) {
        MultiDataSet mds = iterator.next();
        INDArray[]  output = model.output(mds.getFeatures());
        INDArray[] labels = mds.getLabels();
        int dataNum = Math.min(batchSize, output[0].rows());
        for (int dataIndex = 0;  dataIndex < dataNum; dataIndex ++) {
            StringBuilder reLabel = new StringBuilder();
            StringBuilder peLabel = new StringBuilder();
            INDArray preOutput;
            INDArray realLabel;
            for (int digit = 0; digit < 6; digit ++) {
                preOutput = output[digit].getRow(dataIndex);
                peLabel.append(Nd4j.argMax(preOutput).getInt(0));
                realLabel = labels[digit].getRow(dataIndex);
                reLabel.append(Nd4j.argMax(realLabel).getInt(0));
            }
            boolean equals = peLabel.toString().equals(reLabel.toString());
            if (equals) {
                correctCount ++;
            }
            sumCount ++;
            log.info("real image {}  prediction {} status {}", reLabel.toString(), peLabel.toString(), equals);
        }
    }
    iterator.reset();
    System.out.println("validate result : sum count =" + sumCount + " correct count=" + correctCount );
}

}

Error:
Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidConfigException: DenseLayer (index=9, name=ffn0) nIn=0, nOut=3072; nIn and nOut must be > 0
at org.deeplearning4j.nn.conf.layers.LayerValidation.assertNInNOutSet(LayerValidation.java:55)
at org.deeplearning4j.nn.conf.layers.DenseLayer.instantiate(DenseLayer.java:58)
at org.deeplearning4j.nn.conf.graph.LayerVertex.instantiate(LayerVertex.java:106)
at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:581)
at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:442)
at org.mishaneyt.dlrecognizer.Main.createModel(Main.java:125)
at org.mishaneyt.dlrecognizer.Main.main(Main.java:55)

@whaile-off You need to add setInputTypes to your configuration. Plenty of examples here:
https://github.com/search?q=repo%3Adeeplearning4j%2Fdeeplearning4j-examples%20%20setInputType&type=code

You’ll probably want InputType.convolutional

This will automatically set the number of inputs for each layer based on the number of outputs of the previous layer.
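A minimal sketch of the idea (the layer names and sizes below are illustrative, not taken from your code):

    ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
            .graphBuilder()
            .addInputs("in")
            .addLayer("conv", new ConvolutionLayer.Builder(3, 3).nOut(16)
                    .activation(Activation.RELU).build(), "in")
            .addLayer("dense", new DenseLayer.Builder().nOut(64).build(), "conv") // nIn inferred
            .addLayer("out", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "dense")
            .setOutputs("out")
            .setInputTypes(InputType.convolutional(50, 130, 3)) // height, width, channels
            .build();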

I have InputType.convolutional in my code

@whaile-off can you show your code where that is? I don't see it anywhere.

On line 87 I have ".setInputTypes(InputType.convolutional(50, 130, 3))"

Please help me, I do not know what to do.

@whaile-off Could you print model.summary() for me? There’s no reason why your nIn should be 0. For readability, try to keep your setInputTypes at the end.

As for you getting help, you’ll get it when I have time. It’s the weekend and you’re not paying for anything. If you want an SLA and guaranteed help, please see the paid offering: Guarantee the success of your AI deployment with Deeplearning4J

Where and how do I use model.summary()?

@whaile-off hmm… can you try doing it before init? If that throws an error, then I'll need another way of seeing what it thinks is zero…
I know it’s here:

    .addLayer("ffn0",  new DenseLayer.Builder().nOut(3072)
                    .build(), "maxpool4")
            .addLayer("ffn1",  new DenseLayer.Builder().nOut(3072)
                    .build(), "ffn0"

based on the node name. I’m not sure what comes before it though. Some value there is not being set. I’d need to know what that is.

Just in case, can you tell me what version you’re using? Is it M2.1?

I called model.summary(); here is the console output:

[main] INFO org.nd4j.linalg.factory.Nd4jBackend - Loaded [CpuBackend] backend
[main] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 4
[main] INFO org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - Binary level Generic x86 optimization level AVX512
[main] INFO org.nd4j.nativeblas.Nd4jBlas - Number of threads used for OpenMP BLAS: 4
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CPU]; OS: [Windows 11]
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [8]; Memory: [1,9GB];
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [OPENBLAS]
[main] INFO org.nd4j.linalg.cpu.nativecpu.CpuBackend - Backend build information:
GCC: "12.1.0"
STD version: 201103L
DEFAULT_ENGINE: samediff::ENGINE_CPU
HAVE_FLATBUFFERS
HAVE_OPENBLAS
[main] WARN org.deeplearning4j.nn.conf.inputs.InputType - Assigning height of 0. Normally this is not valid. Exceptions for this are generally relatedto model import and unknown dimensions
[main] WARN org.deeplearning4j.nn.conf.inputs.InputType - Assigning a size of zero. This is normally only valid in model import cases with unknown dimensions.
Exception in thread "main" java.lang.NullPointerException: Cannot load from object array because "this.vertices" is null
at org.deeplearning4j.nn.graph.ComputationGraph.summary(ComputationGraph.java:4389)
at org.deeplearning4j.nn.graph.ComputationGraph.summary(ComputationGraph.java:4348)
at org.mishaneyt.dlrecognizer.Main.createModel(Main.java:125)
at org.mishaneyt.dlrecognizer.Main.main(Main.java:55)

and yes it is M2.1

@whaile-off Ensure you change your padding:

     ComputationGraphConfiguration config = new NeuralNetConfiguration.Builder()
                .seed(123)
                .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer)
                .l2(1e-3)
                .updater(new Adam(1e-3))
                .weightInit( WeightInit.XAVIER_UNIFORM)
                .graphBuilder()
                .addInputs("trainFeatures")
                .setOutputs("out1", "out2", "out3", "out4", "out5", "out6")
                .addLayer("cnn1",  new ConvolutionLayer.Builder(new int[]{5, 5}, new int[]{1, 1}, new int[]{1, 1})
                        .nIn(1).nOut(48).activation( Activation.RELU).build(), "trainFeatures")
                .addLayer("maxpool1",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{1, 1})
                        .build(), "cnn1")
                .addLayer("cnn2",  new ConvolutionLayer.Builder(new int[]{5, 5}, new int[]{1, 1}, new int[]{1, 1})
                        .nOut(64).activation( Activation.RELU).build(), "maxpool1")
                .addLayer("maxpool2",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,1}, new int[]{2, 1}, new int[]{1, 1})
                        .build(), "cnn2")
                .addLayer("cnn3",  new ConvolutionLayer.Builder(new int[]{3, 3}, new int[]{1, 1}, new int[]{1, 1})
                        .nOut(128).activation( Activation.RELU).build(), "maxpool2")
                .addLayer("maxpool3",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{1, 1})
                        .build(), "cnn3")
                .addLayer("cnn4",  new ConvolutionLayer.Builder(new int[]{4, 4}, new int[]{1, 1}, new int[]{0, 0})
                        .nOut(256).activation( Activation.RELU).build(), "maxpool3")
                .addLayer("maxpool4",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{1, 1})
                        .build(), "cnn4")
                .addLayer("ffn0",  new DenseLayer.Builder().nOut(3072)
                        .build(), "maxpool4")
                .addLayer("ffn1",  new DenseLayer.Builder().nOut(3072)
                        .build(), "ffn0")
                .addLayer("out1", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out2", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out3", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out4", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out5", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out6", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .setInputTypes(InputType.convolutional(50, 130, 3))

                .build();

        // Construct and initialize model
        ComputationGraph model = new ComputationGraph(config);
        model.init();
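Once that builds, model.summary() can be called on the initialized graph right after the init() above to double-check the per-layer shapes; the earlier NullPointerException came from calling it before init(), when the graph's internal vertices are still null. A minimal continuation:

    log.info(model.summary()); // one row per vertex with its nIn/nOut and parameter counts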

The configuration just had invalid convolution values. If you don't know how to set up a CNN, I would not start with such a complex network. I don't know what task you're doing, but a six-output network seems a bit much, and I'm not even sure why you'd do that. If you're trying to build a classifier, maybe start with a simple MultiLayerNetwork so you can track the inputs and outputs instead?

When building a network, you should still track what the size at each layer will be.
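As a quick illustration of that kind of tracking, here is a plain-Java sketch (not a DL4J API) that walks the height dimension of the original zero-padding configuration through the standard truncate-mode formula out = floor((in + 2*pad - kernel) / stride) + 1. It collapses to 0 at cnn4, which is exactly why ffn0 ended up with nIn = 0:

    // Plain-Java sketch, not DL4J API: track the spatial size through the stack.
    public class ConvSizeCheck {
        static int out(int in, int kernel, int stride, int pad) {
            return (in + 2 * pad - kernel) / stride + 1; // integer division acts as floor here
        }

        public static void main(String[] args) {
            int h = 50;          // input height from InputType.convolutional(50, 130, 3)
            h = out(h, 5, 1, 0); // cnn1     -> 46
            h = out(h, 2, 2, 0); // maxpool1 -> 23
            h = out(h, 5, 1, 0); // cnn2     -> 19
            h = out(h, 2, 2, 0); // maxpool2 -> 9  (height kernel/stride are 2)
            h = out(h, 3, 1, 0); // cnn3     -> 7
            h = out(h, 2, 2, 0); // maxpool3 -> 3
            h = out(h, 4, 1, 0); // cnn4     -> 0, so the dense layer sees nIn = 0
            System.out.println("height entering ffn0: " + h);
        }
    }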

I need to write a neural network to recognize any text from a picture. I used the code that you gave me and everything worked, but now I have another error:

[main] INFO org.deeplearning4j.nn.graph.ComputationGraph - Starting ComputationGraph with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
[main] INFO org.mishaneyt.dlrecognizer.Main - Epoch started
[main] INFO org.deeplearning4j.optimize.listeners.ScoreIterationListener - Score at iteration 0 is NaN
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Starting evaluation nr. 1
[AMDSI prefetch thread] ERROR org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader - the next function shows error
java.lang.IllegalStateException: Cannot merge MultiDataSets with different number of output arrays: toMerge[0] has 9 output arrays; toMerge[1] has 14 arrays
at org.nd4j.linalg.dataset.MultiDataSet.merge(MultiDataSet.java:474)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.convertDataSet(MulRecordDataLoader.java:122)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.next(MulRecordDataLoader.java:127)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:32)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:72)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:12)
at org.nd4j.linalg.dataset.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:354)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Reporting evaluation results:
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Evaluation:
Evaluation: No data available (no evaluation has been performed)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Starting evaluation nr. 2
[AMDSI prefetch thread] ERROR org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader - the next function shows error
java.lang.IllegalStateException: Cannot merge MultiDataSets with different number of output arrays: toMerge[0] has 9 output arrays; toMerge[1] has 14 arrays
at org.nd4j.linalg.dataset.MultiDataSet.merge(MultiDataSet.java:474)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.convertDataSet(MulRecordDataLoader.java:122)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.next(MulRecordDataLoader.java:127)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:32)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:72)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:12)
at org.nd4j.linalg.dataset.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:354)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Reporting evaluation results:
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Evaluation:
Evaluation: No data available (no evaluation has been performed)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Starting evaluation nr. 3
[AMDSI prefetch thread] ERROR org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader - the next function shows error
java.lang.IllegalStateException: Cannot merge MultiDataSets with different number of output arrays: toMerge[0] has 9 output arrays; toMerge[1] has 14 arrays
at org.nd4j.linalg.dataset.MultiDataSet.merge(MultiDataSet.java:474)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.convertDataSet(MulRecordDataLoader.java:122)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.next(MulRecordDataLoader.java:127)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:32)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:72)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:12)
at org.nd4j.linalg.dataset.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:354)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Reporting evaluation results:
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Evaluation:
Evaluation: No data available (no evaluation has been performed)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Starting evaluation nr. 4
[AMDSI prefetch thread] ERROR org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader - the next function shows error
java.lang.IllegalStateException: Cannot merge MultiDataSets with different number of output arrays: toMerge[0] has 14 output arrays; toMerge[1] has 9 arrays
at org.nd4j.linalg.dataset.MultiDataSet.merge(MultiDataSet.java:474)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.convertDataSet(MulRecordDataLoader.java:122)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.next(MulRecordDataLoader.java:127)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:32)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:72)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:12)
at org.nd4j.linalg.dataset.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:354)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Reporting evaluation results:
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Evaluation:
Evaluation: No data available (no evaluation has been performed)

Here is another class of my network. To tell the truth, I don't quite understand why it is needed; I took it from the examples on GitHub. I should probably clarify that my dataset consists of pictures named (text in the picture).png

package org.mishaneyt.dlrecognizer.dataclasses;

import org.apache.commons.io.FileUtils;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.transform.ImageTransform;
import org.nd4j.linalg.api.concurrency.AffinityManager;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.MultiDataSet;
import org.nd4j.linalg.factory.Nd4j;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.File;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Description: This is a DataLoader for multi-digit number recognition.
 * The maximum length is 8 digits.
 * If it is less than 8 digits, then zero is added to the end.
 * The images of digits are resized to 60*160,
 * and the black border of the image is cropped.
 * Using the current architecture and hyperparameters, the accuracy of the best model
 * prediction on the validation set is (215-227) / 248 with different epochs;
 * of course, if you're interested, you can continue to optimize.
 *
 * @author WangFeng
 */
public class MulRecordDataLoader extends NativeImageLoader implements Serializable {

    private static final Logger log = LoggerFactory.getLogger(MulRecordDataLoader.class);

    private static int height = 50; //!!!
    private static int width = 130; //!!!
    private static int channels = 3; //!!!
    private File fullDir = null;
    private Iterator<File> fileIterator;
    private int numExample = 0;

    public MulRecordDataLoader(String dataSetType) {
        this(height, width, channels, null, dataSetType);
    }

    public MulRecordDataLoader(ImageTransform imageTransform, String dataSetType) {
        this(height, width, channels, imageTransform, dataSetType);
    }

    public MulRecordDataLoader(int height, int width, int channels, ImageTransform imageTransform, String dataSetType) {
        super(height, width, channels, imageTransform);
        this.height = height;
        this.width = width;
        this.channels = channels;

        try {
            this.fullDir = fullDir != null && fullDir.exists() ? fullDir : new File("C:\\Users\\whail\\IdeaProjects\\HDWcaptchaGenerator\\vk-captcha");
        } catch (Exception e) {
            // log.error("the datasets directory failed, plz checking", e);
            throw new RuntimeException(e);
        }
        this.fullDir = new File(fullDir, dataSetType);

        if (!fullDir.exists()) {
            fullDir.mkdir();
        }
        load();
    }

    protected void load() {
        try {
            // Wrap the collection returned by listFiles so it can be shuffled
            List<File> dataFiles = new ArrayList<>(FileUtils.listFiles(fullDir, new String[]{"png"}, true));
            Collections.shuffle(dataFiles);
            fileIterator = dataFiles.iterator();
            numExample = dataFiles.size();
        } catch (Exception var4) {
            throw new RuntimeException(var4);
        }
    }

    public MultiDataSet convertDataSet(int num) throws Exception {
        int batchNumCount = 0;
        List<MultiDataSet> multiDataSets = new ArrayList<>();

        while (batchNumCount != num && fileIterator.hasNext()) {
            File image = fileIterator.next();
            String imageName = image.getName().substring(0, image.getName().lastIndexOf('.'));

            // Pad the file name with leading zeros if it is shorter than 6 characters
            while (imageName.length() < 6) {
                imageName = "0" + imageName;
            }

            String[] imageNames = imageName.split("");
            INDArray feature = asMatrix(image);
            INDArray[] features = new INDArray[]{feature};

            // Create the labels array dynamically, one entry per character of the file name
            int numDigits = imageNames.length;
            INDArray[] labels = new INDArray[numDigits];

            Nd4j.getAffinityManager().ensureLocation(feature, AffinityManager.Location.DEVICE);

            // Regular expression that matches only digit characters
            Pattern digitPattern = Pattern.compile("\\d");

            for (int i = 0; i < numDigits; i++) {
                Matcher matcher = digitPattern.matcher(imageNames[i]);
                if (matcher.find()) {
                    int digit = Integer.parseInt(matcher.group());
                    labels[i] = Nd4j.zeros(1, 10).putScalar(new int[]{0, digit}, 1);
                } else {
                    // If the character is not a digit, create an all-zero label
                    labels[i] = Nd4j.zeros(1, 10);
                }
            }

            multiDataSets.add(new MultiDataSet(features, labels));
            batchNumCount++;
        }
        return MultiDataSet.merge(multiDataSets);
    }

    public MultiDataSet next(int batchSize) {
        try {
            return convertDataSet(batchSize);
        } catch (Exception e) {
            log.error("the next function shows error", e);
        }
        return null;
    }

    public void reset() {
        load();
    }

    public int totalExamples() {
        return numExample;
    }
}

@whaile-off where did you get the idea to extend the NativeImageLoader exactly? I don’t think I’ve ever seen anyone do that. Usually, people use the ImageRecordReader.
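For reference, the usual single-label image pipeline looks roughly like this; a sketch only, with a hypothetical data directory and class count, and labels taken from the parent folder name (so it does not cover the multi-output captcha setup as-is):

    // Sketch: standard DataVec pipeline for a single-output image classifier.
    ImageRecordReader recordReader = new ImageRecordReader(50, 130, 3, new ParentPathLabelGenerator());
    recordReader.initialize(new FileSplit(new File("path/to/train"),
            NativeImageLoader.ALLOWED_FORMATS, new Random(123)));
    DataSetIterator trainIter = new RecordReaderDataSetIterator(recordReader, batchSize, 1, 10); // 10 = hypothetical class count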

Beyond that, you'll need to make sure everything is the same size. A MultiDataSet holds one ndarray per network input, not just a single minibatch array. Each "input" here corresponds to one of the inputs you declared on your network.

You'll need to make sure that all of the ndarrays that are part of it line up correctly.
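To make that concrete for the merge error above (the 9 vs. 14 label arrays come from building one label per character of the file name): here is an illustration only, not from the thread, of pinning the label count to the network's six outputs so every MultiDataSet merges cleanly. It reuses the INDArray/Nd4j imports already in the loader:

    // Illustration only: fix the number of label arrays to the number of network outputs.
    private static final int NUM_OUTPUTS = 6;   // must match setOutputs("out1", ..., "out6")
    private static final int NUM_CLASSES = 10;  // digits 0-9

    private static INDArray[] buildLabels(String imageName) {
        INDArray[] labels = new INDArray[NUM_OUTPUTS];
        for (int i = 0; i < NUM_OUTPUTS; i++) {
            labels[i] = Nd4j.zeros(1, NUM_CLASSES);
            if (i < imageName.length() && Character.isDigit(imageName.charAt(i))) {
                labels[i].putScalar(new int[]{0, imageName.charAt(i) - '0'}, 1.0);
            }
            // positions past the end of the name stay all-zero; extra characters are dropped
        }
        return labels;
    }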

I didn't quite understand what you said. As I said, most of the code is an example from GitHub. Could you explain more specifically what I need to do, and how, to make everything work? :slight_smile:

Keep in mind that only GitHub - deeplearning4j/deeplearning4j-examples: Deeplearning4j Examples (DL4J, DL4J Spark, DataVec) contains the official examples. If you pick up code from anywhere else, there is a good chance that the example is outdated, wrong, or both.

I used https://github.com/deeplearning4j/deeplearning4j-examples/tree/051c59bd06b38ed39ca92f5940a6ca43b0f34c0f/dl4j-examples/src/main/java/org/deeplearning4j/examples/advanced/modelling/captcharecognition as a basis, but I still don't know how to do what I want. As I said, I need to make a neural network that recognizes any text from a picture and learns from a dataset of 130-by-50 pictures named (text in the picture).png. I will be very happy if you help with this.

@whaile-off try to build a simpler network first. One with multiple inputs does not seem like a very good starting point when you’re just learning.

If you don’t know what I mean by multiple inputs, that’s already going to be a problem if you want to apply anything.

An “input” is something you need to have a concept of if you want to use the more advanced network.

What I mean is you have a graph that looks like this:

input1 → image1 → conv1 -
input2 → image2 → conv2 - merge → output
input3 → image3 → conv3 -
input4 → image4 → conv4 - …

Each of these is what I like to call a "path" through the network. At some point the paths will merge, whether that's at your final outputs followed by some concatenation or otherwise. That is the whole network.

The network you're using would potentially take several images in at once. To work with this, you need to understand that each of these images may have a different shape. A MultiDataSet is one unified object that contains the inputs and labels for each input to the network. Similarly, you'd have one output for each input. Therefore a MultiDataSet underneath would look like:

image1, label for image 1
image2, label for image 2
image3, label for image 3
image4, label for image 4
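In code terms, a sketch of one such object for the graph in this thread (one declared input, six declared outputs), using placeholder arrays:

    // Sketch: one feature array per addInputs(...) entry, one label array per setOutputs(...) entry.
    INDArray image = Nd4j.zeros(1, 3, 50, 130);   // [minibatch, channels, height, width]
    INDArray[] features = new INDArray[]{image};
    INDArray[] labels = new INDArray[6];
    for (int i = 0; i < 6; i++) {
        labels[i] = Nd4j.zeros(1, 10);            // one-hot over 10 classes per output
    }
    MultiDataSet mds = new MultiDataSet(features, labels);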

I understand, but can I order a neural network from you that will be able to read any text from a picture in Russian, English, and numbers?

@whaile-off I’m not sure what you mean by “order”. If you want to buy consulting, please refer to the form I gave you. We typically only work with large teams though.

If you want this done for you, you will not be getting that. This library and the help on this forum are for people willing to learn how to solve the problem themselves with some guidance. My recommendation was for you to work on this one step at a time and to try to break the problem down. If you just want "copy, paste, and go without learning", then this probably isn't the place for you. You might want to use a ready-made command-line tool and library like Tesseract rather than build your own OCR tool.

If you're ever ready to follow my suggestion and try to learn this on your own, I am ready to walk you through some of the basic steps needed. So far I haven't seen that you're willing to follow any of the advice. I am not here to give you quick copy-and-paste help for free.