DenseLayer (index=9, name=ffn0) nIn=0, nOut=3072; nIn and nOut must be > 0

Help, I have an error in my code.

public class Main {

private static final Logger log = LoggerFactory.getLogger(Main.class);

private static final int batchSize = 15;

public static void main(String[] args) throws Exception {
    long startTime = System.currentTimeMillis();

    File file = new File("C:\\Users\\whail\\IdeaProjects\\DLRecognizer\\deeplearning\\src\\main\\resources");

    File modelDir = new File(file.getPath() + "/model");

    // create directory
    if (!modelDir.exists()) { //noinspection ResultOfMethodCallIgnored
        modelDir.mkdirs();
    }

    //create model
    ComputationGraph model = createModel();

    //construct the iterator
    MultiDataSetIterator trainMulIterator = new MultiRecordDataSetIterator(batchSize, "6");
    MultiDataSetIterator testMulIterator = new MultiRecordDataSetIterator(batchSize, "test");

    //fit
    model.setListeners(new ScoreIterationListener(10), new EvaluativeListener(testMulIterator, 1, InvocationType.EPOCH_END));

    log.info("Epoch started");
    model.fit(trainMulIterator, 4);

    //save
    model.save(new File(modelDir.getPath() + "/model.zip"), true);
    long endTime = System.currentTimeMillis();

    System.out.println("=============run time===================== " + (endTime - startTime));

    System.out.println("=====eval model=====test==================");
    modelPredict(model, testMulIterator);
}

private static ComputationGraph createModel() {

    ComputationGraphConfiguration config = new NeuralNetConfiguration.Builder()
            .seed(123)
            .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer)
            .l2(1e-3)
            .updater(new Adam(1e-3))
            .weightInit( WeightInit.XAVIER_UNIFORM)
            .graphBuilder()
            .addInputs("trainFeatures")
            .setInputTypes(InputType.convolutional(50, 130, 3))
            .setOutputs("out1", "out2", "out3", "out4", "out5", "out6")
            .addLayer("cnn1",  new ConvolutionLayer.Builder(new int[]{5, 5}, new int[]{1, 1}, new int[]{0, 0})
                    .nIn(1).nOut(48).activation( Activation.RELU).build(), "trainFeatures")
            .addLayer("maxpool1",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{0, 0})
                    .build(), "cnn1")
            .addLayer("cnn2",  new ConvolutionLayer.Builder(new int[]{5, 5}, new int[]{1, 1}, new int[]{0, 0})
                    .nOut(64).activation( Activation.RELU).build(), "maxpool1")
            .addLayer("maxpool2",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,1}, new int[]{2, 1}, new int[]{0, 0})
                    .build(), "cnn2")
            .addLayer("cnn3",  new ConvolutionLayer.Builder(new int[]{3, 3}, new int[]{1, 1}, new int[]{0, 0})
                    .nOut(128).activation( Activation.RELU).build(), "maxpool2")
            .addLayer("maxpool3",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{0, 0})
                    .build(), "cnn3")
            .addLayer("cnn4",  new ConvolutionLayer.Builder(new int[]{4, 4}, new int[]{1, 1}, new int[]{0, 0})
                    .nOut(256).activation( Activation.RELU).build(), "maxpool3")
            .addLayer("maxpool4",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{0, 0})
                    .build(), "cnn4")
            .addLayer("ffn0",  new DenseLayer.Builder().nOut(3072)
                    .build(), "maxpool4")
            .addLayer("ffn1",  new DenseLayer.Builder().nOut(3072)
                    .build(), "ffn0")
            .addLayer("out1", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out2", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out3", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out4", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out5", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .addLayer("out6", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
            .build();

    // Construct and initialize model
    ComputationGraph model = new ComputationGraph(config);
    model.init();

    return model;
}

private static void modelPredict(ComputationGraph model, MultiDataSetIterator iterator) {
    int sumCount = 0;
    int correctCount = 0;

    while (iterator.hasNext()) {
        MultiDataSet mds = iterator.next();
        INDArray[]  output = model.output(mds.getFeatures());
        INDArray[] labels = mds.getLabels();
        int dataNum = Math.min(batchSize, output[0].rows());
        for (int dataIndex = 0;  dataIndex < dataNum; dataIndex ++) {
            StringBuilder reLabel = new StringBuilder();
            StringBuilder peLabel = new StringBuilder();
            INDArray preOutput;
            INDArray realLabel;
            for (int digit = 0; digit < 6; digit ++) {
                preOutput = output[digit].getRow(dataIndex);
                peLabel.append(Nd4j.argMax(preOutput).getInt(0));
                realLabel = labels[digit].getRow(dataIndex);
                reLabel.append(Nd4j.argMax(realLabel).getInt(0));
            }
            boolean equals = peLabel.toString().equals(reLabel.toString());
            if (equals) {
                correctCount ++;
            }
            sumCount ++;
            log.info("real image {}  prediction {} status {}", reLabel.toString(), peLabel.toString(), equals);
        }
    }
    iterator.reset();
    System.out.println("validate result : sum count =" + sumCount + " correct count=" + correctCount );
}

}

Error:
Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidConfigException: DenseLayer (index=9, name=ffn0) nIn=0, nOut=3072; nIn and nOut must be > 0
at org.deeplearning4j.nn.conf.layers.LayerValidation.assertNInNOutSet(LayerValidation.java:55)
at org.deeplearning4j.nn.conf.layers.DenseLayer.instantiate(DenseLayer.java:58)
at org.deeplearning4j.nn.conf.graph.LayerVertex.instantiate(LayerVertex.java:106)
at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:581)
at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:442)
at org.mishaneyt.dlrecognizer.Main.createModel(Main.java:125)
at org.mishaneyt.dlrecognizer.Main.main(Main.java:55)

@whaile-off You need to add setInputTypes to your configuration. Plenty of examples here:
https://github.com/search?q=repo%3Adeeplearning4j%2Fdeeplearning4j-examples%20%20setInputType&type=code

You’ll probably want InputType.convolutional

This will automatically set the number of inputs for each layer based on the number of outputs of the previous layer.
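A minimal sketch of the idea (the layer names and sizes below are illustrative, not taken from your code):

    ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
            .graphBuilder()
            .addInputs("in")
            .addLayer("conv", new ConvolutionLayer.Builder(3, 3).nOut(16)
                    .activation(Activation.RELU).build(), "in")
            .addLayer("dense", new DenseLayer.Builder().nOut(64).build(), "conv") // nIn inferred
            .addLayer("out", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nOut(10).activation(Activation.SOFTMAX).build(), "dense")
            .setOutputs("out")
            .setInputTypes(InputType.convolutional(50, 130, 3)) // height, width, channels
            .build();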

I have InputType.convolutional in my code

@whaile-off can you show your code where that is? I don't see it anywhere.

On line 87 I have ".setInputTypes(InputType.convolutional(50, 130, 3))"

Please help me, I do not know what to do.

@whaile-off Could you print model.summary() for me? There’s no reason why your nIn should be 0. For readability, try to keep your setInputTypes at the end.

As for you getting help, you’ll get it when I have time. It’s the weekend and you’re not paying for anything. If you want an SLA and guaranteed help, please see the paid offering: Guarantee the success of your AI deployment with Deeplearning4J

Where and how do I use model.summary()?

@whaile-off hmm… can you try doing it before init? If that throws an error, then I'll need another way of seeing what it thinks is zero…
I know it’s here:

    .addLayer("ffn0",  new DenseLayer.Builder().nOut(3072)
                    .build(), "maxpool4")
            .addLayer("ffn1",  new DenseLayer.Builder().nOut(3072)
                    .build(), "ffn0"

based on the node name. I’m not sure what comes before it though. Some value there is not being set. I’d need to know what that is.

Just in case, can you tell me what version you’re using? Is it M2.1?

I called model.summary(); here is the console output:

[main] INFO org.nd4j.linalg.factory.Nd4jBackend - Loaded [CpuBackend] backend
[main] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 4
[main] INFO org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory - Binary level Generic x86 optimization level AVX512
[main] INFO org.nd4j.nativeblas.Nd4jBlas - Number of threads used for OpenMP BLAS: 4
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Backend used: [CPU]; OS: [Windows 11]
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Cores: [8]; Memory: [1,9GB];
[main] INFO org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner - Blas vendor: [OPENBLAS]
[main] INFO org.nd4j.linalg.cpu.nativecpu.CpuBackend - Backend build information:
GCC: "12.1.0"
STD version: 201103L
DEFAULT_ENGINE: samediff::ENGINE_CPU
HAVE_FLATBUFFERS
HAVE_OPENBLAS
[main] WARN org.deeplearning4j.nn.conf.inputs.InputType - Assigning height of 0. Normally this is not valid. Exceptions for this are generally relatedto model import and unknown dimensions
[main] WARN org.deeplearning4j.nn.conf.inputs.InputType - Assigning a size of zero. This is normally only valid in model import cases with unknown dimensions.
Exception in thread "main" java.lang.NullPointerException: Cannot load from object array because "this.vertices" is null
at org.deeplearning4j.nn.graph.ComputationGraph.summary(ComputationGraph.java:4389)
at org.deeplearning4j.nn.graph.ComputationGraph.summary(ComputationGraph.java:4348)
at org.mishaneyt.dlrecognizer.Main.createModel(Main.java:125)
at org.mishaneyt.dlrecognizer.Main.main(Main.java:55)

and yes it is M2.1

@whaile-off Ensure you change your padding:

     ComputationGraphConfiguration config = new NeuralNetConfiguration.Builder()
                .seed(123)
                .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer)
                .l2(1e-3)
                .updater(new Adam(1e-3))
                .weightInit( WeightInit.XAVIER_UNIFORM)
                .graphBuilder()
                .addInputs("trainFeatures")
                .setOutputs("out1", "out2", "out3", "out4", "out5", "out6")
                .addLayer("cnn1",  new ConvolutionLayer.Builder(new int[]{5, 5}, new int[]{1, 1}, new int[]{1, 1})
                        .nIn(1).nOut(48).activation( Activation.RELU).build(), "trainFeatures")
                .addLayer("maxpool1",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{1, 1})
                        .build(), "cnn1")
                .addLayer("cnn2",  new ConvolutionLayer.Builder(new int[]{5, 5}, new int[]{1, 1}, new int[]{1, 1})
                        .nOut(64).activation( Activation.RELU).build(), "maxpool1")
                .addLayer("maxpool2",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,1}, new int[]{2, 1}, new int[]{1, 1})
                        .build(), "cnn2")
                .addLayer("cnn3",  new ConvolutionLayer.Builder(new int[]{3, 3}, new int[]{1, 1}, new int[]{1, 1})
                        .nOut(128).activation( Activation.RELU).build(), "maxpool2")
                .addLayer("maxpool3",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{1, 1})
                        .build(), "cnn3")
                .addLayer("cnn4",  new ConvolutionLayer.Builder(new int[]{4, 4}, new int[]{1, 1}, new int[]{0, 0})
                        .nOut(256).activation( Activation.RELU).build(), "maxpool3")
                .addLayer("maxpool4",  new SubsamplingLayer.Builder(PoolingType.MAX, new int[]{2,2}, new int[]{2, 2}, new int[]{1, 1})
                        .build(), "cnn4")
                .addLayer("ffn0",  new DenseLayer.Builder().nOut(3072)
                        .build(), "maxpool4")
                .addLayer("ffn1",  new DenseLayer.Builder().nOut(3072)
                        .build(), "ffn0")
                .addLayer("out1", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out2", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out3", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out4", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out5", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .addLayer("out6", new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(10).activation(Activation.SOFTMAX).build(), "ffn1")
                .setInputTypes(InputType.convolutional(50, 130, 3))

                .build();

        // Construct and initialize model
        ComputationGraph model = new ComputationGraph(config);
        model.init();
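Once that builds, model.summary() can be called on the initialized graph right after the init() above to double-check the per-layer shapes; the earlier NullPointerException came from calling it before init(), when the graph's internal vertices are still null. A minimal continuation:

    log.info(model.summary()); // one row per vertex with its nIn/nOut and parameter counts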

The configuration just had invalid convolution values. If you don't know how to set up a CNN, I would not start with such a complex network. I don't know what task you're doing, but a six-output network seems a bit much, and I'm not even sure why you'd do that. If you're trying to build a classifier, maybe start with a simple MultiLayerNetwork so you can track the inputs and outputs instead?

When building a network, you should still track what the size at each layer will be.
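As a quick illustration of that kind of tracking, here is a plain-Java sketch (not a DL4J API) that walks the height dimension of the original zero-padding configuration through the standard truncate-mode formula out = floor((in + 2*pad - kernel) / stride) + 1. It collapses to 0 at cnn4, which is exactly why ffn0 ended up with nIn = 0:

    // Plain-Java sketch, not DL4J API: track the spatial size through the stack.
    public class ConvSizeCheck {
        static int out(int in, int kernel, int stride, int pad) {
            return (in + 2 * pad - kernel) / stride + 1; // integer division acts as floor here
        }

        public static void main(String[] args) {
            int h = 50;          // input height from InputType.convolutional(50, 130, 3)
            h = out(h, 5, 1, 0); // cnn1     -> 46
            h = out(h, 2, 2, 0); // maxpool1 -> 23
            h = out(h, 5, 1, 0); // cnn2     -> 19
            h = out(h, 2, 2, 0); // maxpool2 -> 9  (height kernel/stride are 2)
            h = out(h, 3, 1, 0); // cnn3     -> 7
            h = out(h, 2, 2, 0); // maxpool3 -> 3
            h = out(h, 4, 1, 0); // cnn4     -> 0, so the dense layer sees nIn = 0
            System.out.println("height entering ffn0: " + h);
        }
    }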

I need to write a neural network to recognize any text from a picture. I used the code that you gave me and everything worked, but now I have another error:

[main] INFO org.deeplearning4j.nn.graph.ComputationGraph - Starting ComputationGraph with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
[main] INFO org.mishaneyt.dlrecognizer.Main - Epoch started
[main] INFO org.deeplearning4j.optimize.listeners.ScoreIterationListener - Score at iteration 0 is NaN
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Starting evaluation nr. 1
[AMDSI prefetch thread] ERROR org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader - the next function shows error
java.lang.IllegalStateException: Cannot merge MultiDataSets with different number of output arrays: toMerge[0] has 9 output arrays; toMerge[1] has 14 arrays
at org.nd4j.linalg.dataset.MultiDataSet.merge(MultiDataSet.java:474)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.convertDataSet(MulRecordDataLoader.java:122)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.next(MulRecordDataLoader.java:127)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:32)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:72)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:12)
at org.nd4j.linalg.dataset.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:354)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Reporting evaluation results:
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Evaluation:
Evaluation: No data available (no evaluation has been performed)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Starting evaluation nr. 2
[AMDSI prefetch thread] ERROR org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader - the next function shows error
java.lang.IllegalStateException: Cannot merge MultiDataSets with different number of output arrays: toMerge[0] has 9 output arrays; toMerge[1] has 14 arrays
at org.nd4j.linalg.dataset.MultiDataSet.merge(MultiDataSet.java:474)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.convertDataSet(MulRecordDataLoader.java:122)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.next(MulRecordDataLoader.java:127)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:32)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:72)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:12)
at org.nd4j.linalg.dataset.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:354)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Reporting evaluation results:
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Evaluation:
Evaluation: No data available (no evaluation has been performed)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Starting evaluation nr. 3
[AMDSI prefetch thread] ERROR org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader - the next function shows error
java.lang.IllegalStateException: Cannot merge MultiDataSets with different number of output arrays: toMerge[0] has 9 output arrays; toMerge[1] has 14 arrays
at org.nd4j.linalg.dataset.MultiDataSet.merge(MultiDataSet.java:474)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.convertDataSet(MulRecordDataLoader.java:122)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.next(MulRecordDataLoader.java:127)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:32)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:72)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:12)
at org.nd4j.linalg.dataset.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:354)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Reporting evaluation results:
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Evaluation:
Evaluation: No data available (no evaluation has been performed)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Starting evaluation nr. 4
[AMDSI prefetch thread] ERROR org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader - the next function shows error
java.lang.IllegalStateException: Cannot merge MultiDataSets with different number of output arrays: toMerge[0] has 14 output arrays; toMerge[1] has 9 arrays
at org.nd4j.linalg.dataset.MultiDataSet.merge(MultiDataSet.java:474)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.convertDataSet(MulRecordDataLoader.java:122)
at org.mishaneyt.dlrecognizer.dataclasses.MulRecordDataLoader.next(MulRecordDataLoader.java:127)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:32)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:72)
at org.mishaneyt.dlrecognizer.dataclasses.MultiRecordDataSetIterator.next(MultiRecordDataSetIterator.java:12)
at org.nd4j.linalg.dataset.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:354)
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Reporting evaluation results:
[main] INFO org.deeplearning4j.optimize.listeners.EvaluativeListener - Evaluation:
Evaluation: No data available (no evaluation has been performed)

Here is another class of my network. To tell the truth, I don't quite understand why it is needed; I took it from the examples on GitHub. I should probably clarify that my dataset consists of pictures named (text in the picture).png

package org.mishaneyt.dlrecognizer.dataclasses;

import org.apache.commons.io.FileUtils;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.transform.ImageTransform;
import org.nd4j.linalg.api.concurrency.AffinityManager;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.MultiDataSet;
import org.nd4j.linalg.factory.Nd4j;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.File;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Description: This is a DataLoader for multi-digit number recognition.
 * The maximum length is 8 digits.
 * If it is less than 8 digits, then zero is added to the end.
 * The images of digits are resized to 60*160,
 * and the black border of the image is cropped.
 * Using the current architecture and hyperparameters, the accuracy of the best model
 * prediction on the validation set is (215-227) / 248 with different epochs;
 * of course, if you're interested, you can continue to optimize.
 *
 * @author WangFeng
 */
public class MulRecordDataLoader extends NativeImageLoader implements Serializable {

    private static final Logger log = LoggerFactory.getLogger(MulRecordDataLoader.class);

    private static int height = 50; //!!!
    private static int width = 130; //!!!
    private static int channels = 3; //!!!
    private File fullDir = null;
    private Iterator<File> fileIterator;
    private int numExample = 0;

    public MulRecordDataLoader(String dataSetType) {
        this(height, width, channels, null, dataSetType);
    }

    public MulRecordDataLoader(ImageTransform imageTransform, String dataSetType) {
        this(height, width, channels, imageTransform, dataSetType);
    }

    public MulRecordDataLoader(int height, int width, int channels, ImageTransform imageTransform, String dataSetType) {
        super(height, width, channels, imageTransform);
        this.height = height;
        this.width = width;
        this.channels = channels;

        try {
            this.fullDir = fullDir != null && fullDir.exists() ? fullDir : new File("C:\\Users\\whail\\IdeaProjects\\HDWcaptchaGenerator\\vk-captcha");
        } catch (Exception e) {
            // log.error("the datasets directory failed, plz checking", e);
            throw new RuntimeException(e);
        }
        this.fullDir = new File(fullDir, dataSetType);

        if (!fullDir.exists()) {
            fullDir.mkdir();
        }
        load();
    }

    protected void load() {
        try {
            // Wrap the collection returned by listFiles so it can be shuffled
            List<File> dataFiles = new ArrayList<>(FileUtils.listFiles(fullDir, new String[]{"png"}, true));
            Collections.shuffle(dataFiles);
            fileIterator = dataFiles.iterator();
            numExample = dataFiles.size();
        } catch (Exception var4) {
            throw new RuntimeException(var4);
        }
    }

    public MultiDataSet convertDataSet(int num) throws Exception {
        int batchNumCount = 0;
        List<MultiDataSet> multiDataSets = new ArrayList<>();

        while (batchNumCount != num && fileIterator.hasNext()) {
            File image = fileIterator.next();
            String imageName = image.getName().substring(0, image.getName().lastIndexOf('.'));

            // Pad the file name with leading zeros if it is shorter than 6 characters
            while (imageName.length() < 6) {
                imageName = "0" + imageName;
            }

            String[] imageNames = imageName.split("");
            INDArray feature = asMatrix(image);
            INDArray[] features = new INDArray[]{feature};

            // Create the labels array dynamically, one entry per character of the file name
            int numDigits = imageNames.length;
            INDArray[] labels = new INDArray[numDigits];

            Nd4j.getAffinityManager().ensureLocation(feature, AffinityManager.Location.DEVICE);

            // Regular expression that matches only digit characters
            Pattern digitPattern = Pattern.compile("\\d");

            for (int i = 0; i < numDigits; i++) {
                Matcher matcher = digitPattern.matcher(imageNames[i]);
                if (matcher.find()) {
                    int digit = Integer.parseInt(matcher.group());
                    labels[i] = Nd4j.zeros(1, 10).putScalar(new int[]{0, digit}, 1);
                } else {
                    // If the character is not a digit, create an all-zero label
                    labels[i] = Nd4j.zeros(1, 10);
                }
            }

            multiDataSets.add(new MultiDataSet(features, labels));
            batchNumCount++;
        }
        return MultiDataSet.merge(multiDataSets);
    }

    public MultiDataSet next(int batchSize) {
        try {
            return convertDataSet(batchSize);
        } catch (Exception e) {
            log.error("the next function shows error", e);
        }
        return null;
    }

    public void reset() {
        load();
    }

    public int totalExamples() {
        return numExample;
    }
}

@whaile-off where did you get the idea to extend the NativeImageLoader exactly? I don’t think I’ve ever seen anyone do that. Usually, people use the ImageRecordReader.
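For reference, the usual single-label image pipeline looks roughly like this; a sketch only, with a hypothetical data directory and class count, and labels taken from the parent folder name (so it does not cover the multi-output captcha setup as-is):

    // Sketch: standard DataVec pipeline for a single-output image classifier.
    ImageRecordReader recordReader = new ImageRecordReader(50, 130, 3, new ParentPathLabelGenerator());
    recordReader.initialize(new FileSplit(new File("path/to/train"),
            NativeImageLoader.ALLOWED_FORMATS, new Random(123)));
    DataSetIterator trainIter = new RecordReaderDataSetIterator(recordReader, batchSize, 1, 10); // 10 = hypothetical class count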

Beyond that, you'll need to make sure everything is the same size. A MultiDataSet holds one ndarray per network input, not just a single minibatch array. Each "input" here corresponds to one of the inputs you declared on your network.

You'll need to make sure that all of the ndarrays that are part of it line up correctly.
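To make that concrete for the merge error above (the 9 vs. 14 label arrays come from building one label per character of the file name): here is an illustration only, not from the thread, of pinning the label count to the network's six outputs so every MultiDataSet merges cleanly. It reuses the INDArray/Nd4j imports already in the loader:

    // Illustration only: fix the number of label arrays to the number of network outputs.
    private static final int NUM_OUTPUTS = 6;   // must match setOutputs("out1", ..., "out6")
    private static final int NUM_CLASSES = 10;  // digits 0-9

    private static INDArray[] buildLabels(String imageName) {
        INDArray[] labels = new INDArray[NUM_OUTPUTS];
        for (int i = 0; i < NUM_OUTPUTS; i++) {
            labels[i] = Nd4j.zeros(1, NUM_CLASSES);
            if (i < imageName.length() && Character.isDigit(imageName.charAt(i))) {
                labels[i].putScalar(new int[]{0, imageName.charAt(i) - '0'}, 1.0);
            }
            // positions past the end of the name stay all-zero; extra characters are dropped
        }
        return labels;
    }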

I didn't quite understand what you said. As I said, most of the code is an example from GitHub. Could you explain more specifically what I need to do, and how, to make everything work? :slight_smile:

Keep in mind that only GitHub - deeplearning4j/deeplearning4j-examples: Deeplearning4j Examples (DL4J, DL4J Spark, DataVec) contains the official examples. If you pick up code from anywhere else, there is a good chance that the example is outdated, wrong, or both.

I used https://github.com/deeplearning4j/deeplearning4j-examples/tree/051c59bd06b38ed39ca92f5940a6ca43b0f34c0f/dl4j-examples/src/main/java/org/deeplearning4j/examples/advanced/modelling/captcharecognition as a basis, but I still don't know how to do what I want. As I said, I need to make a neural network that recognizes any text from a picture and learns from a dataset of 130-by-50 pictures named (text in the picture).png. I will be very happy if you help with this.

@whaile-off try to build a simpler network first. One with multiple inputs does not seem like a very good starting point when you’re just learning.

If you don’t know what I mean by multiple inputs, that’s already going to be a problem if you want to apply anything.

An “input” is something you need to have a concept of if you want to use the more advanced network.

What I mean is you have a graph that looks like this:

input1 → image1 → conv1 -
input2 → image2 → conv2 - merge → output
input3 → image3 → conv3 -
input4 → image4 → conv4 - …

Each of these is what I like to call a "path" through the network. At some point the paths will merge, whether that's at your final outputs followed by some concatenation or otherwise. That is the whole network.

The network you're using would potentially take several images in at once. To work with this, you need to understand that each of these images may have a different shape. A MultiDataSet is one unified object that contains the inputs and labels for each input to the network. Similarly, you'd have one output for each input. Therefore a MultiDataSet underneath would look like:

image1, label for image 1
image2, label for image 2
image3, label for image 3
image4, label for image 4
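In code terms, a sketch of one such object for the graph in this thread (one declared input, six declared outputs), using placeholder arrays:

    // Sketch: one feature array per addInputs(...) entry, one label array per setOutputs(...) entry.
    INDArray image = Nd4j.zeros(1, 3, 50, 130);   // [minibatch, channels, height, width]
    INDArray[] features = new INDArray[]{image};
    INDArray[] labels = new INDArray[6];
    for (int i = 0; i < 6; i++) {
        labels[i] = Nd4j.zeros(1, 10);            // one-hot over 10 classes per output
    }
    MultiDataSet mds = new MultiDataSet(features, labels);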

I understand, but can I order a neural network from you that will be able to read any text from a picture in Russian, English, and numbers?

@whaile-off I’m not sure what you mean by “order”. If you want to buy consulting, please refer to the form I gave you. We typically only work with large teams though.

If you want this done for you, you will not be getting that. This library and the help on this forum are for people willing to learn how to solve the problem themselves with some guidance. My recommendation was for you to work on this one step at a time and to try to break the problem down. If you just want "copy, paste, and go without learning", then this probably isn't the place for you. You might want to use a ready-made command-line tool and library like Tesseract rather than build your own OCR tool.

If you're ever ready to follow my suggestion and try to learn this on your own, I am ready to walk you through some of the basic steps needed. So far I haven't seen that you're willing to follow any of the advice. I am not here to give you quick copy-and-paste help for free.