Regression to find the center of a rectangle/oval


I am trying to predict the center of a black rectangle/oval in an image.
My training data contains 10000 randomly generated 100x100 images, each with either a rectangle or an oval of random size somewhere in the image.

The problem is that the network seems to have trouble finding the optimal solution, and the score sometimes spikes during training (learning rate too high?).

My Code:

public static void main(String[] args) throws Exception {
        int height = 100;    // height of the picture in px
        int width = 100;     // width of the picture in px
        int channels = 1;   // single channel for grayscale images
        int outputNum = 2; 
        int batchSize = 54; // number of samples that will be propagated through the network in each iteration
        int nEpochs = 1;    // number of training epochs

        int seed = 1234;    // number used to initialize a pseudorandom number generator.
        Random randNumGen = new Random(seed);

        File trainData = new File("D:\\Deeplearning4j\\squares\\pics");
        FileSplit trainSplit = new FileSplit(trainData, NativeImageLoader.ALLOWED_FORMATS, randNumGen);
        MySquareMultiLabelGen labelMaker = new MySquareMultiLabelGen(); // custom label generator supplying the two regression targets
        ImageRecordReader trainRR = new ImageRecordReader(height, width, channels, labelMaker);
        DataSetIterator trainIter = new RecordReaderDataSetIterator.Builder(trainRR, batchSize)

        DataNormalization imageScaler = new ImagePreProcessingScaler();

        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .l2(0.0005) // ridge regression value
            .updater(new AdaDelta())
            .layer(new ConvolutionLayer.Builder(5, 5)
                .stride(1, 1)
            .layer(new DenseLayer.Builder().activation(Activation.RELU)
            .layer(new DenseLayer.Builder().activation(Activation.RELU)
            .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
            .setInputType(InputType.convolutional(height, width, channels)) // InputType.convolutional for normal image

        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.setListeners(new ScoreIterationListener(10));

        System.out.println("Total num of params: " + net.numParams());

What I tried so far:

  • normalizing / not normalizing the labels
  • normalizing / not normalizing the inputs
  • decreasing/increasing the complexity of the neural network
  • different updaters
  • more/fewer epochs for training
  • increasing/decreasing the batch size

Edit: made the code readable

Your labels should be between 0 and 1; that way the result is relative to your input size.
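As a rough illustration of that label scaling, here is a minimal sketch in plain Java. The class and method names (CenterLabelScaler, toRelative, toPixels) are made up for this example and are not part of Deeplearning4j or of the code in the question; it only assumes the label is the (x, y) pixel position of the shape's center.

```java
/** Hypothetical helper: scales pixel-space centers into [0, 1] and back. */
final class CenterLabelScaler {
    private CenterLabelScaler() {}

    /** Maps a pixel center (x, y) to fractions of the image width and height. */
    static double[] toRelative(double xPx, double yPx, int width, int height) {
        return new double[] { xPx / width, yPx / height };
    }

    /** Maps a relative prediction back to pixel coordinates. */
    static double[] toPixels(double xRel, double yRel, int width, int height) {
        return new double[] { xRel * width, yRel * height };
    }

    public static void main(String[] args) {
        int width = 100;
        int height = 100;

        // A center at pixel (25, 80) becomes a label in [0, 1] x [0, 1].
        double[] label = toRelative(25.0, 80.0, width, height);
        System.out.println("label: " + label[0] + ", " + label[1]);

        // A network output in relative units is converted back for display.
        double[] pixels = toPixels(label[0], label[1], width, height);
        System.out.println("pixels: " + pixels[0] + ", " + pixels[1]);
    }
}
```

With labels in this form, the same trained network output can be mapped onto any image size by multiplying with that image's width and height.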

The network architecture, however, looks very suspicious. Because you have only a single 5x5 convolution, it can at best find edges, background, and foreground. Take a look at how the more complex models in the model zoo are built, and maybe read up on how CNNs are usually structured (especially the progression of layers).
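One way to put a number on this is the receptive field: how many input pixels a single unit of the last convolutional feature map can see. The sketch below is plain Java, not Deeplearning4j code, and the stacked layout in it (conv 5x5, pool 2x2, conv 5x5, pool 2x2, conv 3x3) is only an assumed example of a typical progression, not a recommendation taken from the model zoo.

```java
/** Hypothetical helper: receptive field of a stack of conv/pooling layers. */
final class ReceptiveField {
    private ReceptiveField() {}

    /**
     * Each layer is given as {kernelSize, stride}. Returns the side length,
     * in input pixels, of the patch seen by one unit of the last layer.
     */
    static int receptiveField(int[][] layers) {
        int field = 1; // a raw input pixel sees only itself
        int jump = 1;  // distance in input pixels between neighbouring units
        for (int[] layer : layers) {
            int kernel = layer[0];
            int stride = layer[1];
            field += (kernel - 1) * jump;
            jump *= stride;
        }
        return field;
    }

    public static void main(String[] args) {
        // The network from the question: one 5x5 convolution with stride 1.
        int[][] single = { {5, 1} };
        System.out.println(receptiveField(single));  // prints 5

        // Assumed example stack: conv 5x5, pool 2x2/2, conv 5x5, pool 2x2/2, conv 3x3.
        int[][] stacked = { {5, 1}, {2, 2}, {5, 1}, {2, 2}, {3, 1} };
        System.out.println(receptiveField(stacked)); // prints 24
    }
}
```

A unit that sees only a 5x5 patch of a 100x100 image can respond to local structure such as an edge, while deeper stacks let later layers respond to larger parts of the shape.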
