DCNN Designed for EMNIST-Letters implementation

Hi all,

I’m working on an implementation for EMNIST-Letters. My current model gets about 90% of its predictions right.
I found this paper:

In Section 4.4 I tried to implement this architecture, but it doesn’t work (after 1 epoch):
[screenshot of the architecture table from the paper]

My implementation :

    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder().seed(seed).updater(new RmsProp())
            .weightInit(WeightInit.XAVIER).list()
            // first block
            .layer(new ConvolutionLayer.Builder(3, 3).nIn(channels).nOut(473).build())
            .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.AVG).kernelSize(2, 2).build())
            .layer(new BatchNormalization.Builder().activation(Activation.RELU).dropOut(0.15).build())
            // second block
            .layer(new ConvolutionLayer.Builder(3, 3).nOut(238).build())
            .layer(new BatchNormalization.Builder().activation(Activation.LEAKYRELU).dropOut(0.20).build())
            // third block
            .layer(new ConvolutionLayer.Builder(3, 3).nOut(133).build())
            .layer(new BatchNormalization.Builder().activation(Activation.RELU).dropOut(0.10).build())
            // fourth block
            .layer(new ConvolutionLayer.Builder(3, 3).nOut(387).build())
            .layer(new BatchNormalization.Builder().activation(Activation.THRESHOLDEDRELU).dropOut(0.10).build())
            // fifth block
            .layer(new ConvolutionLayer.Builder(5, 5).nOut(187).build())
            .layer(new BatchNormalization.Builder().activation(Activation.ELU).dropOut(0.50).build())
            // first fully connected layer
            .layer(new DenseLayer.Builder().activation(Activation.RELU).nOut(313).build())
            .layer(new BatchNormalization.Builder().dropOut(0.20).build())
            // second fully connected layer
            //.layer(new DenseLayer.Builder().activation(Activation.ELU).nOut(252).build())
            .layer(new DenseLayer.Builder().activation(Activation.SIGMOID).nOut(252).build())
            .layer(new BatchNormalization.Builder().dropOut(0.20).build())
            // output layer
            .layer(new OutputLayer.Builder().nOut(outputNum).lossFunction(new LossNegativeLogLikelihood(weights))
                    .activation(Activation.SOFTMAX).build())
            .setInputType(InputType.convolutionalFlat(height, width, channels)).build();

I use LossNegativeLogLikelihood because I need to apply class weights for my unbalanced data.
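For reference, this is roughly how I build the weights array (just a sketch; the actual values come from my class counts, one entry per class):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.lossfunctions.impl.LossNegativeLogLikelihood;

// One weight per class (26 for EMNIST-Letters), e.g. derived from inverse class frequency.
// The values below are placeholders.
INDArray weights = Nd4j.create(new double[]{1.2, 0.8, 1.0, /* ... */ 0.9});
LossNegativeLogLikelihood lossFn = new LossNegativeLogLikelihood(weights);
```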
Any ideas?

Sorry, I don’t understand how to format code :confused:

You do it like this:

```java
java code here
```

Anyway, I see that you are using just the no-arg constructor of the RmsProp updater, so you are using the default learning rate of 0.1 instead of the learning rate the paper uses, which is 0.0001.
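Something like this should do it (a sketch against your existing config; the `RmsProp(double)` constructor just sets the learning rate):

```java
import org.nd4j.linalg.learning.config.RmsProp;

// Set the learning rate explicitly instead of relying on the 0.1 default.
RmsProp updater = new RmsProp(0.0001); // 1e-4, as in the paper

// then in your builder:
// new NeuralNetConfiguration.Builder().seed(seed).updater(updater)...
```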


It works better indeed! You’re the best :slight_smile: