Tuning Image Classifier

Hi everyone,

With the help of @treo and various others in this community, I’ve been able to get my CNN up and running very quickly! I was just wondering if there’s anything I can change to increase the model’s accuracy. I’m currently training on the following data:

  • 180 classes (one per bird species)
  • ~25,000 training images (224x224, 3 channels)
  • 900 testing images
  • 900 validation images

I’m using a CNN with the following characteristics (a rough configuration sketch follows the list):

7 layers (incl. output layer):

  • Convolutional (1,1 stride, weightInit Xavier, activation ReLU)
  • Subsampling layer (2,2 stride, pooling type MAX)
  • Convolutional (1,1 stride, weightInit Xavier, activation ReLU)
  • Subsampling layer (2,2 stride, pooling type MAX)
  • Convolutional (1,1 stride, weightInit Xavier, activation ReLU)
  • Subsampling layer (2,2 stride, pooling type MAX)
  • Output layer (loss function negative log-likelihood, activation softmax, weightInit Xavier)
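Roughly, this maps to the configuration below. The kernel sizes, channel counts, and seed in this sketch are just placeholders (the exact values are in the linked repo):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class BirdCnnConfig {

    public static MultiLayerConfiguration build() {
        return new NeuralNetConfiguration.Builder()
                .seed(123)                                  // placeholder seed
                .weightInit(WeightInit.XAVIER)
                .updater(new Nesterovs(0.001, 0.9))         // SGD, learning rate 0.001, momentum 0.9
                .l2(0.004)
                .list()
                // conv -> max pool, repeated three times
                .layer(new ConvolutionLayer.Builder(3, 3)   // kernel size is a placeholder
                        .stride(1, 1).nIn(3).nOut(32)       // channel counts are placeholders
                        .activation(Activation.RELU).build())
                .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2).stride(2, 2).build())
                .layer(new ConvolutionLayer.Builder(3, 3)
                        .stride(1, 1).nOut(64)
                        .activation(Activation.RELU).build())
                .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2).stride(2, 2).build())
                .layer(new ConvolutionLayer.Builder(3, 3)
                        .stride(1, 1).nOut(128)
                        .activation(Activation.RELU).build())
                .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2).stride(2, 2).build())
                // softmax output over the 180 bird classes
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(180).activation(Activation.SOFTMAX).build())
                .setInputType(InputType.convolutional(224, 224, 3))
                .build();
    }
}
```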

Misc parameters:

  • Optimization using stochastic gradient descent
  • L2 regularization of 0.004
  • Learning rate of 0.001
  • Momentum of 0.9
  • Early stopping trainer terminating after 50 epochs or 60 minutes, evaluating on the validation set every 2 epochs (see the sketch below)
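The early-stopping part looks roughly like this sketch; the method name, iterators, and model save directory are placeholders, not the exact code from the repo:

```java
import java.util.concurrent.TimeUnit;

import org.deeplearning4j.earlystopping.EarlyStoppingConfiguration;
import org.deeplearning4j.earlystopping.EarlyStoppingResult;
import org.deeplearning4j.earlystopping.saver.LocalFileModelSaver;
import org.deeplearning4j.earlystopping.scorecalc.DataSetLossCalculator;
import org.deeplearning4j.earlystopping.termination.MaxEpochsTerminationCondition;
import org.deeplearning4j.earlystopping.termination.MaxTimeIterationTerminationCondition;
import org.deeplearning4j.earlystopping.trainer.EarlyStoppingTrainer;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class BirdCnnTraining {

    public static MultiLayerNetwork train(MultiLayerConfiguration conf,
                                          DataSetIterator trainIter,
                                          DataSetIterator validIter,
                                          String modelSaveDir) {
        EarlyStoppingConfiguration<MultiLayerNetwork> esConf =
                new EarlyStoppingConfiguration.Builder<MultiLayerNetwork>()
                        // stop after 50 epochs or 60 minutes, whichever comes first
                        .epochTerminationConditions(new MaxEpochsTerminationCondition(50))
                        .iterationTerminationConditions(
                                new MaxTimeIterationTerminationCondition(60, TimeUnit.MINUTES))
                        // score the model on the validation set every 2 epochs
                        .scoreCalculator(new DataSetLossCalculator(validIter, true))
                        .evaluateEveryNEpochs(2)
                        // keep the best model seen so far on disk
                        .modelSaver(new LocalFileModelSaver(modelSaveDir))
                        .build();

        EarlyStoppingTrainer trainer = new EarlyStoppingTrainer(esConf, conf, trainIter);
        EarlyStoppingResult<MultiLayerNetwork> result = trainer.fit();
        return result.getBestModel();
    }
}
```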

With this setup, I’m currently getting approximately 45-55% accuracy on the test set after training the model with the EarlyStoppingTrainer. It might also be important to note that I’m running this network using CUDA 10.2 with a GTX 1080.

The full source code is uploaded to GitHub here.

Thanks for your help!

The dataset you are working with is significantly more complex than MNIST, so I’d start with an architecture that is better suited to the task.

Take a look at the LeNet-based model in the Animal Classification example, or the architecture from the CIFAR example.
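For reference, the rough shape of those LeNet-style networks is conv → pool → conv → pool → dense → softmax, with a fully connected layer between the convolutional stack and the output. This is only a sketch with made-up layer sizes, not the actual example code:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class LeNetStyleSketch {

    public static MultiLayerConfiguration build(int numClasses) {
        return new NeuralNetConfiguration.Builder()
                .weightInit(WeightInit.XAVIER)
                .updater(new Nesterovs(1e-3, 0.9))          // placeholder hyperparameters
                .list()
                .layer(new ConvolutionLayer.Builder(5, 5)   // LeNet-style 5x5 kernels
                        .stride(1, 1).nIn(3).nOut(20)       // placeholder channel counts
                        .activation(Activation.RELU).build())
                .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2).stride(2, 2).build())
                .layer(new ConvolutionLayer.Builder(5, 5)
                        .stride(1, 1).nOut(50)
                        .activation(Activation.RELU).build())
                .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                        .kernelSize(2, 2).stride(2, 2).build())
                // fully connected layer between the conv stack and the output
                .layer(new DenseLayer.Builder().nOut(500)
                        .activation(Activation.RELU).build())
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nOut(numClasses).activation(Activation.SOFTMAX).build())
                .setInputType(InputType.convolutional(224, 224, 3))
                .build();
    }
}
```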


I’ve implemented an architecture similar to the one from the CIFAR animal classification example and I’m getting much better results now. Thank you for your help!