4 classes were never predicted by the model and were excluded from average precision

xianyu1120 · April 9, 2020, 4:57am

When I try to train a multi-class image classifier, the training process always reports "4 classes were never predicted by the model and were excluded from average precision". Please help me.

    public class ModelCreate {
        private static final Logger log= LoggerFactory.getLogger(ModelCreate.class);

        // Allowed image formats are given by allowedExtensions
        private static final String [] allowedExtensions = BaseImageLoader.ALLOWED_FORMATS;

        private static final int nChannels=3;// number of input channels; 3 for color images
        private static final int width=227;// input width
        private static final int height=227;// input height
        private static final int batchSize=64;// batch size
        private static final int nEpochs=300;// number of training epochs
        private static int numLabels;// number of labels (classes)

        private static final int seed=123456;

        private static final Random rng=new Random(seed);// seeded random number generator

        public static void main(String[] args) throws Exception {

            /**
             * Load the image data
             */

            log.info("Loading image files");
            File mainPath = new File("D:\\testdata");// local dataset path

            // Collect the files with an allowed extension from the subdirectories of the parent directory;
            // the random number generator keeps the later train/test split reproducible
            FileSplit filesInDir =  new FileSplit(mainPath,allowedExtensions,rng);

            System.out.println("Count: "+ filesInDir.length());
            // Parse the parent directory and use the subdirectory names as labels/class names
            ParentPathLabelGenerator labelMaker =  new ParentPathLabelGenerator();

            // Random sampling
            RandomPathFilter randomPathFilter=new RandomPathFilter(rng);
            // Split the image files into training and test sets: 90% training, 10% test
            InputSplit [] filesInDirSplit = filesInDir.sample(randomPathFilter,90,10);
            InputSplit trainData = filesInDirSplit [0];// training set
            InputSplit testData = filesInDirSplit [1]; // test set

            log.info("trainData URI String Length={},testData URI String Length={}", trainData.length(), testData.length());

            log.info("Starting data augmentation");

            // Data normalization
            DataNormalization scaler=new ImagePreProcessingScaler(0,1);

            // Create and initialize the ImageRecordReader; the given height and width resize every image in the dataset
            ImageRecordReader recordReader=new ImageRecordReader(height,width,nChannels,labelMaker);

            // Initialize the record reader with the training data
            recordReader.initialize(trainData);// original training set

            numLabels=recordReader.numLabels();
            System.out.println(numLabels);
            // Build the training iterator
            DataSetIterator trainIter=new RecordReaderDataSetIterator(recordReader,batchSize,1,numLabels);// original

            scaler.fit(trainIter);// fit the normalizer
            trainIter.setPreProcessor(scaler);

            System.out.println("Reset supported: "+trainIter.resetSupported());

            //test iterator
            ImageRecordReader testrr=new ImageRecordReader(height,width,nChannels,labelMaker);
            testrr.initialize(testData);

            DataSetIterator testIter=new RecordReaderDataSetIterator(testrr,batchSize,1,numLabels);
            scaler.fit(testIter);
            testIter.setPreProcessor(scaler);


            log.info("Build model");
            MultiLayerNetwork network=alexnenModel();
    //        MultiLayerNetwork network=lenetModel();
            network.init();

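            // Initialize the UI backend
            VertxUIServer uiServer = VertxUIServer.getInstance();
            uiServer.start();
            // Configure where network information (gradients, score, etc. over time) is stored.
            // The commented-out line below would keep it in memory.
           // StatsStorage statsStorage=new InMemoryStatsStorage();
            // Alternatively use new FileStatsStorage(File) so the stats can be saved and loaded later
            StatsStorage statsStorage = new FileStatsStorage(new File("ui/uilenetgpu2000409alex.dl4j"));
            // Attach the StatsStorage instance to the UI so its contents can be visualized
            uiServer.attach(statsStorage);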

            log.info("Train model ......");

            // Add listeners
            network.setListeners(new StatsListener( statsStorage), new ScoreIterationListener(1), new EvaluativeListener(testIter, 1, InvocationType.EPOCH_END));

            network.fit(trainIter, nEpochs);

            // Option 1
    //        for (int i = 0; i < nEpochs; i++) {
    //            network.fit(trainIter);
    //            log.info("Completed epoch " + i);
    //            Evaluation trainEval = network.evaluate(trainIter);
    //            Evaluation eval = network.evaluate(testIter);
    //            log.info("train: " + trainEval.precision());
    //            log.info("val: " + eval.precision());
    //            trainIter.reset();
    //            testIter.reset();
    //        }

            // Option 2
            // Evaluate the model
            log.info("Evaluate model....");
            Evaluation eval = network.evaluate(testIter);
            log.info(eval.stats(true));

            // Take the first example from the test set and run a prediction on it
            testIter.reset();
            DataSet testDataSet = testIter.next();
            List<String> allClassLabels = recordReader.getLabels();
            int labelIndex = testDataSet.getLabels().argMax(1).getInt(0);
            int[] predictedClasses = network.predict(testDataSet.getFeatures());
            String expectedResult = allClassLabels.get(labelIndex);
            String modelPrediction = allClassLabels.get(predictedClasses[0]);
            System.out.print("\nFor a single example that is labeled " + expectedResult + " the model predicted " + modelPrediction + "\n\n");


            log.info("Save model....");
            ModelSerializer.writeModel(network,new File("model/LenetNet.zip"),true,scaler);

            log.info("****************Example finished********************");
        }



        /**
         * Build the neural network
         */
        private static ConvolutionLayer convInit(String name, int in, int out, int[] kernel, int[] stride, int[] pad, double bias) {
            return new ConvolutionLayer.Builder(kernel, stride, pad).name(name).nIn(in).nOut(out).biasInit(bias).build();
        }

        private static ConvolutionLayer conv3x3(String name, int out, double bias) {
            return new ConvolutionLayer.Builder(new int[]{3,3}, new int[] {1,1}, new int[] {1,1}).name(name).nOut(out).biasInit(bias).build();
        }
        private static ConvolutionLayer conv5x5(String name, int out, int[] stride, int[] pad, double bias) {
            return new ConvolutionLayer.Builder(new int[]{5,5}, stride, pad).name(name).nOut(out).biasInit(bias).build();
        }

        private static SubsamplingLayer maxPool(String name, int[] kernel) {
            return new SubsamplingLayer.Builder(kernel, new int[]{2,2}).name(name).build();
        }

        private static DenseLayer fullyConnected(String name, int out, double bias, double dropOut, Distribution dist) {
            return new DenseLayer.Builder()
                    .name(name)
                    .nOut(out)
                    .biasInit(bias)
                    .dropOut(dropOut)
                    .weightInit(new WeightInitDistribution(dist))
                    .build();
        }
        private static MultiLayerNetwork alexnenModel() {

            double nonZeroBias = 1;
            double dropOut = 0.8;// 80% retain probability
            log.info("Build model......");
            MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                    .seed(seed)
                    .weightInit(new NormalDistribution(0.0, 0.01))
                    .activation(Activation.RELU)
                    .updater(new Nesterovs(new StepSchedule(ScheduleType.ITERATION, 0.1,0.1, 10000), 0.9))// Nesterov momentum; previously decayed to 0.01 after 1000 iterations, changed to start at 0.1 with a step of 10000
                    .biasUpdater(new Nesterovs(new StepSchedule(ScheduleType.ITERATION, 0.2, 0.1, 10000),0.9))

                    .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer) // normalize gradients to guard against vanishing or exploding gradients
                    .l2(5*1e-4)
                    .list()
                    .layer(convInit("cnn1", nChannels, 96, new int[]{11, 11}, new int[]{4, 4}, new int[]{3, 3}, 0))
                    .layer(new LocalResponseNormalization.Builder().name("lrn1").build())
                    .layer(maxPool("maxpool1", new int[]{3,3}))
                    .layer(conv5x5("cnn2", 256, new int[] {1,1}, new int[] {2,2}, nonZeroBias))
                    .layer(new LocalResponseNormalization.Builder().name("lrn2").build())
                    .layer(maxPool("maxpool2", new int[]{3,3}))
                    .layer(conv3x3("cnn3", 384, 0))
                    .layer(conv3x3("cnn4", 384, nonZeroBias))
                    .layer(conv3x3("cnn5", 256, nonZeroBias))
                    .layer(maxPool("maxpool3", new int[]{3,3}))
                    .layer(fullyConnected("ffn1", 4096, nonZeroBias, dropOut, new NormalDistribution(0, 0.005)))
                    .layer(fullyConnected("ffn2", 4096, nonZeroBias, dropOut, new NormalDistribution(0, 0.005)))
                    .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                            .name("output")
                            .nOut(numLabels)
                            .activation(Activation.SOFTMAX)
                            .build())
                    .setInputType(InputType.convolutional(height, width, nChannels))
                    .build();

            return new MultiLayerNetwork(conf);

        }
        private static MultiLayerNetwork lenetModel() {
            /*
             * Revised LeNet model approach developed by ramgo2; achieves slightly above random
             * Reference: https://gist.github.com/ramgo2/833f12e92359a2da9e5c2fb6333351c5
             */
            MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                    .seed(seed)
                    .l2(0.005)
                    .activation(Activation.RELU)
                    .weightInit(WeightInit.XAVIER)
                    .updater(new AdaDelta())
                    .list()
                    .layer(0, convInit("cnn1", nChannels, 50 ,  new int[]{5, 5}, new int[]{1, 1}, new int[]{0, 0}, 0))
                    .layer(1, maxPool("maxpool1", new int[]{2,2}))
                    .layer(2, conv5x5("cnn2", 100, new int[]{5, 5}, new int[]{1, 1}, 0))
                    .layer(3, maxPool("maxpool2", new int[]{2,2}))
                    .layer(4, new DenseLayer.Builder().nOut(500).build())
                    .layer(5, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                            .nOut(numLabels)
                            .activation(Activation.SOFTMAX)
                            .build())
                    .setInputType(InputType.convolutional(height, width, nChannels))
                    .build();

            return new MultiLayerNetwork(conf);

        }
    }

xianyu1120 · April 9, 2020, 4:59am

Output during training:

    ========================Evaluation Metrics========================
     # of classes:    5
     Accuracy:        0.3717
     Precision:       0.3717	(4 classes excluded from average)
     Recall:          0.2000
     F1 Score:        0.5419	(4 classes excluded from average)
    Precision, recall & F1: macro-averaged (equally weighted avg. of 5 classes)

    Warning: 4 classes were never predicted by the model and were excluded from average precision
    Classes excluded from average precision: [0, 1, 2, 3]

    =========================Confusion Matrix=========================
      0  1  2  3  4
    ----------------
      0  0  0  0 12 | 0 = 0
      0  0  0  0 20 | 1 = 1
      0  0  0  0 21 | 2 = 2
      0  0  0  0 18 | 3 = 3
      0  0  0  0 42 | 4 = 4

    Confusion matrix format: Actual (rowClass) predicted as (columnClass) N times
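
For readers following along, the figures in this report can be reproduced from the confusion matrix alone. The sketch below is plain Java, not DL4J code: the class name `ConfusionMatrixMetrics` and its methods are made up for illustration, and it assumes the usual definitions of accuracy, per-class precision and recall, and a macro average that skips classes whose value is undefined (which is what the "excluded from average" annotations suggest).

```java
import java.util.Locale;

// Hypothetical helper, not part of DL4J: recomputes the report's metrics by hand.
public class ConfusionMatrixMetrics {

    /** Accuracy: sum of the diagonal divided by the total number of examples. */
    public static double accuracy(int[][] cm) {
        int correct = 0, total = 0;
        for (int i = 0; i < cm.length; i++) {
            for (int j = 0; j < cm[i].length; j++) {
                total += cm[i][j];
                if (i == j) correct += cm[i][j];
            }
        }
        return (double) correct / total;
    }

    /** Recall of class c: correct predictions of c over the actual count of c (row sum). */
    public static double recall(int[][] cm, int c) {
        int rowSum = 0;
        for (int v : cm[c]) rowSum += v;
        return rowSum == 0 ? Double.NaN : (double) cm[c][c] / rowSum;
    }

    /** Precision of class c: correct predictions of c over how often c was predicted (column sum). */
    public static double precision(int[][] cm, int c) {
        int colSum = 0;
        for (int[] row : cm) colSum += row[c];
        return colSum == 0 ? Double.NaN : (double) cm[c][c] / colSum; // NaN: class never predicted
    }

    /** Macro average that skips undefined (NaN) per-class values. */
    public static double macroAverage(double[] perClass) {
        double sum = 0;
        int n = 0;
        for (double v : perClass) {
            if (!Double.isNaN(v)) { sum += v; n++; }
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    public static void main(String[] args) {
        // Rows = actual class, columns = predicted class (copied from the report above).
        int[][] cm = {
            {0, 0, 0, 0, 12},
            {0, 0, 0, 0, 20},
            {0, 0, 0, 0, 21},
            {0, 0, 0, 0, 18},
            {0, 0, 0, 0, 42}
        };
        int k = cm.length;
        double[] p = new double[k], r = new double[k], f1 = new double[k];
        for (int c = 0; c < k; c++) {
            p[c] = precision(cm, c);
            r[c] = recall(cm, c);
            f1[c] = 2 * p[c] * r[c] / (p[c] + r[c]); // stays NaN when precision is undefined
        }
        System.out.println(String.format(Locale.ROOT, "Accuracy:  %.4f", accuracy(cm)));      // 0.3717
        System.out.println(String.format(Locale.ROOT, "Precision: %.4f", macroAverage(p)));   // 0.3717
        System.out.println(String.format(Locale.ROOT, "Recall:    %.4f", macroAverage(r)));   // 0.2000
        System.out.println(String.format(Locale.ROOT, "F1 Score:  %.4f", macroAverage(f1)));  // 0.5419
    }
}
```

Every test image lands in column 4, so accuracy is simply the share of class 4 in the test set (42 of 113), and precision is only defined for that one class.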

treo · April 9, 2020, 6:29am

The main reason you would usually see this is an unbalanced data set: you have relatively many examples of class 4, and the model learns that always predicting class 4 is right more often than predicting any other class.
What does the distribution of the labels in your training data look like?
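
One way to answer that question is to count the files in each class folder. This is a minimal sketch, assuming the layout that `ParentPathLabelGenerator` in the posted code implies (one subdirectory per class under `D:\testdata`); the class `LabelDistribution` is hypothetical, and it counts every regular file rather than filtering by image extension.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts examples per class folder (folder name = label).
public class LabelDistribution {

    /** Returns the number of regular files in each immediate subdirectory of root. */
    public static Map<String, Integer> countPerLabel(File root) {
        Map<String, Integer> counts = new TreeMap<>();
        File[] classDirs = root.listFiles(File::isDirectory);
        if (classDirs == null) {
            return counts; // root does not exist or is not a directory
        }
        for (File dir : classDirs) {
            File[] files = dir.listFiles(File::isFile);
            counts.put(dir.getName(), files == null ? 0 : files.length);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Default path taken from the posted code; pass another path as the first argument.
        File root = new File(args.length > 0 ? args[0] : "D:\\testdata");
        Map<String, Integer> counts = countPerLabel(root);
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            double share = total == 0 ? 0.0 : 100.0 * e.getValue() / total;
            System.out.printf("%s: %d (%.1f%%)%n", e.getKey(), e.getValue(), share);
        }
    }
}
```

If one class has a much larger share than the others, the "always predict the biggest class" behaviour seen in the confusion matrix is the expected failure mode.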

Also, I see that you are using an evaluative listener. Is the output you are asking about from the end of the whole training run, or is it the first output you see?

You are also using quite elaborate learning rate and initialization settings. Why did you choose them?
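
To make the learning rate setting concrete, here is a rough sketch of what a step schedule like the one in the posted code works out to. It is plain Java with a hypothetical `StepDecay` class, and it assumes the common step-decay rule `initial * decayRate^floor(iteration / step)`; check the `StepSchedule` source for the exact behaviour in your DL4J version.

```java
// Hypothetical stand-in for a step-decay schedule; not the DL4J class itself.
public class StepDecay {

    /** Step decay: the value is multiplied by decayRate once every `step` iterations. */
    public static double valueAt(double initial, double decayRate, int step, int iteration) {
        return initial * Math.pow(decayRate, Math.floor((double) iteration / step));
    }

    public static void main(String[] args) {
        // Settings from the posted updater: initial 0.1, decay rate 0.1, step 10000.
        int[] iterations = {0, 4800, 9999, 10000, 20000};
        for (int it : iterations) {
            System.out.println("iteration " + it + " -> learning rate " + valueAt(0.1, 0.1, 10000, it));
        }
    }
}
```

Under that assumption the learning rate stays at 0.1 until iteration 10000. If the 113 test images in the report are the 10% split, the training set is roughly 1,000 images, which at batch size 64 is about 16 iterations per epoch, so 300 epochs would finish before the first decay step is ever reached.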

There is also another issue with your code:

Don’t ever fit your scaler on the test data. In this case it doesn’t do anything, because the scaler you are using doesn’t actually need any statistics from the data, but if you ever decide to use a different scaler, its statistics would be re-learned from the test set and this will give you problems.
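
The principle can be illustrated with a scaler that does learn statistics. The `MinMaxScaler` below is a hypothetical plain-Java class written for this example, not a DL4J API: it is fitted on the training values only, and the same fitted range is then applied to both splits.

```java
// Hypothetical min-max scaler used only to illustrate "fit on train, apply to both".
public class MinMaxScaler {
    private double min = Double.NaN;
    private double max = Double.NaN;

    /** Learns the minimum and maximum from the training values only. */
    public void fit(double[] train) {
        min = Double.POSITIVE_INFINITY;
        max = Double.NEGATIVE_INFINITY;
        for (double v : train) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
    }

    /** Scales values with the fitted range; test values outside it fall outside [0, 1]. */
    public double[] transform(double[] data) {
        double range = max - min;
        double[] out = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = range == 0 ? 0.0 : (data[i] - min) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] train = {0.0, 5.0, 10.0};
        double[] test = {5.0, 20.0};

        MinMaxScaler scaler = new MinMaxScaler();
        scaler.fit(train);                       // statistics come from the training set only
        double[] scaledTrain = scaler.transform(train);
        double[] scaledTest = scaler.transform(test); // no second fit on the test set

        System.out.println(java.util.Arrays.toString(scaledTrain)); // [0.0, 0.5, 1.0]
        System.out.println(java.util.Arrays.toString(scaledTest));  // [0.5, 2.0]
    }
}
```

In the posted code the equivalent change would be to drop the `scaler.fit(testIter)` call and keep `testIter.setPreProcessor(scaler)`, so the test iterator reuses whatever was fitted on `trainIter`.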

xianyu1120 · April 9, 2020, 12:25pm

As a beginner, I really appreciate your answer. The code was adapted from another multi-class neural network example, which is why it is set up this way. Thanks again.
