Cannot split DataSet with <= 1 rows

Hi,
I have been struggling with this for 2 days: when I try to read a JSON file (containing 1000 lines),
it returns only 1 line when I use the code below.

Am I missing something? Why am I getting this error: Cannot split DataSet with <= 1 rows (data set has 1 example)

Why doesn't INPUT have all the lines, even though my JSON file has 1000 lines?

This is what I see when I print the content of all the data:

===========INPUT===================
[[ 1062.3409, 1063.0620, 1066.3210, 1062.2310,    0.6198,    0.6198,    0.6198,         0,         0,         0,         0,         0,         0,    1.0000,    1.0000,    1.0000,    1.0000,   72.1000,   72.1000,    1.0000,    7.0000,    3.0000,   23.0000, 1063.0620, 1063.0620,         0, 1063.0620, 1063.0620,         0,   72.1000,         0,         0,  100.0000,         0,         0,         0,         0,         0,         0,         0,         0]]
=================OUTPUT==================
[[         0,    1.0000]


JacksonLineRecordReader jacksonLineRecordReader = new JacksonLineRecordReader(getFielSelection(), new ObjectMapper(new JsonFactory()));
Configuration configuration = new Configuration();
configuration.set(JacksonLineRecordReader.LABELS, "OPERATORPF");
jacksonLineRecordReader.initialize(new InputStreamInputSplit(new FileInputStream(file)));

final DataSetIterator dataSetIterator = new RecordReaderDataSetIterator.Builder(recordReader, batchSize)
        .classification(labelIndex, numClasses)
        .build();

//DataSetIterator dataSetIterator = new RecordReaderDataSetIterator(recordReader,batchSize,labelIndex,numClasses);

DataSet allData = dataSetIterator.next();
allData.shuffle();
SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.65);  //Use 65% of data for training

@azanux could you file an issue at Issues · eclipse/deeplearning4j · GitHub with a sample dataset I can use? Thanks!

Can you please elaborate on the values of your variables? It looks very much like you have set batchSize = 1.

If you want to load all of your data into memory and use splitTestAndTrain, you need to set the batchSize to be exactly equal to your total data size.
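
For a line-delimited JSON file like the one in the question, one way to get that number is to count the non-blank lines before building the iterator. Below is a minimal sketch using only the Java standard library. The class name BatchSizeFromFile and the helper countRecords are hypothetical, and it assumes exactly one JSON record per line:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class BatchSizeFromFile {

    /**
     * Counts the non-blank lines of a file.
     * Assumption: the file is line-delimited JSON, so one line is one record.
     */
    static int countRecords(Path file) throws IOException {
        // Files.lines reads lazily; try-with-resources closes the file handle.
        try (Stream<String> lines = Files.lines(file)) {
            return (int) lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BatchSizeFromFile <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        // Use the record count as the batch size so that a single call to
        // next() on the iterator could return the whole data set.
        int batchSize = countRecords(file);
        System.out.println("batchSize = " + batchSize);
    }
}
```

The resulting value could then be passed as batchSize to RecordReaderDataSetIterator.Builder in place of a fixed value, so that the DataSet returned by next() has more than one row to split.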

Hi @treo,
thanks for your message. The value of batchSize was 128.
In the end I converted the JSON to CSV and worked with the CSV file; the code is now working well.
