I’m trying to build a recurrent neural network for a regression problem. My dataset (a CSV file with a header) has the structure f1,f2,f3,f4,label,
and I’m using CSVNLinesSequenceRecordReader to group every 5 lines into one sequence. This is my code:
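// 5 lines per sequence, skip 1 line (the header), "," as delimiter
SequenceRecordReader trainRecordReader = new CSVNLinesSequenceRecordReader(5, 1, ",");
trainRecordReader.initialize(new FileSplit(new File("train.csv")));
DataSetIterator trainIterator = new SequenceRecordReaderDataSetIterator(
    trainRecordReader,
    64,   // mini-batch size
    -1,   // number of possible labels (not used for regression)
    4,    // index of the label column
    true  // regression
);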
I’m also using LabelLastTimeStepPreProcessor as a preprocessor so that only the label of the last time step is used as the label for the whole sequence. My problem is that some sequences come out with empty features. I checked the CSV multiple times and I don’t think there are issues like empty rows, invalid data, or missing columns. I also tried using the record reader iterator with two separate files, one for features and one for labels, but nothing changed. Here is an example of what one of these “erroneous” sequences looks like:
===========INPUT===================
[[[ 0.4220, 0.4214, 0.4179, 0.4174, 0.4193],
[],
[],
[]]]
=================OUTPUT==================
[[8.4328e4]]
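For comparison, a well-formed sequence here would presumably have 4 feature rows with 5 time steps each, and a single label taken from the last line of the group. The sketch below only illustrates that expected layout in plain Java, without DL4J. The class name SequenceLayoutSketch, its methods, and the numbers in main are made up for illustration, and the [features][timeSteps] ordering is an assumption based on how the input above is printed.

```java
import java.util.Arrays;

public class SequenceLayoutSketch {

    // rows: one CSV line per time step, each line = numFeatures values followed by the label.
    // Returns the features transposed to [feature][timeStep].
    public static double[][] toFeatureMajor(double[][] rows, int numFeatures) {
        int timeSteps = rows.length;
        double[][] features = new double[numFeatures][timeSteps];
        for (int t = 0; t < timeSteps; t++) {
            for (int f = 0; f < numFeatures; f++) {
                features[f][t] = rows[t][f];
            }
        }
        return features;
    }

    // Label of the last time step, i.e. what is kept as the label for the whole sequence.
    public static double lastStepLabel(double[][] rows, int labelIndex) {
        return rows[rows.length - 1][labelIndex];
    }

    public static void main(String[] args) {
        // Illustrative values only, not taken from the real train.csv.
        double[][] rows = {
            {0.1, 1.1, 2.1, 3.1, 10.0},
            {0.2, 1.2, 2.2, 3.2, 20.0},
            {0.3, 1.3, 2.3, 3.3, 30.0},
            {0.4, 1.4, 2.4, 3.4, 40.0},
            {0.5, 1.5, 2.5, 3.5, 50.0}
        };
        // Expected layout: 4 rows (features), each with 5 entries (time steps).
        System.out.println(Arrays.deepToString(toFeatureMajor(rows, 4)));
        System.out.println(lastStepLabel(rows, 4));
    }
}
```

In the erroneous sequence above, only the first of the four feature rows is filled, which is what makes it differ from this layout.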
I also made the total line count of the CSV a multiple of 5 so that the reader can read complete sequences with no missing lines, but nothing changed. What could be the problem here?
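One way to rule out the file itself is a check along the following lines. It is a minimal sketch using only the Java standard library, not part of DL4J. The class name CsvSanityCheck and its methods are hypothetical. It assumes the first line is the header, that fields are separated by plain commas with no quoting, and that the header is not counted toward any sequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvSanityCheck {

    // Returns the 1-based numbers of data rows (header excluded) that do not
    // consist of exactly expectedColumns finite numeric fields.
    public static List<Integer> findBadRows(List<String> lines, int expectedColumns) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {           // index 0 is the header
            String[] fields = lines.get(i).split(",", -1); // -1 keeps trailing empty fields
            boolean ok = fields.length == expectedColumns;
            for (int j = 0; ok && j < fields.length; j++) {
                try {
                    double v = Double.parseDouble(fields[j].trim());
                    ok = !Double.isNaN(v) && !Double.isInfinite(v);
                } catch (NumberFormatException e) {
                    ok = false;                            // empty or non-numeric field
                }
            }
            if (!ok) {
                bad.add(i);
            }
        }
        return bad;
    }

    // True if the data rows (all lines except the header) split evenly into sequences.
    public static boolean isWholeNumberOfSequences(List<String> lines, int linesPerSequence) {
        int dataRows = Math.max(0, lines.size() - 1);
        return dataRows % linesPerSequence == 0;
    }

    public static void main(String[] args) {
        // Illustrative in-memory sample; the second data row has an empty field.
        List<String> lines = Arrays.asList(
            "f1,f2,f3,f4,label",
            "0.1,0.2,0.3,0.4,1.0",
            "0.1,,0.3,0.4,1.0"
        );
        System.out.println("Bad rows: " + findBadRows(lines, 5));
        System.out.println("Whole sequences: " + isWholeNumberOfSequences(lines, 5));
    }
}
```

To run it against the real file, the sample list could be replaced with `java.nio.file.Files.readAllLines(java.nio.file.Paths.get("train.csv"))`. Note that this sketch counts data rows without the header, so a file whose total line count (header included) is a multiple of 5 would be reported as not splitting evenly.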