Hello,
the neural network I’m implementing takes a square 2D integer array as input (to which convolution is applied) and produces a double as output.
My dataset is currently a CSV file in which each line consists of n integers and 1 double (i.e., each line contains the flattened version of the input array followed by the double of the desired output). The dataset is expected to grow quite large.
What would be a good way to feed the data into the network to train it?
As far as I’ve understood, I should try to use a RecordReader, but I’m not sure how to combine convolution with the CSVRecordReader while doing the intermediate flat-to-2D array transformation. Moreover, should I treat those integers as doubles so that the ND4J types are uniform?
Thank you kindly in advance
@EquanimeAugello you would use the ImageRecordReader plus RecordReaderDataSetIterator. Here are some tests covering different cases: deeplearning4j/TestImageRecordReader.java at 1b595d363685d979096d0ffff42f5b36d3ef6c24 · deeplearning4j/deeplearning4j · GitHub
Thank you very much.
However, it’s still not clear to me how I should feed my data to ImageRecordReader, since my data isn’t images but a single CSV file in which each line constitutes one input-output example. The constructors don’t seem to offer this possibility, unless I’m mistaken.
Thank you
@EquanimeAugello sorry, I’m not clear on what exactly you’re trying to do. Rather than me piecing everything together, could you describe your problem and input data?
What do you have: CSV? Images?
Thank you. Yes, I’ll try to be clearer: my input is a CSV like this:
int1,int2,int3,…,intN,double
int1,int2,int3,…,intN,double
…
Each line of the CSV is a complete example. The first N integers are the flattened version of a square 2D matrix, which represents a single input to the first convolutional layer of the network. The final double on each line is the desired output for that matrix. What I actually want to do, as usual, is to give batches of (lines of) data to the network during training, i.e. an array of matrices as input and an array of doubles as output, constructed from the CSV file. I had actually implemented a custom DataSetIterator that managed to do this, but I’ve seen that this is not advised as good practice(?).
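To make the format concrete, here is a minimal plain-Java sketch of how one such line could be turned back into a square matrix plus its target. It does not use any DL4J or ND4J classes; the class and method names (CsvLineToMatrix, Example, parseLine) are made up for illustration, and it assumes the matrix was flattened in row-major order and that the integers can simply be widened to doubles.

```java
import java.util.Arrays;

public class CsvLineToMatrix {

    /** One parsed CSV line: a square input matrix and its target value. */
    public static final class Example {
        public final double[][] matrix;
        public final double label;

        Example(double[][] matrix, double label) {
            this.matrix = matrix;
            this.label = label;
        }
    }

    /**
     * Parses "int1,int2,...,intN,double" into a side x side matrix plus label.
     * Assumption: row-major flattening, and N is a perfect square.
     */
    public static Example parseLine(String line) {
        String[] parts = line.trim().split(",");
        int n = parts.length - 1;                  // the last field is the label
        int side = (int) Math.round(Math.sqrt(n));
        if (n < 1 || side * side != n) {
            throw new IllegalArgumentException(
                    "Expected a square number of inputs, got " + n);
        }
        double[][] matrix = new double[side][side];
        for (int i = 0; i < n; i++) {
            // Integers are widened to double so inputs and labels share one type.
            matrix[i / side][i % side] = Integer.parseInt(parts[i].trim());
        }
        double label = Double.parseDouble(parts[n].trim());
        return new Example(matrix, label);
    }

    public static void main(String[] args) {
        // A 2x2 matrix [[1, 2], [3, 4]] with target 0.5.
        Example ex = parseLine("1,2,3,4,0.5");
        System.out.println(Arrays.deepToString(ex.matrix) + " -> " + ex.label);
    }
}
```

A batch is then just a list of these examples, with the matrices stacked along a leading batch dimension.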
@EquanimeAugello not normally, but your case is fairly unusual, so it’s advisable. The only thing I’d say is that you could also implement a custom record reader instead. That would allow you to leverage most of the extra infrastructure we wrote in that iterator.
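For reference, the core logic that either a custom iterator or a custom record reader would need to wrap can be sketched in plain Java as below. This is only an illustration under assumptions: CsvBatchIterator and Batch are hypothetical names, the class implements java.util.Iterator rather than any DL4J interface, the matrix side length is passed in by the caller, and the features are laid out as [batch][channel = 1][height][width] on the assumption that this is the shape the convolution layer expects. It streams the file line by line, so the whole CSV never has to fit in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class CsvBatchIterator implements Iterator<CsvBatchIterator.Batch> {

    /** Features in [batch][channel=1][height][width] layout, labels in [batch]. */
    public static final class Batch {
        public final double[][][][] features;
        public final double[] labels;

        Batch(double[][][][] features, double[] labels) {
            this.features = features;
            this.labels = labels;
        }
    }

    private final BufferedReader reader;
    private final int side;
    private final int batchSize;
    private String nextLine;   // look-ahead line; null once the input is exhausted

    public CsvBatchIterator(BufferedReader reader, int side, int batchSize) {
        this.reader = reader;
        this.side = side;
        this.batchSize = batchSize;
        this.nextLine = readNonBlankLine();
    }

    private String readNonBlankLine() {
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    return line;
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public Batch next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        int n = side * side;
        List<double[][]> matrices = new ArrayList<>();
        List<Double> labels = new ArrayList<>();
        while (nextLine != null && matrices.size() < batchSize) {
            String[] parts = nextLine.trim().split(",");
            if (parts.length != n + 1) {
                throw new IllegalArgumentException(
                        "Expected " + (n + 1) + " fields, got " + parts.length);
            }
            double[][] m = new double[side][side];
            for (int i = 0; i < n; i++) {
                // Row-major un-flattening; integers are read as doubles.
                m[i / side][i % side] = Double.parseDouble(parts[i].trim());
            }
            matrices.add(m);
            labels.add(Double.parseDouble(parts[n].trim()));
            nextLine = readNonBlankLine();
        }
        // The last batch may be smaller than batchSize.
        double[][][][] features = new double[matrices.size()][1][][];
        double[] y = new double[labels.size()];
        for (int b = 0; b < y.length; b++) {
            features[b][0] = matrices.get(b);
            y[b] = labels.get(b);
        }
        return new Batch(features, y);
    }
}
```

Each Batch would still need to be converted to the framework's own array and dataset types before being passed to the network; that step is left out here because it depends on the library version in use.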
Sorry for replying so late: thank you very much.