Hi there, I’m just getting started with dl4j. After going through some of the examples I thought I would have a go at implementing a simple convolutional network to classify a subset of images from the Google Quick, Draw! Dataset. The DataIterators I have been using from the examples are customised for specific datasets (MNIST, CIFAR etc.).
How would I create a custom DataIterator that converts a bunch of .npy files into a format that I can load into my network? More importantly, is this the correct approach?
I don't know of a .npy or JSON iterator that comes with the library. A quick internet search shows that there may be some options, but they are slim. If you do end up making one, please post it here; I didn't know about the dataset you mentioned and may play with it myself. Yes, you're on the right path, IMO.
If you do try to build it, I have two thoughts for you.
First, here is a rather simple DataSetIterator for images. I use it for all kinds of image datasets instead of the ones you mentioned above.
public DataSetIterator getIter()
{
    try {
        ImageRecordReader recordReader = new ImageRecordReader(height, width, 3);
        recordReader.initialize(new FileSplit(new File(source)));
Second, here is a really simple class that acts as an iterator. I use it to provide random arrays of a certain shape, but you could write a similar class that converts the .npy or JSON file and returns the data from its next() method. If the classes are labeled, then you could return a DataSet object instead of an array and fill it with DataSet.setFeatures and DataSet.setLabels.
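private class GaussianIterator implements Iterator<INDArray[]> {
    int width;
    int height;

    public GaussianIterator(int w, int h){
        width = w;
        height = h;
    }

    public void setWidth(int w){
        width = w;
    }

    public void setHeight(int h)
    {
        height = h;
    }

    // Never runs out: a fresh random array is generated on every call.
    @Override
    public boolean hasNext() {
        return true;
    }

    // Returns a one-element array, so the iterator's element type is INDArray[].
    @Override
    public INDArray[] next() {
        return new INDArray[] {getArr()};
    }

    // minibatch is assumed to be a field of the enclosing class.
    // Note that Nd4j.rand draws uniform values in [0, 1).
    private INDArray getArr(){
        INDArray ind = Nd4j.rand(minibatch,512,1,1);
        //ind = ind.mul(255);
        return ind;
    }
}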
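As a rough sketch of the "convert the .npy file inside next()" idea, here is a plain-Java version that runs without DL4J on the classpath. It assumes each Quick, Draw! .npy file holds a C-ordered 2-D uint8 array (one flattened bitmap per row) and that one file contains a single class. NpyBatchIterator, Batch, labelIndex and numClasses are made-up names for illustration, not part of any library.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: serves minibatches from one single-class uint8 .npy file.
class NpyBatchIterator implements Iterator<NpyBatchIterator.Batch> {

    // Features and one-hot labels for one minibatch.
    static final class Batch {
        final float[][] features;
        final float[][] labels;
        Batch(float[][] features, float[][] labels) {
            this.features = features;
            this.labels = labels;
        }
    }

    private final float[][] rows;   // every example, pixel values scaled to [0, 1]
    private final int batchSize;
    private final int labelIndex;   // class index shared by all rows of this file
    private final int numClasses;
    private int cursor = 0;

    NpyBatchIterator(InputStream in, int batchSize, int labelIndex, int numClasses)
            throws IOException {
        if (batchSize < 1 || labelIndex < 0 || labelIndex >= numClasses) {
            throw new IllegalArgumentException("bad batchSize or labelIndex");
        }
        this.rows = readUint8Matrix(in);
        this.batchSize = batchSize;
        this.labelIndex = labelIndex;
        this.numClasses = numClasses;
    }

    // Parses the .npy header, then reads the whole array into memory.
    static float[][] readUint8Matrix(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[6];
        in.readFully(magic);
        String tag = new String(magic, 1, 5, StandardCharsets.US_ASCII);
        if ((magic[0] & 0xFF) != 0x93 || !tag.equals("NUMPY")) {
            throw new IOException("Not a .npy file");
        }
        int major = in.readUnsignedByte();
        in.readUnsignedByte(); // minor version, not needed here
        // Header length is little-endian: 2 bytes in format 1.x, 4 bytes in later formats.
        int headerLen = in.readUnsignedByte() | (in.readUnsignedByte() << 8);
        if (major >= 2) {
            headerLen |= (in.readUnsignedByte() << 16) | (in.readUnsignedByte() << 24);
        }
        byte[] headerBytes = new byte[headerLen];
        in.readFully(headerBytes);
        String header = new String(headerBytes, StandardCharsets.US_ASCII);
        if (!header.contains("u1") || header.contains("'fortran_order': True")) {
            throw new IOException("Only C-ordered uint8 arrays are handled: " + header.trim());
        }
        Matcher m = Pattern.compile("'shape':\\s*\\((\\d+),\\s*(\\d+)\\)").matcher(header);
        if (!m.find()) {
            throw new IOException("Expected a 2-D shape: " + header.trim());
        }
        int n = Integer.parseInt(m.group(1));
        int d = Integer.parseInt(m.group(2));
        float[][] out = new float[n][d];
        byte[] row = new byte[d];
        for (int i = 0; i < n; i++) {
            in.readFully(row);
            for (int j = 0; j < d; j++) {
                out[i][j] = (row[j] & 0xFF) / 255f; // unsigned byte -> [0, 1]
            }
        }
        return out;
    }

    @Override
    public boolean hasNext() {
        return cursor < rows.length;
    }

    @Override
    public Batch next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        int end = Math.min(cursor + batchSize, rows.length);
        float[][] features = Arrays.copyOfRange(rows, cursor, end);
        float[][] labels = new float[end - cursor][numClasses];
        for (float[] label : labels) {
            label[labelIndex] = 1f; // one-hot encoding
        }
        cursor = end;
        return new Batch(features, labels);
    }
}
```

Each batch could then be wrapped for DL4J with something like new DataSet(Nd4j.create(batch.features), Nd4j.create(batch.labels)). To train on several classes you would still need to merge and shuffle the rows from several files, which this sketch leaves out. Recent ND4J versions also seem to ship Nd4j.createFromNpyFile(File), which may make the hand-written parser unnecessary, so it is worth checking the version you are on.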