Correct resettable DataSetIterator for a DataSet?

I've been using new IteratorDataSetIterator(trainingData.iterator(), batchSize) with my models up till now, which has been fine, but now I'm using a much larger training set of 1 million examples. That is too big to normalize in one go by running normalizer.fit(trainingData), as I run out of GPU memory:

Exception in thread "main" java.lang.RuntimeException: Memory allocation for tadOffsets failed; Error code: [2]
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.tadShapeInfoAndOffsets(CudaExecutioner.java:2179)
at org.nd4j.jita.allocator.tad.BasicTADManager.getTADOnlyShapeInfo(BasicTADManager.java:49)
at org.nd4j.jita.allocator.tad.DeviceTADManager.getTADOnlyShapeInfo(DeviceTADManager.java:89)
at org.nd4j.linalg.jcublas.ops.executioner.CudaExecutioner.exec(CudaExecutioner.java:149)
at org.nd4j.linalg.api.ops.executioner.DefaultOpExecutioner.execAndReturn(DefaultOpExecutioner.java:173)
at org.nd4j.linalg.dataset.api.preprocessor.StandardizeStrategy.preProcess(StandardizeStrategy.java:56)
at org.nd4j.linalg.dataset.api.preprocessor.StandardizeStrategy.preProcess(StandardizeStrategy.java:38)
at org.nd4j.linalg.dataset.api.preprocessor.AbstractDataSetNormalizer.transform(AbstractDataSetNormalizer.java:163)
at org.nd4j.linalg.dataset.api.preprocessor.AbstractDataSetNormalizer.preProcess(AbstractDataSetNormalizer.java:131)
at org.nd4j.linalg.dataset.api.preprocessor.AbstractDataSetNormalizer.transform(AbstractDataSetNormalizer.java:142)
at org.nd4j.linalg.dataset.api.preprocessor.AbstractDataSetNormalizer.transform(AbstractDataSetNormalizer.java:36)
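For reference, this is roughly what the failing setup looks like. The batch size and the random stand-in data here are just illustrative; my real DataSet is loaded elsewhere:

```java
import org.deeplearning4j.datasets.iterator.IteratorDataSetIterator;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize;
import org.nd4j.linalg.factory.Nd4j;

// Stand-in for my real 1M-example DataSet
DataSet trainingData = new DataSet(Nd4j.rand(1000, 10), Nd4j.rand(1000, 3));

int batchSize = 128; // illustrative value
DataSetIterator iter = new IteratorDataSetIterator(trainingData.iterator(), batchSize);

NormalizerStandardize normalizer = new NormalizerStandardize();
normalizer.fit(trainingData); // this is the call that runs out of GPU memory at full size
```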

Unfortunately, neither IteratorDataSetIterator nor trainingData.iterateWithMiniBatches() works with normalizer.fit(), as they are not resettable:

Exception in thread "main" java.lang.NullPointerException: Cannot invoke "org.nd4j.linalg.dataset.api.iterator.DataSetIterator.reset()" because "iterator" is null
at org.nd4j.linalg.dataset.api.preprocessor.AbstractDataSetNormalizer.fit(AbstractDataSetNormalizer.java:107)
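In case it helps, checking resetSupported() on the wrapper seems to confirm this (sketch, same illustrative batch size as above):

```java
import org.deeplearning4j.datasets.iterator.IteratorDataSetIterator;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.factory.Nd4j;

DataSet trainingData = new DataSet(Nd4j.rand(1000, 10), Nd4j.rand(1000, 3)); // stand-in

// IteratorDataSetIterator wraps a plain Iterator<DataSet>, which cannot be rewound
boolean canReset = new IteratorDataSetIterator(trainingData.iterator(), 128).resetSupported();
System.out.println(canReset); // I get false here
```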

I've been looking at all the DataSetIterators available in the DataSetIterator (deeplearning4j 1.0.0-beta7 API) docs,

but can't seem to find one that is applicable to just a vanilla DataSet.

Any recommendations?
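For what it's worth, the closest thing I've found so far (not sure if it's the intended approach) is to split the DataSet into mini-batches with batchBy() and wrap them in a ListDataSetIterator, which I believe does support reset():

```java
import java.util.List;

import org.deeplearning4j.datasets.iterator.impl.ListDataSetIterator;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize;
import org.nd4j.linalg.factory.Nd4j;

DataSet trainingData = new DataSet(Nd4j.rand(1000, 10), Nd4j.rand(1000, 3)); // stand-in

int batchSize = 128; // illustrative value
List<DataSet> batches = trainingData.batchBy(batchSize);      // split into mini-batches
ListDataSetIterator<DataSet> it = new ListDataSetIterator<>(batches, batchSize);

NormalizerStandardize normalizer = new NormalizerStandardize();
normalizer.fit(it); // fit() sees one mini-batch at a time and can reset the iterator
```

Though batchBy() still holds the whole DataSet in memory at once, so I'm not sure it actually solves the GPU memory problem.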

On a side note, I've noticed that when I save my normalizer settings after calling fit(), using:
StandardizeSerializerStrategy ns = new StandardizeSerializerStrategy();
try (FileOutputStream out = new FileOutputStream(new File("/normalizeData.data"))) {
    ns.write(normalizer, out); // stream is closed automatically by try-with-resources
} catch (IOException e) {
    e.printStackTrace();
}

the resulting file is only 267 bytes! Is this how large it's meant to be?

Thanks in advance