Reading CSV file with Strings


I am very new to deeplearning4j and wasn’t able to find anything about this on Google.
I am reading a CSV file, wrapping its data in a RecordReaderDataSetIterator, and iterating over it. However, one column of the CSV contains strings such as “Male” and “Female”.
While iterating, it throws an exception: “value is non-numeric”.

RecordReader recordReader = new CSVRecordReader(0, ',');
recordReader.initialize(new FileSplit(new File("C://Users//Username//Desktop//Vocab Train.csv")));
int labelIndex = 0;
int numClasses = 10;
int batchSize = 24280;

RecordReaderDataSetIterator iterator = new RecordReaderDataSetIterator(recordReader, batchSize, labelIndex, numClasses);

// Fetch the next batch from the iterator as a DataSet
DataSet allData = iterator.next();

How do I convert the string in this file to a numeric value and get a dataset as output?

Thank you very much in advance.

Download the deeplearning4j examples from their Git repository. They include a simple classifier example that covers this exact problem:

package org.deeplearning4j.examples.dataexample.BasicCSVClassifier
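
If you just want to see the general idea before digging into that example: the reader needs every column to be numeric, so one option is to replace each distinct string with an integer code before the file reaches CSVRecordReader. Below is a minimal sketch of that preprocessing step in plain Java. It is not taken from BasicCSVClassifier, and the class name CategoricalEncoder, the method names, and the sample rows are all made up for illustration. It also assumes a simple CSV with no quoted fields containing commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of deeplearning4j): assigns integer codes
// to the distinct strings found in one CSV column.
public class CategoricalEncoder {

    // Each distinct string gets the next free index, in order of first appearance.
    private final Map<String, Integer> codes = new LinkedHashMap<>();

    // Returns the code for a value, creating a new code the first time it is seen.
    public int encode(String value) {
        String key = value.trim();
        Integer code = codes.get(key);
        if (code == null) {
            code = codes.size();
            codes.put(key, code);
        }
        return code;
    }

    // Rewrites one column of every line, replacing the string with its code.
    // Assumes plain comma-separated fields without quoting.
    public static List<String> encodeColumn(List<String> lines, int column,
                                            CategoricalEncoder encoder) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
            fields[column] = String.valueOf(encoder.encode(fields[column]));
            out.add(String.join(",", fields));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows: label, gender, some numeric feature
        List<String> rows = Arrays.asList("3,Male,0.5", "7,Female,0.1", "3,Male,0.9");
        for (String row : encodeColumn(rows, 1, new CategoricalEncoder())) {
            System.out.println(row);
        }
    }
}
```

You would write the rewritten lines to a new file and point the FileSplit at that file instead. Keep the same encoder (or the same mapping) for training and test data so that “Male” and “Female” get the same codes in both.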