Modifying UCI Example

treo · May 20, 2020, 4:28pm

I think the big misunderstanding comes from the way the original data looks: the original file is formatted as 600 rows, and each row holds one complete sequence.

In UCISequenceClassificationExample, line 180 splits the raw text into those 600 separate rows.

The for loop on lines 193 to 198 then does two things:

It transposes each row into a column by replacing the whitespace between the numbers with newlines.
It relies on integer division discarding the remainder: dividing the running row index this way yields the class label, and each column is stored together with its label as a pair.
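
To make those two steps concrete, here is a minimal sketch in plain Java. It is not the code from the example itself: the class name, the method names, and the sample numbers are made up, and the block size of 100 rows per class is my assumption about how the rows are grouped.

```java
import java.util.ArrayList;
import java.util.List;

class TransposeAndLabel {
    // Assumption: rows are grouped by class in blocks of 100.
    static final int ROWS_PER_CLASS = 100;

    // Puts every value of a whitespace-separated row on its own line.
    static String transpose(String row) {
        return row.trim().replaceAll("\\s+", "\n");
    }

    // Integer division drops the remainder: 0..99 -> 0, 100..199 -> 1, ...
    static int labelFor(int rowIndex) {
        return rowIndex / ROWS_PER_CLASS;
    }

    public static void main(String[] args) {
        // Made-up sample rows, only to show the shape of the data.
        String[] rows = {"1.5 2.5  3.5", "4.5 5.5 6.5"};
        List<String> columns = new ArrayList<>();
        List<Integer> labels = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            columns.add(transpose(rows[i]));
            labels.add(labelFor(i));
        }
        System.out.println(columns.get(0));  // three lines: 1.5, 2.5, 3.5
        System.out.println(labels);          // prints [0, 0]
        System.out.println(labelFor(99));    // prints 0
        System.out.println(labelFor(100));   // prints 1
    }
}
```

The point is that no lookup table is needed for the labels: as long as the rows arrive grouped by class, the row index alone determines the label.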

Then the shuffle on line 201 randomizes the order of those pairs, because the data is going to be read in linear order later on, and we want each batch to contain a mix of labels.
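
A small sketch of why the shuffle matters, again in plain Java with made-up names. The six classes of 100 rows each and the seed value are assumptions for illustration; the example may use different values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;

class ShuffleBeforeWriting {
    // Labels in file order: perClass times 0, then perClass times 1, ...
    static List<Integer> orderedLabels(int classes, int perClass) {
        List<Integer> labels = new ArrayList<>();
        for (int i = 0; i < classes * perClass; i++) {
            labels.add(i / perClass);
        }
        return labels;
    }

    // Returns a shuffled copy; a fixed seed makes the order reproducible.
    static <T> List<T> shuffled(List<T> items, long seed) {
        List<T> copy = new ArrayList<>(items);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    // Number of different labels among the first batchSize entries.
    static int distinctInFirstBatch(List<Integer> labels, int batchSize) {
        return new HashSet<>(labels.subList(0, batchSize)).size();
    }

    public static void main(String[] args) {
        List<Integer> ordered = orderedLabels(6, 100);
        // Without shuffling, the first batch sees a single class only.
        System.out.println(distinctInFirstBatch(ordered, 10)); // prints 1
        // After shuffling, the first batch will usually mix several classes.
        System.out.println(distinctInFirstBatch(shuffled(ordered, 42L), 10));
    }
}
```

Since the files are numbered in the order they are written and read back in that same order, shuffling once before writing is enough to get mixed batches later.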

Finally, the for loop on lines 206 to 222 writes the data to output files. Once enough training data has been written, it writes the remaining pairs as test data. For each pair it writes two files:

train/features/#.csv contains several lines of numbers, with each line being a single timestep in the sequence with just a single feature
train/labels/#.csv contains just a single number on a single line, the label for the whole sequence
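
The writing step could look roughly like the sketch below. It is not the example's code: the class and method names and the nTrain parameter are made up, and I am assuming the test files go into test/features and test/labels, mirroring the train directories.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class WriteSequenceFiles {
    // Writes the first nTrain pairs under train/, the rest under test/.
    // features.get(i) is one transposed column, labels.get(i) its label.
    static void write(Path baseDir, List<String> features,
                      List<Integer> labels, int nTrain) throws IOException {
        Path trainF = Files.createDirectories(baseDir.resolve("train/features"));
        Path trainL = Files.createDirectories(baseDir.resolve("train/labels"));
        Path testF = Files.createDirectories(baseDir.resolve("test/features"));
        Path testL = Files.createDirectories(baseDir.resolve("test/labels"));

        int trainCount = 0;
        int testCount = 0;
        for (int i = 0; i < features.size(); i++) {
            Path featureFile;
            Path labelFile;
            if (trainCount < nTrain) {
                featureFile = trainF.resolve(trainCount + ".csv");
                labelFile = trainL.resolve(trainCount + ".csv");
                trainCount++;
            } else {
                featureFile = testF.resolve(testCount + ".csv");
                labelFile = testL.resolve(testCount + ".csv");
                testCount++;
            }
            // One value per line in the feature file, a single number in the label file.
            Files.write(featureFile, features.get(i).getBytes(StandardCharsets.UTF_8));
            Files.write(labelFile, String.valueOf(labels.get(i)).getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("uci-sketch");
        write(dir, Arrays.asList("1.5\n2.5", "3.5\n4.5", "5.5\n6.5"),
              Arrays.asList(0, 1, 2), 2);
        System.out.println("Wrote files under " + dir);
    }
}
```

Both directories number their files from 0, so feature file N and label file N always belong to the same sequence.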

This is then later on read by CSVSequenceRecordReader and joined in such a way that the label aligns with the end of the sequence.
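
If "aligns with the end of the sequence" sounds abstract, the following plain-Java sketch shows the idea for a single sequence. It deliberately does not use any DL4J classes; the names are made up, and the -1 placeholder for "no label at this step" is purely my own convention for the illustration.

```java
import java.util.Arrays;

class AlignEndSketch {
    // The single label sits at the last time step; every earlier
    // step gets the placeholder -1 (hypothetical "no label" marker).
    static int[] alignedLabels(int sequenceLength, int label) {
        if (sequenceLength < 1) {
            throw new IllegalArgumentException("sequence must have at least one step");
        }
        int[] out = new int[sequenceLength];
        Arrays.fill(out, -1);
        out[sequenceLength - 1] = label;
        return out;
    }

    // Mask with 1 where a label is present, 0 elsewhere.
    static int[] labelMask(int sequenceLength) {
        if (sequenceLength < 1) {
            throw new IllegalArgumentException("sequence must have at least one step");
        }
        int[] mask = new int[sequenceLength];
        mask[sequenceLength - 1] = 1;
        return mask;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(alignedLabels(5, 3))); // prints [-1, -1, -1, -1, 3]
        System.out.println(Arrays.toString(labelMask(5)));        // prints [0, 0, 0, 0, 1]
    }
}
```

In other words, the network reads the whole sequence first, and only its output at the final time step is compared against the label.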

As far as I can tell, it works exactly as it should, and you have already achieved that goal.
