Oh ok, so the “test set” would in my case be real time data, right?
The data I would read from a CSV would be data from our sensor that I will log this weekend. The data used at run time will be received via said sensor. Sadly, I can’t go into much detail about exactly what this data looks like, but I can try an abstract version.
We receive messages from different devices, multiple per second. These messages can be of different types; let’s make up some that do not represent the data we really receive: X, Y, and Z.
We start at index i = 0; the sliding window has size 10.
The array I am filling is created like this:
INDArray array = Nd4j.create(10, 4); // 10 samples, 4 fields per sample
When I receive X or Y I set
temp_x_value = message.getData();
or
temp_y_value = message.getData();
When I receive Z messages I set
array.put(i, 0, temp_x_value);
array.put(i, 1, temp_y_value);
array.put(i, 2, message.getData());
i++; // Increase counter when receiving Z messages
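To make the buffer-filling part concrete, here is roughly the same logic as a minimal sketch in plain Java. A double[][] stands in for the INDArray so it runs without ND4J, and WindowBuffer, onX, onY and onZ are names I made up for illustration, just like the message types. It also includes the reset to a fresh buffer once the window is full.

```java
// Plain-Java stand-in for the INDArray window (assumption: no ND4J here).
// WindowBuffer, onX, onY and onZ are made-up names for illustration only.
final class WindowBuffer {
    private static final int FIELDS = 4; // 4 fields per sample; only 0..2 are filled here
    private final int windowSize;
    private double[][] rows;
    private double tempX;
    private double tempY;
    private int i = 0;

    WindowBuffer(int windowSize) {
        this.windowSize = windowSize;
        this.rows = new double[windowSize][FIELDS];
    }

    void onX(double value) { tempX = value; } // remember the latest X value
    void onY(double value) { tempY = value; } // remember the latest Y value

    // A Z message completes one sample. Returns the full window once
    // windowSize samples have been collected, otherwise null.
    double[][] onZ(double value) {
        rows[i][0] = tempX;
        rows[i][1] = tempY;
        rows[i][2] = value;
        i++; // the counter only advances on Z messages
        if (i < windowSize) {
            return null;
        }
        double[][] full = rows;
        rows = new double[windowSize][FIELDS]; // fresh buffer for the next window
        i = 0;
        return full;
    }
}
```

Note that the window starts over from empty each time it fills up, so consecutive windows do not overlap.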
Now, when the sliding window is full, I’d like to pass the data into the network and afterwards compare the output with the thresholds:
normalizer.transform(array); // Normalizer is at this stage loaded from disk
INDArray output = model.outputSingle(array); // Do I want outputSingle() ? I need 1x4 array output
boolean valuesAreValid = true;
for (int n = 0; n < FEATURE_COUNT; n++) {
// I bet there is a better way?
valuesAreValid &= output.getDouble(n) <= thresholds.getDouble(n);
}
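Stripped of the ND4J types, the check I am after looks like the sketch below. It assumes the network output and the thresholds have already been copied into plain double[] arrays of equal length; ThresholdCheck and allWithinThresholds are hypothetical names, not part of any library.

```java
// Hypothetical helper: element-wise "output <= threshold" over plain arrays.
final class ThresholdCheck {
    private ThresholdCheck() {}

    // True only if every output value is at or below its threshold.
    static boolean allWithinThresholds(double[] output, double[] thresholds) {
        if (output.length != thresholds.length) {
            throw new IllegalArgumentException("output and thresholds must have the same length");
        }
        for (int n = 0; n < output.length; n++) {
            // Written as !(a <= b) so that NaN counts as invalid, same as the loop above.
            if (!(output[n] <= thresholds[n])) {
                return false; // stop at the first violation
            }
        }
        return true;
    }
}
```

Compared to the &= loop, this returns as soon as one value exceeds its threshold.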
After all this I set the index i back to 0 and put a new INDArray in the array’s place. In reality we have multiple devices, which can dynamically register and unregister, so the INDArrays are attached to the “instances” of these connections. Something along the lines of:
Message message = receiveMessage();
int id = message.getDeviceId();
INDArray slidingWindow = devices[id].pushMessage(message); // The counting and buffer filling part
if (null != slidingWindow) {
boolean deviceValid = detector.detect(slidingWindow); // The preprocessing and evaluation part
if(!deviceValid) {
deviceManager.signalInvalidDevice(id);
}
}
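Since devices come and go at run time, a map keyed by device id might fit better than a fixed array. Below is a minimal single-threaded sketch of that idea; DeviceRegistry, DeviceState, register, unregister and lookup are all hypothetical names, and DeviceState is only a placeholder for whatever holds a connection's buffer and counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry: one state object per connected device.
// Assumption: accessed from a single thread only.
final class DeviceRegistry {
    // Placeholder for the per-device sliding-window buffer and counter.
    static final class DeviceState {
        int samplesSeen = 0;
    }

    private final Map<Integer, DeviceState> devices = new HashMap<>();

    // Registering an already known id keeps its existing state.
    void register(int id) { devices.putIfAbsent(id, new DeviceState()); }

    // Dropping the entry also drops the device's partially filled window.
    void unregister(int id) { devices.remove(id); }

    // Returns the state for a registered device, or null for an unknown id,
    // so messages from unregistered devices can be ignored by the caller.
    DeviceState lookup(int id) { return devices.get(id); }
}
```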
Now I have to admit, I’ve never worked with pandas before and, as far as I understand, ND4J is designed to be a Java version of said library. I am not sure if what I am doing with the INDArrays is correct.