Thanks. The clarification regarding time steps is very helpful.
I attempted to change the number of time steps in the definition of the
input INDArray from 1 to 3. Execution failed with the following error:
“org.nd4j.linalg.exception.ND4JIllegalStateException: Shape of the new
array [35112, 5, 3] doesn’t match data length: 175560 - prod(shape) must
equal the number of values provided”
I am not sure I understand. Could you help me understand?
Thanks for the example of a sequence iterator.
With regards to the results I am getting from running the neural network
in my app, any suggestions as to how I can interpret it?
The documentation for output(INDArray) states:
/**
 * Perform inference on the provided input/features - i.e., perform a forward
 * pass using the provided input/features and return the output of the final
 * layer. Equivalent to {@link #output(INDArray, boolean)} with train=false -
 * i.e., this method is used for inference.
 */
@adonnini ndarrays have a fixed length. Reshape is used in certain parts of the network to ensure the inputs have the correct shape for the computation, the shape being something like rows x columns.
Your data probably doesn’t have 3 time steps in it. The error means prod(shape) must equal the number of values provided: [35112, 5, 3] implies 35112 × 5 × 3 = 526,680 values, but your data only contains 175,560 = 35112 × 5 × 1, i.e. a single time step. You have to make sure the input shape is consistent with the amount of data you’re giving it. The created array probably doesn’t have that shape. You should double-check the input shapes that are created from the iterator.
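A quick way to see the mismatch in plain Java, using the numbers from the error above:

```java
public class ShapeCheck {
    // prod(shape) must equal the number of values in the flat data buffer
    public static long prod(long[] shape) {
        long p = 1;
        for (long d : shape) p *= d;
        return p;
    }

    public static void main(String[] args) {
        long dataLength = 175560L;                         // values actually provided
        System.out.println(prod(new long[]{35112, 5, 3})); // 526680: does not match
        System.out.println(prod(new long[]{35112, 5, 1})); // 175560: matches, so only 1 time step
        System.out.println(dataLength / (35112L * 5));     // 1: one time step's worth of data
    }
}
```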
@adonnini you’ll want to normalize your labels. Usually data from 0 to 1 is the best kind of output for a neural network. You can train a normalizer on your data.
MultiNormalizerMinMaxScaler SUT = new MultiNormalizerMinMaxScaler();
SUT.fitLabel(true);                          // normalize labels as well as features
MultiDataSet transformed = ...;              // your data set
SUT.fit(transformed);                        // collect min/max statistics
//transform data
SUT.preProcess(transformed);                 // transform features and labels in place
INDArray reverted = transformed.getLabels(0).dup();
SUT.revertLabels(reverted, null, 0);         // map normalized labels back to the original scale
This shows how to train a normalizer on your data as well as your labels.
You’ll want to use this to preprocess your data before feeding it into a neural network.
I just set up a repository and invited you to it. It contains both the neural
network creation/training/testing code I run on my desktop, and the code
I added to my application to run the network.
I thought it was time given what you wrote below. That’s code I already
implemented in a slightly different form:
NormalizerMinMaxScaler normalizer = new NormalizerMinMaxScaler(0, 1);
normalizer.fitLabel(true);
normalizer.fit(trainData); //Collect training data statistics
trainData.reset();
I think the main difference is that I did not use the Multi… version
of the functions. I am not sure it makes a difference. Please let me know.
Let me know if you did not receive the invitation to the repository.
At a high level, what is the difference between the “Multi” and
non-“Multi” versions of interfaces and classes? For example, what is the
difference between MultiDataSetIterator and DataSetIterator? I looked at
the code for both. There are differences, but I am not sure what they
mean when the two are used in an application.
@adonnini you only need the singular one most of the time. “Multi” is for multi-input neural networks: more complex ones that take in, say, images and text all at once. Only harder problems really need that.
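A minimal sketch of the difference, assuming ND4J is on the classpath (the array sizes are made up for illustration):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.MultiDataSet;
import org.nd4j.linalg.factory.Nd4j;

public class MultiInputSketch {
    public static MultiDataSet buildMulti() {
        // Two separate inputs, e.g. an image embedding and a text embedding
        INDArray imageFeatures = Nd4j.rand(8, 128);
        INDArray textFeatures  = Nd4j.rand(8, 64);
        INDArray labels        = Nd4j.rand(8, 2);
        // MultiDataSet holds ARRAYS of feature/label INDArrays - one per input/output
        return new MultiDataSet(new INDArray[]{imageFeatures, textFeatures},
                                new INDArray[]{labels});
    }

    public static void main(String[] args) {
        // DataSet, by construction, holds exactly one features array and one labels array
        DataSet single = new DataSet(Nd4j.rand(8, 5), Nd4j.rand(8, 2));
        System.out.println(buildMulti().numFeatureArrays()); // 2 inputs
    }
}
```

The single-input `DataSet`/`DataSetIterator` pair is the one to reach for unless the network genuinely has multiple input or output arrays.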
I have been reading about autoencoders. Specifically, I have read a few
times this:
I thought the approach it describes might be applicable to my problem
because of the input data (latitude, longitude, timestamp).
As I read it a few times over, I wasn’t so sure. Specifically, it’s not
at all clear what the purpose/output of the autoencoder whose
implementation is described in the tutorial is. I found this sentence
“we can train a network to compress a trajectory of a ship using a
seq2seq autoencoder,…”
particularly puzzling. What’s the value of a network whose output is a
“compressed trajectory”? What is actually meant by a “compressed trajectory”?
What does the network do to “compress” trajectories, and how does it do it?
Perhaps I missed the part of the tutorial where the above is expanded
upon. Perhaps I should read about basic autoencoders. Not sure.
@adonnini that’s just for using a neural net’s representation of data (think of it as a compressed state). You can then use that in clustering or as part of a classifier. Usually what’s called representation learning here is used to replace some sort of manual feature engineering. For your 5 columns, though, it’s not really worth it.
I think I got the answer to my question about the purpose/goal of the
network described in the advanced autoencoder tutorial by reading the
basic autoencoder documentation, where it states:
“In deep learning, an autoencoder is a neural network that “attempts” to
reconstruct its input. It can serve as a form of feature extraction, and
autoencoders can be stacked to create “deep” networks. Features
generated by an autoencoder can be fed into other algorithms for
classification, clustering, and anomaly detection.”
So, they are not something I can use for my application.
Thanks. I understand. I found a good explanation in the simple
autoencoder documentation page.
I have not been able to find a way to take the output produced by the
network and “translate” it into longitude/latitude pair geohashes. Any
pointers? If I cannot understand the output, I can’t figure out
where I may be going wrong in my input data processing and in my network
definition.
Do you think it would be worthwhile adding the UI to my network
implementation?
@adonnini are you trying to use all 3 columns as inputs and outputs? Usually what you do with regression-based neural networks is the network outputs normalized values that you then “restore” using your normalizer.
I’m not sure if geohash will work very well with normalization though.
No, I am using the columns in the feature files as my independent
variables and the columns in the label files as my dependent variables.
Perhaps I am making a mistake. What do you think? Why would I want to
have the dependent variable in the same file as the independent variables?
I know that that is one possible approach. I preferred keeping
dependent and independent variables separate.
With regards to restoring the output of the network, I should simply use
the normalizer I used when setting up the network, right?
After a lot of work in the DMs we’ve come up with a working solution:
Use NormalizerStandardize on the input data. This gives the network a better-conditioned data set to learn from.
Use lat/long for predictions. Geohashes didn’t work as well.
Use a simple LSTM architecture for the sequence prediction.
Ensure that we allow more than 1 label. Before, we had a misunderstanding that LSTMs only allow 1 label for prediction, when that is really just a shortcut for most classification problems.
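A minimal sketch of that setup (layer sizes and hyperparameters are illustrative, not the values we settled on; assumes DL4J/ND4J on the classpath):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class LatLongLstmSketch {
    public static MultiLayerNetwork buildNet(int numFeatures, int numLabels) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .updater(new Adam(1e-3))
                .list()
                // simple single-LSTM architecture for sequence prediction
                .layer(0, new LSTM.Builder().nIn(numFeatures).nOut(64)
                        .activation(Activation.TANH).build())
                // regression output: MSE loss, identity activation, >1 label (lat + long)
                .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                        .nIn(64).nOut(numLabels)
                        .activation(Activation.IDENTITY).build())
                .build();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        return net;
    }

    public static void main(String[] args) {
        MultiLayerNetwork net = buildNet(5, 2); // e.g. 5 feature columns, 2 labels

        // zero-mean/unit-variance normalization on features AND labels
        NormalizerStandardize normalizer = new NormalizerStandardize();
        normalizer.fitLabel(true);
        // normalizer.fit(trainIter); trainIter.setPreProcessor(normalizer);
        // after inference, normalizer.revertLabels(predictions) restores the lat/long scale
    }
}
```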
I am running (again) into apk size issues. I added datavec-api,
deeplearning4j-datasets and deeplearning4j-core (see below for details)
to my app on Android. Size ballooned from ~471 MB to more than 1.6 GB!
Gradle and maven both have the concept of transitive dependencies. Any developer using these tools should know what that is. If you look at the transitive dependencies of those things you’ll see deeplearning4j-core just brings in computer vision based dependencies. That means that you’ll be bringing in dependencies for every platform. There’s no reason for you to do that.
Also, please don’t bring in deeplearning4j-datasets and the like. It’s not needed in any production app. I’m not sure why you added that but there’s no reason to add anything you did.