We are working on improving the documentation, and we’d like to collect questions that you want us to address.
So if you have any questions that the documentation isn’t addressing currently, or that you find aren’t adequately addressed, please list them here, so we can take them into consideration when creating the new docs.
Here are some questions that I had trouble answering just by reading the DL4J docs and that may need some clarification (or that I don’t fully get because I just started learning DL4J):
DL4J IN GENERAL
If my data are not in CSV files, what is the best way to build record readers (and sequence record readers)? Let’s suppose they come from somewhere else (a database, an HTTP service…); in the end I would have data that are lists of numbers (Integers, Doubles, BigDecimals…). For classification I need to build datasets where each element is a list [feature; label], and for regression I need to build datasets from “ordered lists” where each element is itself a list that contains [nth_element; nth+1_element]. Using InMemoryRecordReader could be a solution for small datasets, but for large datasets I can’t load all the data in memory.
If I can’t load all the data in memory (and usually I can’t with a big dataset), I can create a class that implements RecordReader or SequenceRecordReader and implement their respective methods List<Writable> next() and List<List<Writable>> sequenceRecord(). In a “naive” approach, I can implement them to produce lists of [feature; label] (or lists of lists of [feature; label], where in regression “feature” is the nth element and “label” is the (n+1)th element) and pass my reader as a parameter to a record reader dataset iterator (for example SequenceRecordReaderDataSetIterator(SequenceRecordReader reader, int miniBatchSize, int numPossibleLabels, int labelIndex, boolean regression)), but in this way I lose some “benefits” that the already implemented record readers support (batch processing, distributed learning, alignment of variable length time series…). Maybe there could be more examples of building custom record readers.
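To make the “naive” approach above concrete, here is a minimal sketch in plain Java of the underlying idea only: turning any lazily read source of numbers into [nth; (n+1)th] pairs while keeping a single value in memory. It deliberately does not implement the DataVec RecordReader interface; the class name NextStepPairs is hypothetical, and the in-memory list in main just stands in for a database cursor or HTTP stream.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Lazily turns a stream of values into [nth; (n+1)th] pairs without buffering the whole source. */
public class NextStepPairs implements Iterator<double[]> {
    private final Iterator<Double> source; // assumption: could be backed by a DB cursor or HTTP stream
    private Double previous;               // the only value kept in memory between calls

    public NextStepPairs(Iterator<Double> source) {
        this.source = source;
        this.previous = source.hasNext() ? source.next() : null;
    }

    @Override
    public boolean hasNext() {
        // A pair needs one value already read plus one more value from the source.
        return previous != null && source.hasNext();
    }

    @Override
    public double[] next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more [feature; label] pairs");
        }
        double current = source.next();
        double[] pair = {previous, current}; // [feature; label]
        previous = current;
        return pair;
    }

    public static void main(String[] args) {
        // Stand-in for a real data source.
        List<Double> values = Arrays.asList(1.0, 2.0, 3.0, 4.0);
        Iterator<double[]> pairs = new NextStepPairs(values.iterator());
        while (pairs.hasNext()) {
            System.out.println(Arrays.toString(pairs.next()));
        }
    }
}
```

A custom reader would presumably wrap the same logic and convert each pair into the Writable types that DataVec expects.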
If my “elements” are complex types (e.g. classes that contain multiple fields), what is the best way to use DataVec ETL to transform a list (or multiple lists) of these elements into a usable DataSet? Especially for classification problems, what is the way to choose parameters (for example when I don’t know the number of classes a priori, so I cannot estimate them or run other algorithms like K-means to get a rough starting point)?
Once my model is trained and saved (maybe in a file), what is the best way to just “use it in a black box way”, e.g. pass an input to the network and get the predicted output in response? (For example: I trained an image classifier, I download an image from the internet, and I expect the network to classify it, outputting a probability vector that I can use to print a single string, or maybe the top 3 predicted classes, and so on.) Should I “reinitialize” the network with some training data each time and then pass my new input? Especially for regression with RNNs, some more examples would be very helpful.
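For the last step of that scenario (going from a probability vector to the top 3 predicted classes), a minimal sketch in plain Java follows. It assumes the network output has already been copied into a double[]; the class name TopKLabels, the labels and the probabilities are all made up for illustration, and loading the saved model is not shown.

```java
import java.util.ArrayList;
import java.util.List;

public class TopKLabels {
    /** Returns the k labels with the highest probability, most likely first. */
    public static List<String> topK(double[] probabilities, String[] labels, int k) {
        if (probabilities.length != labels.length) {
            throw new IllegalArgumentException("one label per probability is required");
        }
        // Sort the class indices by descending probability.
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < probabilities.length; i++) {
            indices.add(i);
        }
        indices.sort((a, b) -> Double.compare(probabilities[b], probabilities[a]));

        // Keep the labels of the first k indices (or fewer if there are not enough classes).
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, indices.size()); i++) {
            result.add(labels[indices.get(i)]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical output vector of a 5-class image classifier.
        double[] probabilities = {0.05, 0.70, 0.20, 0.03, 0.02};
        String[] labels = {"cat", "dog", "fox", "owl", "bee"};
        System.out.println(topK(probabilities, labels, 3)); // prints [dog, fox, cat]
    }
}
```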
RNNs
When using a SequenceRecordReaderDataSetIterator that produces multiple time series of variable length, what is the best way to normalize them? I tried using two record readers, one for the features and one for the labels, with SequenceRecordReaderDataSetIterator(SequenceRecordReader featuresReader, SequenceRecordReader labels, int miniBatchSize, int numPossibleLabels, boolean regression, AlignmentMode alignmentMode), passing AlignmentMode.ALIGN_END. Is this the right way? Especially in the case of custom record readers, some more examples would be great.
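As a rough illustration of what aligning at the end and masking mean for variable length series, here is a sketch in plain Java. It is not DL4J’s implementation: the class AlignEndSketch and its methods are hypothetical, it handles a single feature per time step, and the mask-aware mean only hints at how normalization statistics could ignore the padding.

```java
import java.util.Arrays;

public class AlignEndSketch {
    private static int maxLength(double[][] sequences) {
        int max = 0;
        for (double[] s : sequences) {
            max = Math.max(max, s.length);
        }
        return max;
    }

    /** Right-aligns every sequence to the longest one, zero-padding at the start. */
    public static double[][] padToEnd(double[][] sequences) {
        int max = maxLength(sequences);
        double[][] padded = new double[sequences.length][max];
        for (int i = 0; i < sequences.length; i++) {
            int offset = max - sequences[i].length;
            System.arraycopy(sequences[i], 0, padded[i], offset, sequences[i].length);
        }
        return padded;
    }

    /** Mask with 1.0 where a real value is present and 0.0 where padding was added. */
    public static double[][] maskForEnd(double[][] sequences) {
        int max = maxLength(sequences);
        double[][] mask = new double[sequences.length][max];
        for (int i = 0; i < sequences.length; i++) {
            Arrays.fill(mask[i], max - sequences[i].length, max, 1.0);
        }
        return mask;
    }

    /** Mean over real values only, so that padding does not distort the statistic. */
    public static double maskedMean(double[][] padded, double[][] mask) {
        double sum = 0.0;
        double count = 0.0;
        for (int i = 0; i < padded.length; i++) {
            for (int t = 0; t < padded[i].length; t++) {
                sum += padded[i][t] * mask[i][t];
                count += mask[i][t];
            }
        }
        return sum / count;
    }

    public static void main(String[] args) {
        double[][] sequences = {{1, 2, 3, 4}, {5, 6}};
        double[][] padded = padToEnd(sequences);
        double[][] mask = maskForEnd(sequences);
        System.out.println(Arrays.deepToString(padded)); // [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 5.0, 6.0]]
        System.out.println(Arrays.deepToString(mask));   // [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
        System.out.println(maskedMean(padded, mask));    // 3.5
    }
}
```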
Based on the examples SingleTimestepRegressionExample and MultiTimestepRegressionExample, when I train an RNN for regression, at the end of each epoch I should use a RegressionEvaluation to evaluate the differences between the predicted data and the “real” test data, and use the result to further train the network. This is fine, but I can’t see a way to “automate” this process using an EarlyStoppingTrainer. If this is possible, could there be an example for it?
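The kind of automation meant here can be sketched in plain Java as a patience-based loop: train one epoch, compute a test error, and stop once the error has not improved for a few epochs. This is only the generic idea, not the EarlyStoppingTrainer API; the class EarlyStoppingSketch, the trainEpochAndScore callback and the error values are assumptions made for illustration.

```java
import java.util.function.IntToDoubleFunction;

public class EarlyStoppingSketch {
    /**
     * Runs up to maxEpochs epochs and returns the 0-based epoch with the lowest error.
     * Training stops early once `patience` consecutive epochs bring no improvement.
     *
     * @param trainEpochAndScore hypothetical callback: trains one epoch, returns the test error (e.g. MSE)
     */
    public static int bestEpoch(IntToDoubleFunction trainEpochAndScore, int maxEpochs, int patience) {
        double bestScore = Double.POSITIVE_INFINITY;
        int bestEpoch = -1;
        int epochsWithoutImprovement = 0;

        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            double score = trainEpochAndScore.applyAsDouble(epoch);
            if (score < bestScore) {
                // New best model: in a real setup this is where the model would be saved.
                bestScore = score;
                bestEpoch = epoch;
                epochsWithoutImprovement = 0;
            } else if (++epochsWithoutImprovement >= patience) {
                break; // no improvement for `patience` epochs in a row
            }
        }
        return bestEpoch;
    }

    public static void main(String[] args) {
        // Made-up per-epoch test errors standing in for a real training run.
        double[] mse = {0.9, 0.5, 0.4, 0.45, 0.41, 0.6, 0.1};
        int best = bestEpoch(epoch -> mse[epoch], mse.length, 2);
        System.out.println("best epoch: " + best); // prints "best epoch: 2"
    }
}
```

With a patience of 2 the loop stops after epoch 4 and never reaches the lower error at epoch 6, which is the usual trade-off when choosing the patience value.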
Lastly (though this is a matter of personal preference), I think the tutorials could be organized more clearly to introduce the user to the “DL4J workflow” (“point” at the dataset → extract and transform the fields → normalize inputs → choose what kind of problem to solve → build the corresponding network architecture and configure it properly → train the network → evaluate results). Instead of using the historical “built-in” dataset iterators (MNIST, Iris and so on), using the custom record readers from before could be more helpful in understanding the DL4J-specific concepts. Then, after the tutorials, there could be sections that go deep (no pun intended) into more technical details: common types of layers (FF, CNN and RNN) and their “subtypes”, and the different activation and loss functions with a description of their purposes, advantages and disadvantages. After this, present the Computational Graph, which to me is more of an “advanced” concept, and then go full-blown into the details of ND4J, DataVec and SameDiff that new users are usually not interested in.
For some of these things it may be sufficient to provide links to external sources (e.g. Wikipedia for activation functions).
Just as a suggestion, the tutorials could have a “common thread”: for example, a data center that has S servers, each with C CPUs, where every hour each server produces a report of load average per CPU (L) and users connected (U). From that, the tutorials can start to introduce gradually more complex problems:
1. Classify servers for which I have no history (I have it, but I use it as test data) into classes like “overused”, “underused” or “normal load”, based on the full history of the other servers
2. Predict whether a server is going to be “overused” in the next T hours based on the last few reports
3. Predict how many servers are going to be “overused” in the next T hours based on the last few reports from those servers
4. Do step 3, but supposing that the servers are commissioned at different times so their report sequences have variable length
5. Predict the load of a server (the network output is not a class but the actual number) for the next T hours based on the history of the last N hours
And so on…
The data center is just an example; any scenario could be used (shipping company, stock prices, diseases in a country…).
Sorry for the long post; these are just my thoughts, and I hope they are useful for discussion.
RL4J
1. More docs about RL; some introduction to RL could be added to the tutorials
2. An example of how to define a custom RL problem using RL4J (including defining the MDP, action space, …)
@DarioArena87
Thank you for the list. This addresses many of the difficulties I am facing in adopting DL4J.
For example, if I am reading the data from remote web services or a NoSQL DB, do I need to export them to CSV files first? Additionally, the shape of the data for each model is not very clear. For example, https://deeplearning4j.konduit.ai/models/recurrent describes the shapes for both feedforward networks and RNNs, but it doesn’t show any example of what is meant by “example”, “feature”, “timeStep”, etc.
What does [numExamples,inputSize,timeSeriesLength] mean? A simple table with some data would help a lot in understanding what is going on.
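As a rough illustration of such a table, the sketch below lays out made-up data as a plain Java array indexed as [numExamples][inputSize][timeSeriesLength]. It borrows the data center scenario from earlier in the thread: each example is one server’s sequence, each feature is one measured quantity (load average, connected users), and each time step is one hourly report. The class RnnShapeSketch and all values are hypothetical.

```java
import java.util.Arrays;

public class RnnShapeSketch {
    /** Returns {numExamples, inputSize, timeSeriesLength} of a rectangular 3-D array. */
    public static int[] shape(double[][][] data) {
        return new int[] {data.length, data[0].length, data[0][0].length};
    }

    /** Made-up hourly reports: 2 servers (examples), 2 features, 3 time steps. */
    public static double[][][] exampleData() {
        return new double[][][] {
            {   // example 0: server A
                {0.2, 0.4, 0.9},   // feature 0: load average at hours 0, 1, 2
                {10, 12, 30}       // feature 1: connected users at hours 0, 1, 2
            },
            {   // example 1: server B
                {0.1, 0.1, 0.2},
                {3, 4, 5}
            }
        };
    }

    public static void main(String[] args) {
        double[][][] features = exampleData();
        System.out.println(Arrays.toString(shape(features))); // prints [2, 2, 3]
        // Index order is [example][feature][timeStep]:
        System.out.println(features[0][1][2]); // users on server A at hour 2, prints 30.0
    }
}
```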
Hey folks, JFYI: we’ll be factoring all of this feedback into the redo of the website. It’ll take a bit to get all of this processed (there’s still a bit of backlog to do with respect to the release notes, among other things), but we’re about there.
Thanks again for being patient as we get things on track here.
Appreciate the feedback: we’ll try to break the docs up into sections based on use case and platform so the advanced info isn’t the only thing up there. If people have any other suggestions, please let us know. We will have to overhaul the docs in phases.
A tutorial on building a model from scratch, including shaping and reshaping the training data. Currently, shaping the data feels like magic, with many files involved.
We’re waiting a day or two before switching the default. Please let us know if you would like something added. Otherwise, we’ll be adding new tutorials and other things over the next week or so to cover the new features.
The link would have just taken you a few paragraphs below. I’ve fixed it now.
I’m a bit annoyed that these kinds of simple same-page links still got broken in the last documentation reorganization, so I fully appreciate it when people point out broken links so we can fix them.