Quickstart using GPS trajectories file from UCI

@adonnini that’s not how we label things. Each label needs to be one per line; then you specify a label column index.
If you do that, you can use the normal dataset iterators if you want to move away from UCI. Otherwise you still have to make sure that your data has one row per example with features + label.

With modeling I mean the features + label (the label you’re trying to predict). Since you’ve been a bit all over the place and have just been following tutorials, it doesn’t appear to me that you’ve picked a doable problem to work on and made sure you can make it work.

To add to my confusion, when reading the documentation you sent me the
link to:

it states:

"
When importing time series data using the class CSVSequenceRecordReader
each line in the data files represents one time step with the earliest
time series observation in the first row (or first row after header if
present) and the most recent observation in the last row of the csv.
Each feature time series is a separate column of the csv file.
For example if you have five features in time series, each with 120
observations, and a training & test set of size 53 then there will be
106 csv files (53 input, 53 labels). The 53 input csv files will
each have five columns and 120 rows. The label csv files will have one
column (the label) and one row.
"
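To make that concrete, here is a hypothetical sketch of one sequence laid out as the quoted documentation describes (file names and values are made up; here latitude, longitude, and time serve as three feature columns, and the label file holds a single value for the whole sequence):

```
features_0.csv        (one time step per row, one feature per column)
46.0023,11.1210,1633072800
46.0031,11.1224,1633072860
46.0040,11.1239,1633072920

labels_0.csv          (one row, one column: the label for sequence 0)
17
```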
My data is structured as described above.

Should I not use CSVSequenceRecordReader? If not, what should I use to
import the data?

Sorry about this. It would be really helpful if you could send me an
example of what a row in my feature files and in my label files should
look like, given the column headings (variables) I have
(id,latitude,longitude,track_id,time).

Thank you

I have re-read your message below. On second reading it is clearer.

  1. Are you saying that the label is the variable I am trying to predict?
    In my case, then, it would be latitude and longitude, which I can combine
    using geohash (or OpenLocationCode) into a single variable

  2. Should I not have labels and features in separate files?

  3. Should I move away from UCI? Is there a good reason?

  4. Where can I find a glossary of the terms used in DL4J? Specifically,
    I am looking for a formal definition of label. It’s clear to me that I
    do not quite understand what it means in the DL4J world

  5. It would still be very useful and a BIG time saver if you could send
    me a sample entry for a feature file and a label file using a time
    series record entry sample with my record structure
    (id,latitude,longitude,track_id,time).

  6. Why do you think that the problem of predicting the next location is
    not doable? Is it a question of how I have formulated the problem? A
    problem with my time series data?

Thanks

@adonnini

  1. Yes. That needs to be clear before you even write code. That is what I mentioned was a regression problem. If you want to try classification as a starting point, go ahead, but be clear you are doing that and do one problem at a time. It’s hard to track your code and questions/problems otherwise.

  2. It depends on your dataset iterator. Our default general-purpose iterator works off of column
    indices. The main idea you should have either way is one or more labels per line.

  3. UCI isn’t very general purpose. If you want to continue with it, go ahead; just be clear what you’re doing. You’ll also be responsible for knowing how to customize it.

  4. There’s various documentation out there but there’s a certain point where you just need to know machine learning. As long as you are clear on your problem first (which is what I keep trying to get you to do) I can give you the terminology you need + references. It’s just been hard to help you when you are jumping between different problems with random stack traces and being unclear what you do/don’t know. For now stick to the order I’m telling you:
    define your problem
    understand what your input data points and labels are
    learn how to convert your data for input into a neural network
    learn how to understand what a neural network outputs (e.g. a floating-point number or a class label, depending on the objective)

  5. I can point you in the right direction but please understand that randomly hacking a sample without understanding what it’s doing just makes you harder to help. You’re trying to fly a plane before you can walk. Please follow my instructions above and pick a problem and way of converting your input data.

  6. It’s not necessarily the case. You have just been all over the place and don’t seem to be picking up on my recommendations. I already asked you above to pick a regression problem to work on if that’s what you want to do, and there’s no indication you’ve done so.

Lastly, please consider posting your code and I’d be happy to take a quick clone and run it when I get some time. Otherwise I’m stuck guessing what you do/don’t know.

Thanks for your patience and response. Let me try and address the points
you outlined as my next steps.

Your roadmap:

define your problem

FYI. Progress. Changing the label from track id to a lat/lon hash code
expressed as a long made a significant difference. Now, the model is
producing relatively/somewhat meaningful results.
Still a lot of work to do especially on features.

@adonnini great job! It sounds like after you slowed down a bit you were able to assemble something coherent. Thanks for listening and I hope it all goes well.

Thank you Adam. I appreciated your support and ideas greatly. They were
a great help.

I hope you won’t mind. I have a couple of questions:

  1. Is the information on this page still current and valid?

especially the section on saving and loading a network

  2. Perhaps the documentation covers this question. I could not find any
    info on it.
    When I run my network on an Android device I need to feed it input.

Is it as simple as changing the code to update the location of the input
files? Doing this would mean that running the network on a desktop would
fail (the file locations being on the Android device), and saving the
network would need to happen before running it.

As you can see, I am confused as to how you give the network input when
it runs on a device keeping in mind that the network will be running not
in real-time but frequently.

  3. If my understanding is correct, in a regression network the labels
    are the dependent variables. In training and testing the network, I
    loaded each track point (geohash in long format) as a label, which
    technically is not wrong. However, the track points were obtained from
    GPS observations, not through regression. I don’t have a way to produce
    next locations as I don’t have the regression formula used by the
    network. What do you think? Where is the flaw in what I am
    thinking/saying about this question of the labels/dependent variables
    and the regression formula?

Thanks,

Alex

@adonnini

  1. Yeah just update the versions there.

  2. Regarding the files, an inference pipeline can vary. Real-time pipelines usually come from sensor input. If so, you’ll need to write that code yourself. You should try to do something more efficient than CSV for this, though: something like sensor → in-memory ndarray → input → output.

  3. I would focus on being able to predict raw trajectory or direction + distance instead. That should be coordinate invariant. You can start from a base set of coordinates, get the true target and subtract that to get the distance. It should be able to learn a trend that you can apply relative to your variables.
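To illustrate the coordinate-invariant idea in the last point, here is a minimal plain-Java sketch (class and method names are mine, not DL4J API) that converts absolute track points into per-step deltas usable as regression targets, and adds a predicted delta back onto the last known point to recover an absolute next-location estimate:

```java
import java.util.Arrays;

// Sketch: turn absolute (lat, lon) track points into per-step deltas,
// which are coordinate-invariant regression targets.
public class TrajectoryDeltas {

    // points[i] = {lat, lon}; returns deltas[i] = points[i+1] - points[i]
    public static double[][] toDeltas(double[][] points) {
        double[][] deltas = new double[points.length - 1][2];
        for (int i = 0; i < deltas.length; i++) {
            deltas[i][0] = points[i + 1][0] - points[i][0];
            deltas[i][1] = points[i + 1][1] - points[i][1];
        }
        return deltas;
    }

    // Apply a predicted delta to the last known point to get
    // an absolute next-location estimate.
    public static double[] applyDelta(double[] lastPoint, double[] predictedDelta) {
        return new double[] {lastPoint[0] + predictedDelta[0],
                             lastPoint[1] + predictedDelta[1]};
    }

    public static void main(String[] args) {
        double[][] track = {{46.000, 11.100}, {46.002, 11.103}, {46.005, 11.107}};
        double[][] deltas = toDeltas(track);
        System.out.println(Arrays.deepToString(deltas));
        System.out.println(Arrays.toString(applyDelta(track[2], deltas[1])));
    }
}
```

The network would then be trained on the delta sequences rather than raw coordinates, so a learned trend can be reapplied from any starting point.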

Thanks.

I have already implemented “real-time” processing of blockchain
transactions and execution of my own “next location” algorithm. I’ll use
the same approach for the network. All in-memory.

Based on the documentation and a look at the source code, the input for
the network has to be an ndarray. This means I need to be able to create
one on Android. I’ll look for a library that lets me do that on Android.
Does this sound right?

You are right, initially I should focus on predicting raw trajectory.
While knowing the target point in advance is mostly not possible, the
target location is ultimately what I would like the network to predict.

@adonnini just be careful and make sure you load the network once and keep it cached; otherwise you might be bottlenecked by disk.

Regarding the ndarray: that’s already in the dl4j suite. You’ve been using it the whole time. You can call Nd4j.create etc. just like you do on desktop.

If you have a CSV-based pipeline, just create a class that encapsulates the input data from a sensor and converts it to an ndarray.
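As a sketch of such an encapsulating class (plain Java, all names are mine): it buffers the most recent sensor readings in memory and flattens them feature-major, so the result could then be handed to something like Nd4j.create(flat, new int[]{1, numFeatures, timeSteps}) for a recurrent-network input. The exact shape and ordering depend on your network configuration, so treat this as an assumption to verify:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: buffer recent sensor readings in memory and expose them as a
// flat array ready to wrap in an ndarray (names are hypothetical).
public class SensorWindow {
    private final int timeSteps;    // window length
    private final int numFeatures;  // e.g. lat, lon, time
    private final Deque<double[]> buffer = new ArrayDeque<>();

    public SensorWindow(int timeSteps, int numFeatures) {
        this.timeSteps = timeSteps;
        this.numFeatures = numFeatures;
    }

    // Called from the sensor callback; keeps only the newest timeSteps readings.
    public void add(double... reading) {
        if (reading.length != numFeatures) throw new IllegalArgumentException();
        buffer.addLast(reading.clone());
        if (buffer.size() > timeSteps) buffer.removeFirst();
    }

    public boolean isFull() { return buffer.size() == timeSteps; }

    // Flatten to [feature][timeStep] (feature-major) order, matching a
    // [1, numFeatures, timeSteps] sequence-input layout.
    public double[] toFlatArray() {
        double[] flat = new double[numFeatures * timeSteps];
        int t = 0;
        for (double[] reading : buffer) {
            for (int f = 0; f < numFeatures; f++) {
                flat[f * timeSteps + t] = reading[f];
            }
            t++;
        }
        return flat;
    }

    public static void main(String[] args) {
        SensorWindow w = new SensorWindow(2, 3);
        w.add(1.0, 2.0, 3.0);
        w.add(4.0, 5.0, 6.0);
        w.add(7.0, 8.0, 9.0); // oldest reading evicted
        System.out.println(java.util.Arrays.toString(w.toFlatArray()));
    }
}
```

This keeps everything in memory (no CSV on the device), which matches the sensor → in-memory ndarray → input → output pipeline suggested above.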

Hi,
I think I may have hit a show stopper. I added all the dependencies
required to run dl4j on Android. After a few hours of work (unusually
long for adding dependencies), I was finally able to build the
application successfully.
The size of the apk blew up to 1.8GB!!!
Not surprisingly, when I tried to install it on the device it failed.
I am pretty much stuck at this point. As I am sure you agree, a 1.8 GB
apk is not acceptable.
From a quick review of the dependencies, it’s pretty clear that adding
all of these (below) causes the problem, coupled with the support of
“all” platforms under the sun (windows, android, x86, linux, arm,
arm64,…). I cannot believe that ALL-inclusive dependencies are really
necessary. Why do I need windows and macos dependencies on an Android
device?

As I think you can guess, I am pretty frustrated by packaging choices
which, in my opinion, do not reflect the reality of the Android world.
Unless I did something wrong, I would say that it is not correct to say
that dl4j supports Android. All my work of the past weeks appears to
have been wasted.

By the way, please keep in mind that my app does not just run a neural
network. And you could not realistically expect me to remove all the
other functions to enable use of dl4j.

And, removing dependencies I do not need on Android devices is
impossible given how dl4j is packaged.

Any ideas for resolving this problem would be greatly appreciated.

Thank you,

Alex

implementation (group: 'org.deeplearning4j', name: 'deeplearning4j-core', version: '{{page.version}}') {
    exclude group: 'org.bytedeco', module: 'opencv-platform'
    exclude group: 'org.bytedeco', module: 'leptonica-platform'
    exclude group: 'org.bytedeco', module: 'hdf5-platform'
}
implementation group: 'org.nd4j', name: 'nd4j-native', version: '1.0.0-beta6'
implementation group: 'org.nd4j', name: 'nd4j-native', version: '1.0.0-beta6', classifier: "android-arm"
implementation group: 'org.nd4j', name: 'nd4j-native', version: '1.0.0-beta6', classifier: "android-arm64"
implementation group: 'org.nd4j', name: 'nd4j-native', version: '1.0.0-beta6', classifier: "android-x86"
implementation group: 'org.nd4j', name: 'nd4j-native', version: '1.0.0-beta6', classifier: "android-x86_64"
implementation group: 'org.bytedeco', name: 'openblas', version: '0.3.7-1.5.2'
implementation group: 'org.bytedeco', name: 'openblas', version: '0.3.7-1.5.2', classifier: "android-arm"
implementation group: 'org.bytedeco', name: 'openblas', version: '0.3.7-1.5.2', classifier: "android-arm64"
implementation group: 'org.bytedeco', name: 'openblas', version: '0.3.7-1.5.2', classifier: "android-x86"
implementation group: 'org.bytedeco', name: 'openblas', version: '0.3.7-1.5.2', classifier: "android-x86_64"
implementation group: 'org.bytedeco', name: 'opencv', version: '4.1.2-1.5.2'
implementation group: 'org.bytedeco', name: 'opencv', version: '4.1.2-1.5.2', classifier: "android-arm"
implementation group: 'org.bytedeco', name: 'opencv', version: '4.1.2-1.5.2', classifier: "android-arm64"
implementation group: 'org.bytedeco', name: 'opencv', version: '4.1.2-1.5.2', classifier: "android-x86"
implementation group: 'org.bytedeco', name: 'opencv', version: '4.1.2-1.5.2', classifier: "android-x86_64"
implementation group: 'org.bytedeco', name: 'leptonica', version: '1.78.0-1.5.2'
implementation group: 'org.bytedeco', name: 'leptonica', version: '1.78.0-1.5.2', classifier: "android-arm"
implementation group: 'org.bytedeco', name: 'leptonica', version: '1.78.0-1.5.2', classifier: "android-arm64"
implementation group: 'org.bytedeco', name: 'leptonica', version: '1.78.0-1.5.2', classifier: "android-x86"
implementation group: 'org.bytedeco', name: 'leptonica', version: '1.78.0-1.5.2', classifier: "android-x86_64"
testImplementation 'junit:junit:4.12'

@adonnini you’re using 2-year-old dependencies. I think I already mentioned not to copy and paste from that page and to update your versions. I’d like to point out that your training version and your deploy version should be the same.

Also, no one’s asking you to strip out all your dependencies just to use dl4j. Please calm down for a second and let’s step through this. I’m aware it’s a bit overwhelming, but try to work from basics first.

What size does your APK need to be? What dependencies do you need?

For example, you don’t need deeplearning4j-core. You can just use deeplearning4j-nn and nd4j-native.

Looking at this: because you copied and pasted without asking, you pasted not only 2-year-old dependencies but also a bunch of extra computer vision dependencies that have nothing to do with your problem.

@adonnini to make this a bit more actionable try just using the relevant classifier. I doubt you need android-x86. You probably just need android-arm64 for 90% of your use cases.

Get rid of all the extra dependencies and use openblas and nd4j-native with the relevant classifiers. Ensure all the versions are up to date. You’ll want M2.1 (as in our examples, which you should be using):

Here’s an example build.gradle:

Please try to read this through. The basic example uses computer vision; you don’t need leptonica or
the other dependencies for that.
Use this to understand how to do the classifiers.

@adonnini I also wanted to add, in case you run into size issues using openblas: the next release (which should be out in a few weeks) also has a new nd4j-minimal backend which strips openblas out of your apk so you can use just what you need, which should help with the apk size as well. Let’s step through one bit at a time and make sure you can deploy this first.

Then we can try this as well if we still run into size issues.

Adam,

You continue to underestimate me.

I did read your messages. I did update versions of all dependencies.

I followed the docs. They claim that ALL dependencies in the list I sent
you (with updated versions) are required, and more depending on the type
of network I want to run.

Perhaps the document should be updated with the information you wrote
in your message below.

I have no idea which ones are really needed. You know dl4j. Which ones
do I need?

I did add

deeplearning4j-nn

So, are deeplearning4j-nn and nd4j-native the only ones I need?

Thank you,

Alex

@adonnini sorry but your build.gradles you’re posting don’t reflect that. I’ve asked multiple times for code or build files directly from your project and you haven’t shown me anything.

If you are indeed using beta6 and 1.5.2, though, then those are the old dependencies. It’s natural I’m going to focus on that.

Yeah you’re definitely right and I’ll spend some time on that apologies.

Your dependencies actually depend on the use case. For computer vision extra ones like opencv are indeed needed.

If you could I’d appreciate a post of your actual build.gradle so I can confirm that’s what you need.

As I mentioned above openblas, nd4j-native with the right classifiers and deeplearning4j-nn should be all you need if you are using a neural network.

Thank you.

“use openblas and nd4j-native with the relevant classifiers”: could you
give an example of a relevant classifier?

So, is this the list of dependencies (updated versions) I need?

  • deeplearning4j-nn

  • nd4j-native
    implementation group: 'org.nd4j', name: 'nd4j-native', version: '1.0.0-beta6', classifier: "android-arm64"

and

  • openblas
    implementation group: 'org.bytedeco', name: 'openblas', version: '0.3.7-1.5.2'
    implementation group: 'org.bytedeco', name: 'openblas', version: '0.3.7-1.5.2', classifier: "android-arm64"

Thank you,

Alex

@adonnini again, I already pointed out you’re using 2-year-old dependencies. Please use M2.1 and openblas/JavaCPP 1.5.8. That’s why I linked that build.gradle, so you could see our more up-to-date example there.

Always include both the non-classifier and the classifier-based dependencies. Classifiers contain platform-specific C++ code. The build.gradle I linked above shows you how to do that.

The rest should be correct.

@adonnini

def dl4jVersion = '1.0.0-M2.1'
def openblasVersion = '0.3.21-1.5.8'

implementation group: 'org.deeplearning4j', name: 'deeplearning4j-nn', version: dl4jVersion
implementation group: 'org.nd4j', name: 'nd4j-native', version: dl4jVersion
implementation group: 'org.nd4j', name: 'nd4j-native', version: dl4jVersion, classifier: "android-arm64"

implementation group: 'org.bytedeco', name: 'openblas', version: openblasVersion
implementation group: 'org.bytedeco', name: 'openblas', version: openblasVersion, classifier: "android-arm64"

This should do it for the updated versions. Please keep in mind, again, that classifiers such as android-arm64, android-arm, … might be needed depending on your use case.

For testing on an emulator you may also need android-x86 for your local testing but not for the deployed version.

I’m not familiar with all the details, but you should look into something like this: Build multiple APKs | Android Developers

This would allow you to set something up where you can split dependencies by platform and reduce the overall size of the apk for each platform.
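As a sketch of what such a setup could look like in a module-level build.gradle, using the Android Gradle plugin’s ABI splits (the included ABIs are examples; check which ones you actually target):

```groovy
android {
    splits {
        abi {
            // Build one APK per listed ABI instead of bundling all native code.
            enable true
            reset()
            include "arm64-v8a", "armeabi-v7a"
            // Skip the all-in-one universal APK to keep sizes small.
            universalApk false
        }
    }
}
```

Combined with using only the classifiers you need in the dependency list, this keeps each per-device APK much smaller than a single APK carrying every platform’s native libraries.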