I started playing around with deeplearning4j. I want to load a pretrained word2vec model (in Dutch) and do some tests. Loading the model (binary, 1 GB) took my system over two hours, which I presume is way too long?
This is the code (in Eclipse). The line with the invoke statement takes over two hours:
Your dependencies look weird. Why do you have an explicit dependency on nd4j-buffer? And why are you mixing CUDA versions?
Where are you reading the data from? What kind of storage is it?
Even though your pom.xml says beta6, are you entirely sure you aren't somehow on beta7? (You mentioned you tried to downgrade after seeing something like this: Beta 7 - Glove Word Vector, but Eclipse didn't properly pick up the change.)
Typically, loading a 1 GB binary takes about as long as it takes to read the file; two hours is obviously way too long.
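As a baseline for "as long as it takes to read the file", you can time a raw sequential read of the model file. This is a minimal sketch (not from the original post); the path is a placeholder for wherever your model file lives:

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadSpeed {
    // Reads the whole file once, prints the throughput, and returns the
    // number of bytes read -- a baseline for how long loading *should* take.
    static long timeRead(String path) throws IOException {
        long start = System.nanoTime();
        long total = 0;
        byte[] buf = new byte[1 << 20]; // 1 MiB buffer
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d bytes in %.2f s (%.1f MB/s)%n",
                total, seconds, total / 1e6 / seconds);
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path -- point this at the actual model file.
        timeRead(args.length > 0 ? args[0] : "model.bin");
    }
}
```

If the raw read finishes in seconds but the model load still takes hours, the time is going into parsing or into something else entirely, not into I/O.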
If you can, run your application with a profiler; that should also shed some light on why loading takes so long.
Thanks a lot for looking into my situation. My answers:
What do you mean by 'reflection'? And where am I using it?
You're right about the mixed CUDA versions. I checked my versions and took 10.1 out. I also deleted the nd4j-buffer dependency.
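For reference, a consistent beta6 setup usually needs only a small set of coordinates, all on the same version. This is a sketch of what the cleaned-up dependencies might look like (the choice of the CPU backend here is an assumption; swap in a CUDA backend artifact if you want GPU):

```xml
<!-- Sketch of a consistent 1.0.0-beta6 setup: one DL4J module plus one
     ND4J backend, same version everywhere. nd4j-buffer comes in
     transitively and should not be declared directly. -->
<dependencies>
  <dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-nlp</artifactId>
    <version>1.0.0-beta6</version>
  </dependency>
  <dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native-platform</artifactId>
    <version>1.0.0-beta6</version>
  </dependency>
</dependencies>
```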
I’m reading from a file that I downloaded from http://vectors.nlpl.eu/repository/ (I tested both the Dutch .bin and .txt downloads; same long loading time). I assumed these files can also be used with deeplearning4j. If not: do you know of a Dutch model that is suitable? I understand that the Google News model is English only.
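One quick, DL4J-independent sanity check on the .bin download: the word2vec C binary format starts with an ASCII header line, `<vocabSize> <vectorSize>\n`, before the raw float vectors. A small sketch (the file name is a placeholder) that reads just that header:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class Word2VecHeader {
    // Reads the ASCII header "vocabSize vectorSize\n" from a word2vec
    // C-format binary file and returns {vocabSize, vectorSize}.
    static long[] readHeader(String path) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            StringBuilder sb = new StringBuilder();
            int b;
            while ((b = in.read()) != -1 && b != '\n') {
                sb.append((char) b);
            }
            String[] parts = sb.toString().trim().split("\\s+");
            if (parts.length != 2) {
                throw new IOException("Not a word2vec binary header: " + sb);
            }
            return new long[] { Long.parseLong(parts[0]), Long.parseLong(parts[1]) };
        }
    }

    public static void main(String[] args) throws IOException {
        long[] h = readHeader(args.length > 0 ? args[0] : "model.bin");
        System.out.println(h[0] + " words, " + h[1] + " dimensions");
    }
}
```

If the header does not parse as two integers, the file is probably not a plain word2vec binary (it might, for example, still be a packed archive), which would be worth checking before blaming the loader.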
How can I check which version I’m actually on?
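Two ways come to mind. From the build side, `mvn dependency:tree` lists the versions Maven actually resolved. From inside the running program, you can ask a class where it was loaded from, which tells you which jar made it onto the runtime classpath regardless of what the pom claims. A small sketch (pass any class you care about, e.g. a DL4J class, instead of the examples below):

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, so you can
    // see which version of a library is actually on the runtime classpath.
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap class path have no code source.
        return src == null ? "(bootstrap class path)" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println(locationOf(WhichJar.class)); // this class's own location
        System.out.println(locationOf(String.class));   // typically "(bootstrap class path)"
    }
}
```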
NB. I also tested this line for loading the model, with the same effect: