I’m using the widely known Google word-vector model for sentiment analysis, together with a CNN.
It took a lot of effort, but it works fine now. For unit testing, however, I’d like to use only a small subset of that file. I tried creating one with gensim, but the resulting file was not correct: the number of words and the vector size were not read correctly when I loaded it with the static load method.
So, is there a way to use a small subset of that file using DL4J?
What exactly have you tried? For the text file based word vectors, you can easily reduce its size by removing the lines that you don’t need.
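For what it's worth, trimming a text-format file can be sketched in a few lines of plain Python. This is only a sketch under assumptions: it assumes the usual word2vec text layout (a header line with the vocabulary size and vector size, then one word and its values per line), and `subset_word2vec_text` is a made-up helper name, not part of gensim or DL4J.

```python
def subset_word2vec_text(src, dst, max_words):
    """Copy the first `max_words` vectors from a text-format word2vec file.

    Assumed layout: a header line "<vocab_size> <vector_size>", followed by
    one "<word> <v1> ... <vN>" line per word.
    """
    kept = []
    with open(src, encoding="utf-8") as fin:
        # The original word count is discarded; only the vector size is reused.
        _, vector_size = fin.readline().split()
        for line in fin:
            if len(kept) >= max_words:
                break
            line = line.rstrip("\n")
            if line:  # skip blank lines
                kept.append(line)

    with open(dst, "w", encoding="utf-8") as fout:
        # Write a header that matches the number of vectors actually kept.
        fout.write(f"{len(kept)} {vector_size}\n")
        for line in kept:
            fout.write(line + "\n")
    return len(kept), int(vector_size)
```

The header is written from the number of lines actually kept, so it cannot disagree with the file contents.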
Hi @treo, thanks for answering.
I solved my problem by manually editing the file to add the number of words and the vector size. There is something curious about gensim, though: even though I told it to use only 200 words, it took 184. I don’t understand why, but that was the issue.
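In case it helps anyone else, the manual header fix could also be scripted. The following is a rough sketch, not a gensim or DL4J feature: `rewrite_header` is a hypothetical helper, and it assumes a text-format file in which words contain no whitespace and an existing header, if present, is a line of exactly two integers.

```python
def rewrite_header(src, dst):
    """Copy `src` to `dst` with a header recomputed from the actual contents."""
    with open(src, encoding="utf-8") as fin:
        lines = [ln.rstrip("\n") for ln in fin if ln.strip()]
    if not lines:
        raise ValueError("file is empty")

    # Assumption: a first line made of exactly two integers is an old header.
    first = lines[0].split()
    if len(first) == 2 and all(tok.isdigit() for tok in first):
        lines = lines[1:]
    if not lines:
        raise ValueError("no vectors found")

    # Each vector line is "<word> <v1> ... <vN>", so N = token count - 1.
    vector_size = len(lines[0].split()) - 1

    with open(dst, "w", encoding="utf-8") as fout:
        fout.write(f"{len(lines)} {vector_size}\n")
        for ln in lines:
            fout.write(ln + "\n")
    return len(lines), vector_size
```

Because the count comes from the lines that are really in the file, a file that ended up with 184 vectors would get a header saying 184, whatever number was originally requested.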
Thanks for your help, @treo