I am reposting my question from StackOverflow here, hoping for a more definitive answer:
Please forgive me, I am new to word embeddings in general.
Sometimes I need similarity predictions not for a single word but for a list of words (a sentence). All the
deeplearning4j examples train on a list of words but predict with a single word.
I know word2vec is capable of complex logical inferences, like finding what is most similar to a king and a woman that is not a queen. But in my case I assume it is just addition: for example, the word most similar to “a big cat” would be a tiger.
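To make the "just addition" idea concrete, here is a minimal plain-Java sketch of what I have in mind. It does not use deeplearning4j at all: the class name, the tiny vocabulary and the 3-dimensional vectors are all made up for illustration. It splits the sentence into tokens, averages the vectors of the tokens it knows, and ranks the remaining vocabulary words by cosine similarity to that average.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SentenceVectorSketch {
    // Toy 3-dimensional embeddings, invented for illustration only.
    static final Map<String, double[]> VOCAB = new LinkedHashMap<>();
    static {
        VOCAB.put("big",    new double[] { 1.0, 0.0, 0.0});
        VOCAB.put("small",  new double[] {-1.0, 0.0, 0.0});
        VOCAB.put("cat",    new double[] { 0.2, 1.0, 0.0});
        VOCAB.put("kitten", new double[] { 0.0, 1.0, 0.0});
        VOCAB.put("tiger",  new double[] { 0.9, 0.9, 0.0});
        VOCAB.put("dog",    new double[] { 0.4, 0.0, 1.0});
    }

    // Average the vectors of all known tokens; returns null if no token is known.
    static double[] meanVector(List<String> tokens) {
        double[] sum = new double[3];
        int known = 0;
        for (String token : tokens) {
            double[] v = VOCAB.get(token);
            if (v == null) {
                continue; // skip out-of-vocabulary tokens such as "a"
            }
            for (int i = 0; i < sum.length; i++) {
                sum[i] += v[i];
            }
            known++;
        }
        if (known == 0) {
            return null;
        }
        for (int i = 0; i < sum.length; i++) {
            sum[i] /= known;
        }
        return sum;
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Rank vocabulary words (excluding the query tokens) by similarity to the sentence mean.
    static List<String> nearest(String sentence, int top) {
        List<String> tokens = Arrays.asList(sentence.toLowerCase().trim().split("\\s+"));
        double[] query = meanVector(tokens);
        List<String> candidates = new ArrayList<>();
        if (query == null) {
            return candidates; // nothing known: empty result instead of an exception
        }
        for (String word : VOCAB.keySet()) {
            if (!tokens.contains(word)) {
                candidates.add(word);
            }
        }
        candidates.sort(Comparator.comparingDouble((String w) -> -cosine(query, VOCAB.get(w))));
        return new ArrayList<>(candidates.subList(0, Math.min(top, candidates.size())));
    }

    public static void main(String[] args) {
        System.out.println(nearest("a big cat", 2));
    }
}
```

With these toy vectors, "tiger" ranks first for "a big cat". What I am looking for is the equivalent of this tokenize-then-average step in deeplearning4j.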
INDArray wordVector1 = wordVectors.getWordVectorMatrix(sentence);
Collection<String> lst_2 = wordVectors.wordsNearest(wordVector1, 10);
Where sentence is a String. I also tried
Collection<String> lst_2 = wordVectors.wordsNearest(sentence, 10);
But the first attempt throws a
NullPointerException and the second returns an empty collection, which I believe it shouldn’t.
I see another signature of
wordVectors#wordsNearest that accepts an “INDArray”, but I don’t know what that is, except that it is a high-performance data structure.
Also, following the example here in the “Tried using the wordNearest with only positive vectors” snippet, I tried:
Collection<String> lst_2 = wordVectors.wordsNearest(Arrays.asList(sentence.split(" ")), 10);
But org.deeplearning4j version 0.9.1 complains about the types, as if, besides the one collection I pass, another collection is needed.
Any hints?