Pointers on how to build a recommender for binary data

Hi, I started reading the documentation, and there are quite a lot of interesting things to read… 🙂

As I go, I would like to start building a recommender that comes close to my domain (since this makes learning much easier). However, I could not find an example that comes close to my setup, and I wonder if someone here in the community can help out with a link to an example or maybe a starter-snippet?

My simplified setup is as follows:

I have binary input data. From that input I’d like to compute binary recommendations for a fixed set of options. There can be multiple recommendations, and I would like to compute the probability of each recommendation (or at least rank them).

How can I do that with DL4J?
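One common way to frame the setup above (an assumption, not something confirmed in this thread) is multi-label classification: one sigmoid output per option, so every option gets its own probability and the options can be ranked. The sketch below is plain Java with hand-picked, hypothetical weights and no training; it only illustrates the output shape. In a neural-network library, the equivalent would be an output layer with sigmoid activation trained with a binary cross-entropy loss.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class MultiLabelRanker {
    // Logistic function: maps any real-valued score to a probability in (0, 1).
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // One independent probability per option:
    // p[j] = sigmoid(bias[j] + sum_i w[j][i] * x[i]), with x a binary input vector.
    static double[] probabilities(int[] x, double[][] w, double[] bias) {
        double[] p = new double[w.length];
        for (int j = 0; j < w.length; j++) {
            double z = bias[j];
            for (int i = 0; i < x.length; i++) {
                z += w[j][i] * x[i];
            }
            p[j] = sigmoid(z);
        }
        return p;
    }

    // Option indices sorted by descending probability.
    static List<Integer> rank(double[] p) {
        List<Integer> idx = new ArrayList<>();
        for (int j = 0; j < p.length; j++) {
            idx.add(j);
        }
        idx.sort(Comparator.comparingDouble((Integer j) -> -p[j]));
        return idx;
    }

    public static void main(String[] args) {
        int[] x = {1, 0, 1};                       // three binary input facts
        double[][] w = {                           // hypothetical weights, one row per option
            {2.0, 0.0, 1.0},
            {-1.0, 0.5, -1.0},
            {0.0, 0.0, 0.0}
        };
        double[] bias = {0.0, 0.0, 0.0};
        System.out.println(rank(probabilities(x, w, bias)));
    }
}
```

Because the outputs are independent sigmoids rather than a softmax, several options can have a high probability at the same time, which matches the "there can be multiple recommendations" requirement.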

Many thanks in advance!

Generally, recommendation engines come in many flavors. Recommending documents is very different from recommending user purchases. What is your “scenario” exactly?

I have several different applications in mind. The one I’d like to start with is recommending code completions inside an IDE (“given the code the developer is currently working on, which methods is he likely to use next?”).

More concretely: given that a user triggers code completion on a variable of type t inside method m, which method is he likely to use next? (“Next” can be interpreted literally, but for now it would suffice to recommend which methods he is likely to use in principle inside that method.) These slides may illustrate the idea better.

The input can be many things, e.g., the name of the enclosing method, whether it’s overridden or not, which (types of) variables have been declared in the current method body, which other methods have been used already, and so on. The recommender should then output which of the methods invocable on a variable of type t the developer is likely to use inside that method.

So the inputs are typically binary (sometimes unary) facts derived from source code. The outputs are members of a type, and binary as well.
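To make the input side concrete, here is a minimal sketch of how such facts could be turned into a fixed-length binary (multi-hot) vector. The class name and the fact strings used in the usage note are hypothetical; the only assumption is that the set of possible facts is fixed up front.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class MultiHotEncoder {
    // Maps each known fact to a fixed position in the vector.
    private final Map<String, Integer> index = new LinkedHashMap<>();

    MultiHotEncoder(List<String> vocabulary) {
        for (String fact : vocabulary) {
            index.putIfAbsent(fact, index.size());
        }
    }

    int size() {
        return index.size();
    }

    // Returns a vector with 1.0 at the position of every observed fact, 0.0 elsewhere.
    double[] encode(Set<String> observedFacts) {
        double[] v = new double[index.size()];
        for (String fact : observedFacts) {
            Integer i = index.get(fact);
            if (i != null) {
                v[i] = 1.0; // facts outside the vocabulary are ignored
            }
        }
        return v;
    }
}
```

For example, with the vocabulary `["in:createContents", "overrides", "called:setText", "called:setLayout"]`, the observed facts `{"overrides", "called:setText"}` encode to `[0, 1, 1, 0]`. The same encoder can be reused for the output side, since the recommended methods are binary as well.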

Does that explain my current use case well enough?

You might want to come back after looking into the feature engineering a bit. For example, how do you tokenize the code and engineer the features? From there, it could come down to a wide variety of techniques. Start with the simplest neural net you can think of. I would also pick a smaller problem to start with: pick one language and see if you can make specific assumptions about it. I would also look into modeling with something simpler, like language models. Can you first predict what developers are most likely to type based on something that simple?
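As an illustration of the "something simpler" suggestion, a count-based bigram model over method-call sequences is about the smallest language model there is. The sketch below is a hypothetical baseline in plain Java, not a DL4J feature: it estimates P(next call | previous call) by relative frequency.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BigramBaseline {
    // counts.get(a).get(b) = how often call b directly followed call a.
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    // Adds all adjacent pairs of one observed call sequence to the counts.
    void train(List<String> sequence) {
        for (int i = 0; i + 1 < sequence.size(); i++) {
            counts.computeIfAbsent(sequence.get(i), k -> new HashMap<>())
                  .merge(sequence.get(i + 1), 1, Integer::sum);
        }
    }

    // Relative-frequency estimate of P(next | previous); 0.0 if previous was never seen.
    double probability(String previous, String next) {
        Map<String, Integer> following = counts.get(previous);
        if (following == null) {
            return 0.0;
        }
        int total = 0;
        for (int c : following.values()) {
            total += c;
        }
        return following.getOrDefault(next, 0) / (double) total;
    }
}
```

If a baseline like this already ranks the right method near the top, it gives a reference point that any neural network has to beat.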

Thanks for getting back.

“You might want to come back after looking into the feature engineering a bit”

I’m not sure which part of my question makes you think I don’t understand my feature space. Maybe you can help me out here? FWIW: I’m trying to map something that already works well with other algorithms to DL4J-based neural networks.

Primarily, I’m seeking advice on how to build networks that process binary input data and produce recommendations like the ones I mentioned before.

Second, I’m looking for advice on how to train a network when the inputs and the outputs come from the same feature space (in my case “methods called”, but it is basically similar to binary movie ratings).
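One possible way to get training pairs in that situation (a sketch under my own assumptions, similar in spirit to leave-one-out evaluation or a denoising autoencoder) is to hide one observed method at a time: the remaining methods form the input and the hidden method becomes the target. The class and field names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class HoldOneOut {
    // One training example: a binary input vector and a binary target vector.
    static final class Pair {
        final int[] input;
        final int[] target;

        Pair(int[] input, int[] target) {
            this.input = input;
            this.target = target;
        }
    }

    // For every method marked 1 in `observed`, emits a pair where that method
    // is removed from the input and is the only 1 in the target.
    static List<Pair> pairs(int[] observed) {
        List<Pair> out = new ArrayList<>();
        for (int j = 0; j < observed.length; j++) {
            if (observed[j] != 1) {
                continue;
            }
            int[] input = observed.clone();
            input[j] = 0;                         // hide the held-out method
            int[] target = new int[observed.length];
            target[j] = 1;                        // the network should recover it
            out.add(new Pair(input, target));
        }
        return out;
    }
}
```

For the observation `[1, 0, 1]` this yields two pairs: input `[0, 0, 1]` with target `[1, 0, 0]`, and input `[1, 0, 0]` with target `[0, 0, 1]`. A variant is to use the full observed vector as the target instead of the single hidden method, which is closer to an autoencoder.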

Can you help out here?