Using NLP with DL4J

I am looking to build my own NLP model and potentially combine it later with others created in, say, TensorFlow. Support for this looks good.
I am mainly here for advice on whether to choose DL4J, which I have used for OCR in the past, now that I am doing heavy NLP work.
In particular I am looking to use a BERT implementation, and I would like to know whether Deeplearning4j is a platform I can count on for this, or at least a good choice compared with competing platforms. The main competitors I am weighing it against are sparkNLP and spacy. Would I be making things needlessly hard for myself by going with this library for the things I have mentioned? Can someone give me some good context here? For example, any massive tradeoffs or blockers you see, or too much reinventing the wheel if I do go with DL4J for this effort. I would appreciate responses from anyone who has pursued these goals before.

@thekevshow spaCy definitely has a lot more built in. DL4J is lower level and is meant more for people who want to build their own custom setups. I would strongly recommend using spaCy if you have the choice.
Spark NLP is heavily focused on Spark and is usable from Python. That Spark dependency may or may not be favorable for you.

DL4J has a clear focus on a few things:

  1. Interop with the Python ecosystem (Python4J, model import of Keras, TensorFlow and PyTorch/ONNX models, natively parsing NumPy arrays with no additional overhead)
  2. Deployability: ahead-of-time compilation with GraalVM, easy deployment in your favorite Java environments
  3. Simple and fast: We try to make it easy to deploy models and train your own custom models while providing a thin layer of interop over different backends like oneDNN, cuDNN, and various BLAS libraries (cuBLAS, MKL, OpenBLAS, Arm Compute Library), while handling packaging.

We had a lot more but the scope has been reduced to focus on what has traction within the ecosystem.
That usually comes down to:

  1. Android deployment
  2. Microservice development
  3. Custom NLP (People building their own models and pipelines)

So for point 3, I am all for this effort. I really appreciate your response; it is definitely helping me weigh the options. This is a model I will want a good amount of control over, since it will be running on medical data, and I am ultimately trying to build my own model. I have been trying to avoid sparkNLP because they are very clearly trying to monetize their models: they are building a platform with paid exclusivity, paid services, and private models. It isn’t a matter of cost per se, but you can’t even see how well a model performs on their page of available models.

So, on point 3: do you see any limitations with using DL4J, even if it takes more effort than other platforms to set up a pipeline and the other processes for building the model? I also feel DL4J is on the right path with documentation and other things that are quite messy in other frameworks. I am a huge fan of it from my earlier use case, a YOLO OCR model some years ago, and I noticed that the NLP side seems to have come a long way since then.

Sorry if this question seems redundant. I will most definitely be sinking a large amount of time into this project, so I am trying to get a foothold and some unbiased opinions on whether this would be a woeful attempt, or whether I would be abusing DL4J for something it really has shortcomings in.

@thekevshow No, not at all! Actually, I was looking at an importer for their models.
Depending on what you’re looking to do, something there might be possible.

I’m here to support conversion use cases as well.

Because DL4J is part of the Eclipse Foundation, that prevents the relicensing and other governance problems you might run into with commercialized models. Our monetization is mainly through consulting and building products on top, but the core won’t be touched, just improved.

I can say you’d have the flexibility and support to build what you’re looking for. I would investigate our SameDiff API and model import interface to see if they are flexible enough for you. That should be your entry point for the newer models.

Ping me if you decide to do a proof of concept.

Absolutely. I am going to spend tonight researching a bit, and hopefully a large part of tomorrow working out how I think it should be structured. I will keep you in the loop :).