Transformer/Attention-NLP

Hello,

Can you please point me to end-to-end examples that use a transformer layer and multi-head attention with DL4J?

Thanks
Ravi.

Trying to build the following example: text classification with a Transformer using DL4J.

All attention models can be found here:

I would also suggest trying Keras import to see if that fits your use case:
https://deeplearning4j.konduit.ai/keras-import/overview
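If the Keras-import route works for you, the loading side in Java looks roughly like this. A minimal sketch, assuming a functional-API model saved as an HDF5 file; the file name is a placeholder, and whether the importer supports the specific attention layers in your Keras model is something you would need to verify:

```java
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;

public class ImportKerasAttentionModel {
    public static void main(String[] args) throws Exception {
        // Path to an HDF5 file produced in Keras with model.save(...);
        // the name here is just a placeholder.
        String modelPath = "attention_model.h5";

        // Functional-API models come back as a ComputationGraph;
        // use importKerasSequentialModelAndWeights for Sequential models.
        ComputationGraph model = KerasModelImport.importKerasModelAndWeights(modelPath);

        System.out.println(model.summary());
    }
}
```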


Thanks for the response. I am trying to build attention/transformer models natively, using Java only, and I couldn't find any example showing how to do that.
Are you suggesting that attention models be implemented in Keras and then loaded into DL4J via Keras import, as a kind of hybrid approach?


@data-llectual (sorry, just catching up on the forum a bit and I realized I didn't get a notification for this, so I'll answer it for future readers) your best bet in this case would be to look at some of our test cases:

We still need to build out the docs, but there are actually built-in layers for this:
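For the native-Java route, a minimal sketch of what a sequence-classification setup with the built-in self-attention layer might look like, along the lines of the test cases. The layer sizes, head count, label count, and the choice of SelfAttentionLayer (rather than, say, LearnedSelfAttentionLayer or an AttentionVertex in a ComputationGraph) are assumptions for illustration, not the exact example you're after:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.GlobalPoolingLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.PoolingType;
import org.deeplearning4j.nn.conf.layers.SelfAttentionLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class AttentionTextClassifier {
    public static void main(String[] args) {
        int embeddingSize = 128;   // size of each time-step feature vector (assumed)
        int numLabels = 2;         // number of output classes (assumed)

        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .updater(new Adam(1e-3))
                .weightInit(WeightInit.XAVIER)
                .list()
                // Multi-head self-attention over the input sequence
                .layer(new SelfAttentionLayer.Builder()
                        .nIn(embeddingSize)
                        .nOut(embeddingSize)
                        .nHeads(4)
                        .projectInput(true)
                        .build())
                // Collapse the time dimension so a standard output layer can follow
                .layer(new GlobalPoolingLayer.Builder()
                        .poolingType(PoolingType.MAX)
                        .build())
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .nIn(embeddingSize)
                        .nOut(numLabels)
                        .activation(Activation.SOFTMAX)
                        .build())
                .setInputType(InputType.recurrent(embeddingSize))
                .build();

        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        System.out.println(net.summary());
    }
}
```

The inputs are expected in DL4J's recurrent format, i.e. [batchSize, featureSize, timeSteps], with an upstream tokenizer/embedding step of your own producing the feature vectors.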
