Decision Transformer

Has anyone already tried to build a Decision Transformer with DL4J, or tried to extend GPT-2, for example starting from a Hugging Face model?

@kgoderis We import and fine-tune BERT. I am finishing up the next release, which will have a lot of improvements around model import. For now you’d have to build my development branch from source, though. Once I’m done with testing, I need to split that branch into a series of major PRs.

Will transformers be available in the next release? I mean the DL4J/SameDiff way, rather than importing a TF/PyTorch file.

If it can be imported, it can also be built from scratch, but it may be a bit cumbersome to do so.

You can use the model’s summary() function to see how it is structured and reimplement it on your own if you want to.

What is the name of your dev branch? There are many ab_* branches.

It seems @agibsonccc mentioned that he wants to pay attention to transformers.

Yeah, I need to look at the low-level ops there first.
I have mostly been focusing on a dev branch, fixing some minor errors we have been having.

We technically have all the transformer ops, but I want to add some optimized routines and better pretrained models in SameDiff.
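
For anyone wondering what the core transformer op actually computes, here is a minimal plain-Java sketch of scaled dot-product attention with an optional causal mask (the kind GPT-2 and Decision Transformer style models rely on). It deliberately uses no DL4J or SameDiff calls; the class name `AttentionSketch` and its methods are hypothetical names made up for illustration, not part of any library, and a real implementation would use the framework's own ops and batching.

```java
import java.util.Arrays;

// Hypothetical illustration class: plain Java, no DL4J/SameDiff API involved.
final class AttentionSketch {

    /** Numerically stable softmax over one row of scores. */
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i] - max); // masked (-infinity) entries become 0
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    /**
     * Computes softmax(Q K^T / sqrt(d)) V for a single head.
     * q is [tq][d], k is [tk][d], v is [tk][dv].
     * With causal = true, query position i only attends to key positions j <= i.
     */
    static double[][] attention(double[][] q, double[][] k, double[][] v, boolean causal) {
        int tq = q.length;
        int tk = k.length;
        int d = q[0].length;
        int dv = v[0].length;
        double scale = 1.0 / Math.sqrt(d);
        double[][] out = new double[tq][dv];
        for (int i = 0; i < tq; i++) {
            double[] scores = new double[tk];
            for (int j = 0; j < tk; j++) {
                if (causal && j > i) {
                    scores[j] = Double.NEGATIVE_INFINITY; // hide future positions
                    continue;
                }
                double dot = 0.0;
                for (int c = 0; c < d; c++) {
                    dot += q[i][c] * k[j][c];
                }
                scores[j] = dot * scale;
            }
            double[] weights = softmax(scores);
            // Weighted sum of the value rows.
            for (int j = 0; j < tk; j++) {
                for (int c = 0; c < dv; c++) {
                    out[i][c] += weights[j] * v[j][c];
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] qk = {{1, 0}, {0, 1}};
        double[][] v = {{1, 2}, {3, 4}};
        System.out.println(Arrays.deepToString(attention(qk, qk, v, true)));
    }
}
```

A full transformer block adds learned projections for Q, K and V, multiple heads, residual connections, layer normalization and a feed-forward layer on top of this, which is where the "optimized routines" matter.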

It would be good if this appeared in an upcoming release, or one shortly after.