Reusing SD variables

adonnini · July 29, 2023, 6:44pm

Hi,
I suspect the latest problem I have run into is another case of my not fully understanding SameDiff and misusing it.

I have created a class that builds the multi-head attention layer for the encoder and decoder. At one point in my code, I instantiate this class twice, as required by the implementation of the decoder layer.

When I do this, not surprisingly, execution of the code fails with the error:
Another variable with the name <name of variable> already exists

Is the only solution to this problem to create two separate multi-head attention classes (within the same SameDiff instance)? How am I misusing SameDiff?

What do you suggest I do?

Thanks

agibsonccc · July 30, 2023, 12:19pm

@adonnini you can’t use the same variable name twice. You’d need to create a separate variable name. Consider adding a suffix if you want to set the output name. SameDiff also has name scopes. Can you tell me a bit more about your use case? If you have a variable you’re trying to use twice, just use sd.getVariable("yourName") and that will give you the managed instance from the model.
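
To illustrate the suffix/prefix idea, here is a minimal sketch in plain Java. It deliberately does not use SameDiff: ToyGraph is a hypothetical stand-in that only models the "names must be unique" rule, and MultiHeadAttention, its weight names (Wq, Wk, Wv, Wo), and the prefixes are all made up for illustration. The point is only that each instance of the layer class gets its own name prefix, so the same class can be instantiated twice in one graph.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a graph: it only models the rule that
// every variable name must be unique within one graph.
class ToyGraph {
    private final Map<String, double[]> variables = new LinkedHashMap<>();

    // Registers a new variable and rejects a name that is already taken.
    double[] var(String name, int size) {
        if (variables.containsKey(name)) {
            throw new IllegalStateException(
                    "Another variable with the name " + name + " already exists");
        }
        double[] values = new double[size];
        variables.put(name, values);
        return values;
    }

    // Looks up an existing variable by name instead of creating it again.
    double[] getVariable(String name) {
        return variables.get(name);
    }

    boolean hasVariable(String name) {
        return variables.containsKey(name);
    }
}

// Hypothetical layer builder: every variable it creates is prefixed with
// the instance name, so two instances in the same graph never collide.
class MultiHeadAttention {
    private final String prefix;

    MultiHeadAttention(ToyGraph graph, String prefix, int modelSize) {
        this.prefix = prefix;
        for (String base : new String[] {"Wq", "Wk", "Wv", "Wo"}) {
            graph.var(name(base), modelSize);
        }
    }

    // Builds the full variable name for this instance, e.g. "decoder/selfAttention/Wq".
    String name(String base) {
        return prefix + "/" + base;
    }
}

class ReuseDemo {
    public static void main(String[] args) {
        ToyGraph graph = new ToyGraph();
        // Same class, two instances, two different prefixes.
        MultiHeadAttention first = new MultiHeadAttention(graph, "decoder/selfAttention", 8);
        MultiHeadAttention second = new MultiHeadAttention(graph, "decoder/crossAttention", 8);
        // Fetch an existing variable by its full name rather than re-creating it.
        double[] wq = graph.getVariable(first.name("Wq"));
        System.out.println(first.name("Wq") + " has " + wq.length + " values");
        System.out.println(second.name("Wq") + " exists: " + graph.hasVariable(second.name("Wq")));
    }
}
```

In a real SameDiff graph the same pattern should carry over: pass the prefixed name to whatever variable-creation call you already use, and use sd.getVariable with the full name when you need the variable again.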

adonnini · July 30, 2023, 9:02pm

@agibsonccc Thanks. I need to set up self-attention more than once: once in the encoder block, then twice in the decoder block. I am (slowly) trying to implement the “Attention Is All You Need” paper for my application.

agibsonccc · July 30, 2023, 10:32pm

@adonnini you might want to check out partarstu/transformers-in-java on GitHub (an experimental project for AI and NLP based on the Transformer architecture) by @partarstu as well.

adonnini · July 31, 2023, 10:31am

@agibsonccc I know of @partarstu’s project. I’ve installed it on my system and looked at it in some detail.

At this point, using his code as a starting point seems to be more work than starting from scratch.

You make a good point. Before contacting you about this, I took a look at his implementation. If I understand it correctly, he has a full implementation of only the decoder block, without encoder input. If I am right, this means that he does not have to deal with the question of two multi-head attention layers in a single block.

But, I could be wrong.

Thanks

agibsonccc · July 31, 2023, 10:59am

@adonnini feel free to use sd.getVariable("yourName") if you want access to the variable again and you don’t have a reference to it.
