I suspect the latest problem I have run into is another case of my not fully understanding SameDiff and misusing it.
I have created a class that builds the multi-head attention layer for the encoder and decoder. At one point in my code, I instantiate this class twice, as the decoder layer requires.
When I do this, not surprisingly, execution of the code fails with
Another variable with the name <name of variable> already exists
Is the only solution to this problem to create two multi-head attention classes (within the same SameDiff instance)? How am I misusing SameDiff?
What do you suggest I do?
@adonnini you can’t use the same variable name twice. You’d need to give each variable a separate name. Consider adding a suffix if you want to set the output name. SameDiff also has name scopes. Can you tell me a bit more about your use case? If you have a variable you’re trying to use twice, just use sd.getVariable("yourName") and that will give you the managed instance from the model.
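To make the suffix/prefix idea concrete, here is a minimal sketch in plain Java. It deliberately does not call SameDiff: VariableRegistry is a hypothetical stand-in that only mimics the rule that a name can be registered once, and the MultiHeadAttention class, the "/" separator, and the weight names Wq, Wk, Wv, Wo are made up for illustration. In real code the same pattern would mean passing a per-instance prefix into your class and prepending it to every name you hand to SameDiff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a graph that, like SameDiff, rejects duplicate variable names.
class VariableRegistry {
    private final Map<String, double[]> variables = new LinkedHashMap<>();

    // Registers a new zero-filled variable; fails if the name is already taken.
    double[] create(String name, int size) {
        if (variables.containsKey(name)) {
            throw new IllegalStateException(
                "Another variable with the name " + name + " already exists");
        }
        double[] value = new double[size];
        variables.put(name, value);
        return value;
    }

    // Returns the already-registered variable instead of creating a new one.
    double[] get(String name) {
        double[] value = variables.get(name);
        if (value == null) {
            throw new IllegalArgumentException("No variable named " + name);
        }
        return value;
    }

    boolean has(String name) {
        return variables.containsKey(name);
    }
}

// Hypothetical attention block: every variable name starts with this instance's prefix,
// so one class can be instantiated several times in the same graph.
class MultiHeadAttention {
    private final String prefix;

    MultiHeadAttention(VariableRegistry graph, String prefix, int modelSize) {
        this.prefix = prefix;
        for (String weight : new String[] {"Wq", "Wk", "Wv", "Wo"}) {
            graph.create(nameOf(weight), modelSize * modelSize);
        }
    }

    String nameOf(String weight) {
        return prefix + "/" + weight;
    }
}

class NameDemo {
    public static void main(String[] args) {
        VariableRegistry graph = new VariableRegistry();
        // Same class, two instances, two distinct prefixes: no name collision.
        MultiHeadAttention selfAttn = new MultiHeadAttention(graph, "decoder/selfAttention", 4);
        MultiHeadAttention crossAttn = new MultiHeadAttention(graph, "decoder/crossAttention", 4);
        // Look an existing variable up by name rather than creating it again.
        double[] wq = graph.get(selfAttn.nameOf("Wq"));
        System.out.println(selfAttn.nameOf("Wq") + " has " + wq.length + " entries");
        System.out.println(graph.has(crossAttn.nameOf("Wo")));
    }
}
```

Constructing a third block with a prefix that is already in use would fail in this sketch with the same kind of "already exists" message, which is the situation described in the question.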
@agibsonccc Thanks. I need to set up self-attention once in the encoder block, then twice in the decoder block. I am (slowly) trying to implement the “Attention Is All You Need” architecture for my application.
@agibsonccc I know of @partarstu’s project. I’ve installed it on my system and looked at it in some detail.
At this point, using his code as a starting point seems to be more work than starting from scratch.
You make a good point. Before contacting you about this, I took a look at his implementation. If I understand it correctly, he has a full implementation of only the decoder block, without encoder input. If I am right, this means that he does not have to deal with the question of two multi-head attention layers in a single block.
But, I could be wrong.
@adonnini feel free to use sd.getVariable("yourName") if you need access to the variable again and no longer have a reference to it.