Shape of weight

sameDiff.loss.softmaxCrossEntropy("loss", label=[8,128,6], out=[8,128,6], weight=[?,?])

What should the shape of the weight be here?

If output = [miniBatch, sentenceLength, positionCharLabels], what should the shape of the weights be then?

@SidneyLann generally it’s 1 weight per class output. Weights are also broadcastable.

But the miniBatch size is dynamic and the weights are static, so how can I weight the sentences?

@SidneyLann so you want per-example weights. We only support output-oriented weights, which you can find here:

You can approximate this with output-based weighting plus resampling, which we do support; in that case the weights would be the same shape as the output.

weights = [128, 6] and loss = [8, 128] → does not work
weights = [6, 8] and loss = [8, 128] → does not work
weights = [1024] and loss = [1024] → works

So must the weights be a scalar or rank 1?

@SidneyLann are you saying weights that don't have the same shape as the output? No, they don't have to be scalar. I gave you the source code there, and you can see we support different shapes. It sounds like the labels are the wrong shape and the weights are miscalculated. Could you give me something end to end where you actually try what I suggested, with the weights as a placeholder?

sameDiff.loss.softmaxCrossEntropy("loss", label=[8,128,6], out=[miniBatch, sentenceLength, positionCharLabels]=[8,128,6], weight=[128,6])
===>
shapes of weights and loss arrays should be broadcastable, but got weights = [128, 6] and loss = [8, 128] instead!

sameDiff.loss.softmaxCrossEntropy("loss", label=[8,128,6], out=[8,128,6], weight=[6, 8])
===>
shapes of weights and loss arrays should be broadcastable, but got weights = [6, 8] and loss = [8, 128] instead!

sameDiff.loss.softmaxCrossEntropy("loss", label=[1024,6], out=[1024,6], weight=[1024])
===> NO ERROR
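
From these errors it looks like the loss array drops the class dimension (6), so it has shape [miniBatch, sentenceLength] = [8, 128], and the weights have to broadcast against that, not against the full output. A minimal sketch of the flattened case that runs without error, assuming the usual placeholder API (dummy variable names; shapes taken from the errors above):

import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;

SameDiff sd = SameDiff.create();

// Flattened case from above: the loss array has shape [1024], so a [1024]
// weight vector broadcasts against it element for element.
SDVariable labels  = sd.placeHolder("labels",  DataType.FLOAT, 1024, 6);
SDVariable logits  = sd.placeHolder("logits",  DataType.FLOAT, 1024, 6);  // stand-in for the network output
SDVariable weights = sd.placeHolder("weights", DataType.FLOAT, 1024);

// The failing shapes above ([128, 6] and [6, 8]) do not broadcast against the
// [8, 128] loss array, which is why they are rejected.
SDVariable loss = sd.loss.softmaxCrossEntropy("loss", labels, logits, weights);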

So the weights must be set based on the miniBatch. How can I set them independently of the miniBatch? I want static weights, not dynamic ones.

I can weight a fixed number of sentences, but I can't weight N sentences where N is dynamic.

@SidneyLann could you clarify? The one-sentence replies don't really help me much. I told you a way of passing in dynamic weights, and I don't have any indication that you actually tried it.
You use weights as placeholders alongside inputs and labels, just like you would for training.
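
A minimal sketch of what I mean, using the shapes from your example (the placeholder names are just for illustration, and the exact overloads may differ slightly between versions):

import java.util.HashMap;
import java.util.Map;

import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

SameDiff sd = SameDiff.create();

// -1 marks the dynamic miniBatch dimension
SDVariable labels  = sd.placeHolder("labels",  DataType.FLOAT, -1, 128, 6);
SDVariable logits  = sd.placeHolder("logits",  DataType.FLOAT, -1, 128, 6);  // stand-in for the network output
SDVariable weights = sd.placeHolder("weights", DataType.FLOAT, -1, 128);     // one weight per char position, per example

SDVariable loss = sd.loss.softmaxCrossEntropy("loss", labels, logits, weights);

// Feed the weights alongside the inputs and labels, like any other placeholder.
Map<String, INDArray> feed = new HashMap<>();
feed.put("labels",  Nd4j.rand(DataType.FLOAT, 8, 128, 6));  // dummy arrays, shapes only
feed.put("logits",  Nd4j.rand(DataType.FLOAT, 8, 128, 6));
feed.put("weights", Nd4j.ones(DataType.FLOAT, 8, 128));     // e.g. a static per-position weight row repeated for each example
sd.output(feed, "loss");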


sameDiff.loss.softmaxCrossEntropy("loss", label=[8,128,6], out=[8,128,6], weight=[8, 128])

This line can be run in SNAPSHOT now. Thanks.

But this should weight every char in a sentence (128), not the labels (6). Am I right?

@SidneyLann if the labels are the characters then yes that should be fine.

The labels are not the characters; each char has 6 possible labels. I want to weight the labels. Should the settings below work?
label=[8,128,6], out=[8,128,6], weight=[8, 128]