Memory-efficient graph construction

Hi there. I’ve just run into a problem building a SameDiff graph that uses a huge adjacency matrix (a graph-related network; in my case it has almost 1 billion elements). I trigger the node activations for a batch in a loop, similarly to a recurrent network (not a conventional approach, but still). Each iteration thus creates variables specific to it (say variable_0, variable_1, etc.).

The adjacency matrix is a variable, and I can’t get a decent number of iterations, because each iteration uses the Gather op to fetch the needed indices from the adjacency matrix, and this op automatically creates a gradient variable of the same size as the adjacency matrix itself for every iteration. Is there any way to fetch those indices (they are provided as a placeholder value, a 2D matrix) for each iteration so that the gradient variables for the adjacency matrix are not duplicated? I saw SDVariable methods like getView(), but I have no idea whether it helps here, or whether it’s better than the Gather op.
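For scale, here is a rough back-of-the-envelope estimate (plain Java, not SameDiff code; the ~1 billion element count is from the post above, and float32 storage plus one full-size gradient buffer per iteration are assumptions) of why per-iteration gradient duplication blows up RAM:

```java
public class GradientMemoryEstimate {
    public static void main(String[] args) {
        // Assumed figures: ~1 billion elements stored as float32 (4 bytes each).
        long elements = 1_000_000_000L;
        long bytesPerElement = 4;
        long matrixBytes = elements * bytesPerElement;

        // If every recurrent iteration materialises its own full-size
        // gradient buffer for the adjacency matrix, memory grows linearly
        // with the number of iterations.
        int iterations = 10;
        long gradientBytes = matrixBytes * iterations;

        System.out.println("matrix: " + (matrixBytes >> 30) + " GiB");
        System.out.println("gradients for " + iterations + " iterations: "
                + (gradientBytes >> 30) + " GiB");
    }
}
```

So even a modest ten-step loop would need roughly ten extra copies of a multi-gigabyte matrix just for gradients, which matches the behaviour described.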

Thanks a lot in advance!

@partarstu feel free to file an issue. There might need to be a special op we implement for that.

There might be a batch mode we could also do there. That’s what we allow for skipgram/cbow in paragraph vectors.

@agibsonccc, thanks! I’ll create a ticket next week with a corresponding description.

For now I’ve managed to change the workflow: I first split the matrix into rows and then use SDVariable::get as an alternative to the Gather op. This changed the memory footprint so that the number of recurrent steps now has almost no influence on the amount of RAM used (the opposite of what happened with the Gather op). But now I have a problem with the batch size: increasing it even a little causes a surprising jump in RAM usage, which is obviously related to the combination of SameDiff.split() and SDVariable::get (the number of split/merged rows equals the batch size).
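The difference between the two approaches comes down to copy vs. view semantics when selecting rows. This is a plain-Java analogy (not the SameDiff implementation; gatherCopy and gatherView are hypothetical names): a copy-style gather allocates fresh storage for every selected row, while a view-style selection reuses the existing buffers, which is why view-based indexing can avoid duplicating the matrix.

```java
import java.util.Arrays;

public class RowViewVsCopy {
    // Copy-based gather: allocates new storage for every selected row,
    // analogous to an op whose backing buffers mirror the full input.
    static float[][] gatherCopy(float[][] matrix, int[] indices) {
        float[][] out = new float[indices.length][];
        for (int i = 0; i < indices.length; i++) {
            out[i] = Arrays.copyOf(matrix[indices[i]], matrix[indices[i]].length);
        }
        return out;
    }

    // View-based selection: reuses the existing row arrays, no copy made.
    static float[][] gatherView(float[][] matrix, int[] indices) {
        float[][] out = new float[indices.length][];
        for (int i = 0; i < indices.length; i++) {
            out[i] = matrix[indices[i]]; // shares storage with the source
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] adj = {{1, 0, 1}, {0, 1, 0}, {1, 1, 0}};
        int[] batch = {2, 0};

        float[][] copied = gatherCopy(adj, batch);
        float[][] viewed = gatherView(adj, batch);

        System.out.println(copied[0] == adj[2]); // false: new allocation
        System.out.println(viewed[0] == adj[2]); // true: shared storage
    }
}
```

Note that in the view-based variant the amount of newly allocated memory scales only with the number of selected indices, never with the full matrix size; per-batch copies, by contrast, would explain RAM growing with the batch size.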