Using gradient as an intermediate SDVariable

How can a gradient be used as an intermediate variable in SameDiff?

Problem: I need a SameDiff layer that takes the gradient of previous layers as its input. Currently this does not seem to be possible due to:

Exception in thread "main" java.lang.IllegalStateException
	at org.nd4j.common.base.Preconditions.checkState(Preconditions.java:253)
	at org.nd4j.autodiff.util.SameDiffUtils.validateDifferentialFunctionSameDiff(SameDiffUtils.java:134)
	at org.nd4j.linalg.api.ops.BaseReduceOp.<init>(BaseReduceOp.java:85)
	at org.nd4j.linalg.api.ops.BaseReduceOp.<init>(BaseReduceOp.java:114)

Code sample of the issue:

package com.valb3r.idr.networks;

import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.weightinit.impl.XavierInitScheme;

public class Issue {

    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        //Create input and label variables
        SDVariable sdfPoint = sd.placeHolder("point", DataType.FLOAT, -1, 3); //Shape: [?, 3]
        SDVariable ray = sd.placeHolder("ray", DataType.FLOAT, -1, 3); //Shape: [?, 3]
        SDVariable expectedColor = sd.placeHolder("expected-color", DataType.FLOAT, -1, 3); //Shape: [?, 3]

        SDVariable sdfInput = denseLayer(sd, 10, 3, sdfPoint);
        SDVariable sdf = denseLayer(sd, 3, 10, sdfInput);
        sdf.markAsLoss();

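        // d(sdf)/d(point) - this gradient is intended to feed the next (rendering) layer as an input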
        SDVariable idrRenderGradient = sd.grad(sdfPoint.name());
        SDVariable dotGrad = idrRenderGradient.dot(ray); // org.nd4j.autodiff.util.SameDiffUtils.validateDifferentialFunctionSameDiff(SameDiffUtils.java:134)

        sd.loss().meanSquaredError(expectedColor, dotGrad, null);
    }

    private static SDVariable denseLayer(SameDiff sd, int nOut, int nIn, SDVariable input) {
        SDVariable w = sd.var(input.name() + "-w1", new XavierInitScheme('c', nIn, nOut), DataType.FLOAT, nIn, nOut);
        SDVariable b = sd.zero(input.name() + "-b1", 1, nOut);
        SDVariable z = input.mmul(w).add(b);
        return sd.nn().tanh(z);
    }
}

Is this at all possible, or is using a detached gradient via calculateGradientsAndOutputs the only option?
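For reference, this is roughly what I mean by a detached gradient - a sketch only, assuming the calculateGradients(placeholders, variableNames) overload that returns plain INDArrays (once evaluated this way, the gradient is a concrete array and can no longer participate in further SameDiff ops):

import java.util.HashMap;
import java.util.Map;

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// sd is the SameDiff instance from the sample above, built WITHOUT the failing grad/dot lines,
// so the graph simply ends at sdf.markAsLoss()
Map<String, INDArray> placeholders = new HashMap<>();
placeholders.put("point", Nd4j.rand(DataType.FLOAT, 4, 3));

// d(loss)/d(point) as a concrete INDArray - "detached" from the graph; any further math
// (e.g. the dot product with the ray) would have to happen in plain ND4J
Map<String, INDArray> grads = sd.calculateGradients(placeholders, "point");
INDArray dLossDPoint = grads.get("point");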

@valb3r that means that the variable you are passing in is not attached to the right samediff instance. Let’s focus on that first.

That doesn’t necessarily prevent anything; it just means you need to manage references to the variables you’re passing around.

Make sure to just use samediff here. Could you elaborate on what denseLayer is? Thanks!

@agibsonccc
Sorry, I’m not really sure I understand the question, as the denseLayer implementation is provided in the code block above.
Here is a self-contained repository that reproduces the issue:

I’m attempting to build the following feed-forward network:

1. Input (point, ....)
2. Dense layer
3. Dense layer -> also the output of the signed distance function, used as an interim output
(The layers below are not shown in the example)
4. Custom layer that transforms the gradient of layer (3) as well as its output
5. Dense layer
6. Dense layer
7. Actual Output

As far as I can understand, the call to sd.grad creates a new SameDiff instance for the idrRenderGradient variable inside SameDiff.createGradFunction at:

defineFunction:3987, SameDiff (org.nd4j.autodiff.samediff)
defineFunction:3975, SameDiff (org.nd4j.autodiff.samediff)
createGradFunction:4186, SameDiff (org.nd4j.autodiff.samediff)
createGradFunction:4093, SameDiff (org.nd4j.autodiff.samediff)
grad:3602, SameDiff (org.nd4j.autodiff.samediff)
main:21, Issue (com.valb3r.idr.networks)

@valb3r yes, gradients are stored in a sub-dictionary. A new samediff instance for computing gradients is stored, and that’s what is used to look up gradient states as well.
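Roughly, the separation looks like this (a sketch against the sd instance from the sample above; it assumes the gradient sub-graph is registered under the internal key "grad"):

SDVariable pointGrad = sd.grad("point");      // triggers createGradFunction() internally
SameDiff gradFn = sd.getFunction("grad");     // the separate samediff instance that holds the gradients

// pointGrad is owned by gradFn rather than by sd itself, which is why mixing it with sd's own
// variables (e.g. idrRenderGradient.dot(ray)) trips validateDifferentialFunctionSameDiff
System.out.println(pointGrad.getSameDiff() == gradFn);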

So in order to use that gradient I need to invoke it on the original SameDiff instance with invokeGraphOn?

@valb3r no that’s mainly meant to be internal. You should be able to just call grad(…) and we do the rest.

Internally, it calls doDiff on any relevant function which then extends the relevant graph.

If you need to access any gradient just call grad(…) on the samediff instance you created.

You might be able to also use some of the tests here for examples:

One other relevant concept here might be ExternalErrors, which allows you to compute gradients somewhere else and pass them into a samediff instance.
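A minimal sketch of that pattern (it assumes the SameDiffUtils.externalErrors(...) helper and the "<name>-grad" placeholder naming used in the SameDiff tests; exact signatures may differ slightly between versions):

import java.util.HashMap;
import java.util.Map;

import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.autodiff.util.SameDiffUtils;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class ExternalErrorsSketch {

    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable input = sd.placeHolder("input", DataType.FLOAT, -1, 4);
        SDVariable w = sd.var("w", Nd4j.rand(DataType.FLOAT, 4, 3));
        SDVariable out = input.mmul("out", w);

        // Declare that d(loss)/d(out) will be supplied from outside this graph
        SameDiffUtils.externalErrors(sd, null, out);

        Map<String, INDArray> ph = new HashMap<>();
        ph.put("input", Nd4j.rand(DataType.FLOAT, 8, 4));
        ph.put("out-grad", Nd4j.rand(DataType.FLOAT, 8, 3)); // gradient computed elsewhere, e.g. in another graph

        // Gradients for this graph's own parameters, driven entirely by the external gradient
        Map<String, INDArray> grads = sd.calculateGradients(ph, "w");
        System.out.println(grads.get("w"));
    }
}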

Either way, this might help you figure out how to use the various custom gradient internals. Try not to focus too much on the internal details beyond understanding what kind of error might be caused (e.g. the internal structure of the gradients) for debugging purposes.

Anything else should be handled by the maintainers.


It seems like it is not working currently; should I create an issue on GitHub for that?

@valb3r yes please create an issue


Done, Support for using sd.grad output as an intermediate variable · Issue #9710 · eclipse/deeplearning4j · GitHub

@valb3r thanks. For now, try using separate samediff instances and external errors as a workaround. Continuing a samediff gradient calculation based on an intermediate result will take a bit of thinking in the meantime. If you want to save everything together, just call

samediff.putSubFunction("your_grad_intermediate", separateSamediffInstance);

Could you remind me of the use case here so I can think of how this might be used in other contexts and generalize it a bit? One immediate way I could think of doing this would be just doing sub-functions under the hood and making it seamless.

There’s a new Invoke op that might make that fairly easy. Still not sure yet though.

Could you remind me of the use case here so I can think of how this might be used in other contexts and generalize it a bit?

It is for so-called implicit differentiable rendering of 3D scenes: one overfits a neural network to a scene’s projections (images or RGB-D) in order to obtain an implicit 3D model baked inside the neural network. The following components act in this process:

  1. Neural network that represents the signed distance function (1)
  2. Special layer that makes the rendering process differentiable. As input, it takes the gradient of (1) and its output and generates the input to (2). It consists mostly of linear operations.
  3. Neural network that represents shape and texture (2) - takes input from the special layer and produces a color value.

For more details you can check Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance - it has a PyTorch implementation available.

@valb3r then I think external errors could work.

All you would have is one net per component, breaking the problem up so you don’t have the intermediate variable in the same graph but instead just consume it in another graph.

Combining the parts should achieve the same result.
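To make that concrete, a rough sketch of the two-graph wiring (placeholder names here are hypothetical; it assumes the output(...) and calculateGradients(...) overloads plus the external-errors "<name>-grad" convention shown earlier; sdA is the SDF network from the original sample, sdB is the render network with its own MSE loss):

// sdA: the SDF network (loss marked on "sdf"); sdB: the render network with placeholders
// "sdf-value", "sdf-grad", "ray" and "expected-color" (hypothetical names)
INDArray pointBatch = Nd4j.rand(DataType.FLOAT, 4, 3);
INDArray rayBatch = Nd4j.rand(DataType.FLOAT, 4, 3);
INDArray colorBatch = Nd4j.rand(DataType.FLOAT, 4, 3);

// 1) Run the SDF graph eagerly: its output and its gradient w.r.t. the input point
Map<String, INDArray> aPh = new HashMap<>();
aPh.put("point", pointBatch);
INDArray sdfOut = sdA.output(aPh, "sdf").get("sdf");
INDArray dSdfDPoint = sdA.calculateGradients(aPh, "point").get("point");

// 2) Feed both into the render graph as ordinary placeholders - no gradient needs to live
//    as an intermediate variable inside a single graph
Map<String, INDArray> bPh = new HashMap<>();
bPh.put("sdf-value", sdfOut);
bPh.put("sdf-grad", dSdfDPoint);
bPh.put("ray", rayBatch);
bPh.put("expected-color", colorBatch);

// 3) Train the render graph normally; to also train the SDF graph, take the render graph's
//    gradient w.r.t. "sdf-value" and push it back into sdA via external errors.
//    (The path back through "sdf-grad" is a second-order term - that is the part that
//    still takes a bit of thinking.)
Map<String, INDArray> bGrads = sdB.calculateGradients(bPh, "sdf-value", "sdf-grad");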