I've been taking a look at this and, as you suggested, have overridden doDiff in TensorMmul to use the existing C++ implementation in libnd4j/include/ops/declarable/generic/blas/tensormmul.cpp.
This has been a partial success. The issue I have now is that, from what I can see, the implementation of CUSTOM_OP_IMPL(tensormmul_bp) does not work for all cases: it generally errors out when the ranks of the two input tensors differ, even though the forward pass is fine in those cases. After adding some extra nd4j_verbose calls in my own fork of the libnd4j code, it appears the exception is raised by ShapeUtils when called from this section:
```cpp
// calculate dLdA
MmulHelper::tensorDot(dLdC, B, dLdA, axesBdLdC, axesB, permutAt);
```
At that point a runtime error is thrown: "ShapeUtils::evalShapeForTensorDot method: the numbers of a axes and b axes to make dot product along must have identical values !"
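The calculation itself is well defined when the ranks differ. As a sanity check, here is the dLdA computation in numpy for the first failing case in the table below (a=[2,1], b=[2,3,2,1], contracting over the second-last index of each); the axis bookkeeping is just my own working, not taken from the libnd4j code:

```python
import numpy as np

a = np.random.rand(2, 1)
b = np.random.rand(2, 3, 2, 1)

# Forward: contract a axis 0 with b axis 2 (the second-last index of each).
c = np.tensordot(a, b, axes=([0], [2]))               # shape (1, 2, 3, 1)

dLdC = np.ones_like(c)                                # stand-in upstream gradient

# Backward for a: contract dLdC with b over all of b's *free* axes.
# b's free axes (0, 1, 3) appear in c at positions (1, 2, 3). Note both axis
# lists have length rank(b) - 1 = 3, not rank(a) - 1 = 1.
dLdA = np.tensordot(dLdC, b, axes=([1, 2, 3], [0, 1, 3]))   # shape (1, 2)
dLdA = dLdA.T                                         # permute back to a's shape
assert dLdA.shape == a.shape
```

Since both axis lists here are sized by rank(b) rather than rank(a), my (unverified) suspicion is that the failing code path derives one of the lists from the wrong tensor's rank, which would only show up when the ranks differ.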
To test further, I used TensorFlow from Python to generate a series of random test cases for input tensors A and B of different ranks, making sure they were sized so that I could contract over at least one index. I dump the resulting tensordot output and the A and B grads to .npy files, load them back into ND4J, and compare them against my overridden TensorMmul op, which uses the C++ implementation for both the forward and backward passes.
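The generator is essentially the following (a simplified sketch of the idea; the real script randomises the shapes, and the filenames and the reduce_sum trick for getting an all-ones dL/dC are just my conventions):

```python
import numpy as np
import tensorflow as tf

def dump_case(a_shape, b_shape, a_axis, b_axis, prefix):
    """Generate one tensordot case and dump inputs, output and grads to .npy."""
    a = tf.constant(np.random.rand(*a_shape))
    b = tf.constant(np.random.rand(*b_shape))
    with tf.GradientTape() as tape:
        tape.watch([a, b])
        c = tf.tensordot(a, b, axes=[[a_axis], [b_axis]])
        loss = tf.reduce_sum(c)        # scalar loss => dL/dC is all ones
    grad_a, grad_b = tape.gradient(loss, [a, b])
    for name, t in [("a", a), ("b", b), ("c", c), ("dlda", grad_a), ("dldb", grad_b)]:
        np.save(f"{prefix}_{name}.npy", t.numpy())

# e.g. the rank-2 vs rank-4 case, contracting the second-last index of each:
dump_case((2, 1), (2, 3, 2, 1), a_axis=0, b_axis=2, prefix="case_r2_r4")
```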
What I have so far, taking all combinations of tensor ranks for A and B from 2 to 6, is below (output from ScalaTest). The input shapes are in the square brackets, and I contracted over the second-last index in all cases (though that is easy to change or randomise).
Pass
[info] a=[2,1], b=[2,1] Calc pass Grad pass
[info] a=[2,1], b=[3,2,1] Calc pass Grad pass
[info] a=[1,4,4], b=[2,4,4] Calc pass Grad pass
[info] a=[1,4,4], b=[1,4,4,4] Calc pass Grad pass
[info] a=[2,3,2,2], b=[1,3,2,2] Calc pass Grad pass
[info] a=[2,3,2,2], b=[1,1,4,2,2] Calc pass Grad pass
[info] a=[4,4,4,4,2], b=[4,1,3,4,2] Calc pass Grad pass
[info] a=[4,4,4,4,2], b=[2,2,2,1,4,2] Calc pass Grad pass
[info] a=[3,3,2,1,4,3], b=[4,2,4,2,4,3] Calc pass Grad pass
Fail
[info] a=[2,1], b=[2,3,2,1] Calc pass Grad crash
[info] a=[2,1], b=[1,2,1,2,1] Calc pass Grad crash
[info] a=[2,1], b=[3,3,1,1,2,1] Calc pass Grad crash
[info] a=[1,4,4], b=[4,4] Calc pass Grad crash
[info] a=[1,4,4], b=[1,1,1,4,4] Calc pass Grad crash
[info] a=[1,4,4], b=[3,3,4,4,4,4] Calc pass Grad crash
[info] a=[2,3,2,2], b=[2,2] Calc pass Grad crash
[info] a=[2,3,2,2], b=[2,2,2] Calc pass Grad crash
[info] a=[2,3,2,2], b=[2,2,4,3,2,2] Calc pass Grad crash
[info] a=[4,4,4,4,2], b=[4,2] Calc pass Grad crash
[info] a=[4,4,4,4,2], b=[2,4,2] Calc pass Grad crash
[info] a=[4,4,4,4,2], b=[3,2,4,2] Calc pass Grad crash
[info] a=[3,3,2,1,4,3], b=[4,3] Calc pass Grad crash
[info] a=[3,3,2,1,4,3], b=[2,4,3] Calc pass Grad crash
[info] a=[3,3,2,1,4,3], b=[2,4,4,3] Calc pass Grad crash
[info] a=[3,3,2,1,4,3], b=[3,3,3,4,3] Calc pass Grad crash
The good news is that the forward pass ("Calc") passes in all cases, and the grads agree whenever they don't error out; for most cases, though, the grads will not calculate at all. Every case where A and B have the same rank works, but that should not be a requirement in the general case. I note that all the tests of tensormmul_bp in
libnd4j/tests_cpu/layers_tests/DeclarableOpsTests15.cpp
only consider cases where the ranks of A and B are the same, so this would not have been picked up by those tests.
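If it would help, a minimal unequal-rank case with deterministic values could be added there. Here is a numpy sketch of the reference values such a test could assert against (a rank-2 against rank-3 case; I am assuming the C++ op uses the same output layout as numpy here):

```python
import numpy as np

# Deterministic rank-2 vs rank-3 case: contract a axis 0 with b axis 2.
a = np.arange(6.0).reshape(2, 3)
b = np.arange(24.0).reshape(3, 4, 2)

c = np.tensordot(a, b, axes=([0], [2]))                   # shape (3, 3, 4)
dLdC = np.ones_like(c)                                    # take dL/dC = 1

# dLdA[i, j] = sum_{k, l} dLdC[j, k, l] * b[k, l, i]
dLdA = np.tensordot(dLdC, b, axes=([1, 2], [0, 1])).T     # shape (2, 3)

# dLdB[k, l, i] = sum_j a[i, j] * dLdC[j, k, l]
dLdB = np.tensordot(a, dLdC, axes=([1], [0])).transpose(1, 2, 0)  # shape (3, 4, 2)
```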
If this is genuinely an issue and I haven't got the wrong idea, let me know what would be useful; I can raise an issue, put some code up in a gist, etc. Given that TensorFlow implements a dense layer in terms of tensordot (I believe), not being able to backprop through it would seem potentially problematic beyond my somewhat esoteric use case.
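For what it's worth, the unequal-rank case is easy to hit from Keras: applying a Dense layer to an input with more than two dimensions goes (as far as I understand the implementation) through a tensordot of the input with a rank-2 kernel:

```python
import tensorflow as tf

# Dense on a rank-3 input [batch, time, features]: internally this contracts
# the last axis of the input with axis 0 of the (8, 16) kernel, i.e. a
# rank-3-by-rank-2 tensordot -- exactly the pattern that crashes above.
x = tf.random.normal([4, 10, 8])
y = tf.keras.layers.Dense(16)(x)
print(y.shape)   # (4, 10, 16)
```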