Working with INDArrays using integer datatype

Hi,

I saw that ND4J is supposed to support multiple data types. I can create INDArrays with the INT or LONG datatype, but I can't multiply them using mmul (I get an "operand has unexpected datatype INT instead of HALF" error). Following the function calls leads to the gemm method:

@Override
public void gemm(INDArray A, INDArray B, INDArray C, boolean transposeA, boolean transposeB, double alpha,
                double beta) {
    if (Nd4j.getExecutioner().getProfilingMode() == OpExecutioner.ProfilingMode.ALL)
        OpProfiler.getInstance().processBlasCall(true, A, B, C);

    GemmParams params = new GemmParams(A, B, C, transposeA, transposeB);
    if (A.data().dataType() == DataType.DOUBLE) {
        DefaultOpExecutioner.validateDataType(DataType.DOUBLE, params.getA(), params.getB(), C);
        dgemm(A.ordering(), params.getTransA(), params.getTransB(), params.getM(), params.getN(), params.getK(),
                alpha, params.getA(), params.getLda(), params.getB(), params.getLdb(), beta, C,
                params.getLdc());
    } else if (A.data().dataType() == DataType.FLOAT) {
        DefaultOpExecutioner.validateDataType(DataType.FLOAT, params.getA(), params.getB(), C);
        sgemm(A.ordering(), params.getTransA(), params.getTransB(), params.getM(), params.getN(), params.getK(),
                (float) alpha, params.getA(), params.getLda(), params.getB(), params.getLdb(), (float) beta,
                C, params.getLdc());
    } else {
        DefaultOpExecutioner.validateDataType(DataType.HALF, params.getA(), params.getB(), C);
        hgemm(A.ordering(), params.getTransA(), params.getTransB(), params.getM(), params.getN(), params.getK(),
                (float) alpha, params.getA(), params.getLda(), params.getB(), params.getLdb(), (float) beta,
                C, params.getLdc());
    }

    OpExecutionerUtil.checkForAny(C);
}

For anything other than DOUBLE or FLOAT, this method falls through to HALF (the default) as the expected operand datatype, causing the validateDataType method to throw an exception for INT or LONG datatype INDArrays.

So, how are we supposed to use INT or LONG datatype INDArrays? Any help would be welcome.

Right, you’re trying to use BLAS, so it’s not going to work.

Use MatMul op directly.

Thanks for your answer. I am currently using m1.mmul(m2); how can I use the MatMul op directly?
I am sorry to bother you with this; I am new to ND4J. I searched the ND4J tutorials and made some Google queries, but I didn't find the solution.

Nd4j.exec(new MatMul(x, y));

Something like that ^^^

However, in the upcoming release (and in current snapshots) MatMul is already used as the default method.

So, I tried:

m1.matmul(m2) and got:

org.nd4j.linalg.exception.ND4JIllegalStateException: Op name mmul failed to execute. You can't execute non-inplace CustomOp without outputs being specified

I then tried to provide the output matrix as follows:

INDArray res = Nd4j.empty();
Nd4j.exec(new Mmul(m1, m2, res, null));

and got:

Op [matmul] failed check for input [0], DataType: [INT32]
Validation error at D:/jenkins/ws/dl4j-deeplearning4j-1.0.0-beta6-windows-x86_64-cpu/libnd4j/include/ops/declarable/impl/DeclarableOp.cpp:515 code=34() "this->validateDataTypes(*block)" 

I just need to multiply and add some matrices containing long/int values; I didn't expect this to be so challenging :wink:
I’m currently using

<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native</artifactId>
    <version>1.0.0-beta6</version>
</dependency>

I could switch to an alternative version if this makes things easier.
Thanks again for your help.

Why do you think res should be empty in this case? Don't specify an output array if you don't know the output shape. Let the op take care of it.

I.e. this signature: https://github.com/KonduitAI/deeplearning4j/blob/88d3c4867fb87ec760b445c6b9459ecf353cec47/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/api/ops/impl/reduce/Mmul.java#L117-L119

I would love to not specify the output array.
m1.matmul(m2) gives me an error saying that the output should be specified. I looked at the code of the functions called by matmul:

public static INDArray matmul(INDArray a, INDArray b){
    return matmul(a, b, null);
}

public static INDArray matmul(INDArray a, INDArray b, INDArray result){
    final Mmul op = new Mmul(a, b, result, null);
    return exec(op)[0];
}

and hence I tried to solve the problem by calling the Mmul(.,.,.,.) constructor directly with a non-null result INDArray, hoping that the method would reshape it and that I would not have to create an INDArray of the correct size myself before calling the Mmul constructor.

You should be using this signature:

public Mmul(INDArray x, INDArray y, boolean transposeX, boolean transposeY, boolean transposeZ) {
    this(x, y, 1.0, 0.0, transposeX, transposeY, transposeZ);
}

so all you have to do is just call:
Nd4j.exec(new Mmul(m1, m2, false, false, false))

Alternatively, you can just do what I've told you to :slight_smile:

INDArray result = Nd4j.exec(new Mmul(x, y, false, false, false))[0];

OK, so it seems that our misunderstanding comes from the fact that I don't have the same release as you, and the constructors you are suggesting are not present in the library version I use (cf. my Maven configuration above). Here are the constructors I have for Mmul, and none of the ones you are suggesting are there:

public class Mmul extends DynamicCustomOp {

    protected MMulTranspose mt;

    /**
     *
     * @param sameDiff
     * @param i_v1
     * @param i_v2
     * @param mt
     */
    public Mmul(SameDiff sameDiff,
                SDVariable i_v1,
                SDVariable i_v2,
                MMulTranspose mt) {
        super(null,sameDiff,new SDVariable[]{i_v1,i_v2});
        this.mt = mt;
        addIArgument(ArrayUtil.fromBoolean(mt.isTransposeA()), ArrayUtil.fromBoolean(mt.isTransposeB()), ArrayUtil.fromBoolean(mt.isTransposeResult()));
    }


    /**
     *
     * @param sameDiff
     * @param i_v1
     * @param i_v2
     */
    public Mmul(SameDiff sameDiff,
                SDVariable i_v1,
                SDVariable i_v2) {
        this(sameDiff,i_v1,i_v2,MMulTranspose.allFalse());
    }

    /**
     *
     * @param x
     * @param y
     * @param z
     */
    public Mmul(INDArray x,
                INDArray y,
                INDArray z,
                MMulTranspose mt) {
        super(null, new INDArray[]{x, y}, z == null ? null : new INDArray[]{z});
        if (mt != null) {
          this.mt = mt;
          addIArgument(ArrayUtil.fromBoolean(mt.isTransposeA()),
                       ArrayUtil.fromBoolean(mt.isTransposeB()),
                       ArrayUtil.fromBoolean(mt.isTransposeResult()));
        }
    }


    public Mmul() {}

   @Override
    public Object getValue(Field property) {
        if (mt == null) {
....

Ouch. Sorry. Working on it.

Thanks, let me know when you got something.

Meanwhile I tried a workaround by creating a result matrix using the zeros factory and the expected dimensions, but even that failed:

Op [matmul] failed check for input [0], DataType: [INT32]
Validation error at D:/jenkins/ws/dl4j-deeplearning4j-1.0.0-beta6-windows-x86_64-cpu/libnd4j/include/ops/declarable/impl/DeclarableOp.cpp:515 code=34() "this->validateDataTypes(*block)" 

so I am stuck and waiting for your feedback.

In theory, you should be able to call it like this when you run into a case where the op isn't mapped to Java yet:

Nd4j.exec(DynamicCustomOp.builder("mmul")
        .addInputs(Nd4j.rand(3, 4).castTo(DataType.INT32), Nd4j.rand(4, 5).castTo(DataType.INT32))
        .addBooleanArguments(false, false, false)
        .build())[0];

But in your case the actual problem is in a different place: the C++ definition of the op explicitly only allows floating-point types.

So even updating to Snapshots isn’t going to help you just yet.

For the longest time, ND4J only supported floating-point tensors, as that is what is most useful in deep learning. Integers and other types were introduced only (relatively) recently, and unfortunately you will run into problems like this.

I’ve created an issue to track the progress for this here: https://github.com/eclipse/deeplearning4j/issues/8926

Indeed, I am not working on deep learning, but I would like to speed up my Java program with an efficient matrix library, and ND4J seems a good solution for that. Since the alternative Java libraries do not seem to have integer support either, and since matrix computations using double INDArrays are likely still much faster than my naive loop implementation with integers, I plan to try using float/double INDArrays (which I could easily replace with int/long INDArrays once they are supported in ND4J).
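For what it's worth, computing integer products in doubles is exact as long as every intermediate value fits in the 53-bit significand of a double (i.e. stays below 2^53). A minimal plain-Java sketch of that idea, with no ND4J involved (matmulInt and matmulViaDouble are hypothetical helper names, not library methods):

```java
public class IntViaDoubleMatmul {
    // Naive integer matrix multiply: the baseline loop implementation.
    static long[][] matmulInt(long[][] a, long[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        long[][] c = new long[n][m];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                for (int p = 0; p < k; p++)
                    c[i][j] += a[i][p] * b[p][j];
        return c;
    }

    // Same product computed in doubles, then rounded back to long.
    // Exact as long as every intermediate sum stays below 2^53.
    static long[][] matmulViaDouble(long[][] a, long[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        long[][] c = new long[n][m];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++) {
                double s = 0.0;
                for (int p = 0; p < k; p++)
                    s += (double) a[i][p] * b[p][j];
                c[i][j] = Math.round(s);
            }
        return c;
    }

    public static void main(String[] args) {
        long[][] a = {{1, 2}, {3, 4}};
        long[][] b = {{5, 6}, {7, 8}};
        long[][] c1 = matmulInt(a, b);
        long[][] c2 = matmulViaDouble(a, b);
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                if (c1[i][j] != c2[i][j]) throw new AssertionError();
        System.out.println(c1[0][0] + " " + c1[0][1] + " " + c1[1][0] + " " + c1[1][1]);
    }
}
```

The same bound applies to the double-INDArray plan: if the matrix entries and dot-product sums can exceed 2^53, the double-based result may silently differ from the true integer result.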

However, I have concerns regarding rounding errors. In the past I ran into cases where func(y) was considered smaller than func(x) due to rounding errors, while the correct result should have been func(y) = func(x) + 1. I am thus wondering if this is something that is better handled in ND4J than in naive home-made Java computation.

That gets into the intricacies of numerical stability and the actual sizes of your numbers. If your numbers get large enough, x and x+1 may actually compare equal. And depending on the order of mathematically commutative operations, you may actually get different results when using floating-point numbers.
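The thresholds here are concrete: float has a 24-bit significand and double a 53-bit one, so consecutive integers stop being distinguishable above 2^24 and 2^53 respectively, and the grouping of additions can change the result. A small stand-alone demonstration in plain Java (no ND4J needed):

```java
public class FloatPrecision {
    public static void main(String[] args) {
        // Above 2^24, consecutive integers collapse in float...
        float f = 16_777_216f;            // 2^24
        System.out.println(f + 1f == f);  // true: 2^24 + 1 is not representable as float

        // ...and above 2^53 the same happens in double.
        double d = 9_007_199_254_740_992.0; // 2^53
        System.out.println(d + 1.0 == d);   // true

        // Grouping of (mathematically associative) additions also matters:
        // near 1e16 the spacing between doubles is 2, so adding 1.0 twice
        // is absorbed, while adding 2.0 once is not.
        double big = 1e16, small = 1.0;
        System.out.println((big + small) + small == big + (small + small)); // false
    }
}
```

This is why the func(y) < func(x) anomaly you saw can happen with any floating-point backend, ND4J included; it is a property of the number format, not of the library.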

That is what I thought… Do you have an idea of the time required to fix this issue in ND4J? I know this can be hard to estimate, but I am wondering whether it is reasonable to wait for a fix, or whether ND4J is such a large project and this is so low in the priorities that no solution should be expected anytime soon and I should look for an alternative solution.

It’s already fixed here: https://github.com/KonduitAI/deeplearning4j/pull/436

So once CI finishes - it’ll get merged, and it’ll eventually get into snapshots.

Wow! Impressive :smiley: