Why isn't the strmv BLAS routine supported?

Hi.

I am wondering why the BlasWrapper supports dtrmv but not strmv.

My build on aarch64 Linux pulls in OpenBLAS from the Maven repo by default, without me linking a system library, so as far as I can see there is no reason for strmv to be unsupported.

Thanks

@cawthorne Actually it’s available right here:

We just made it data type agnostic.
You can access this with:
Nd4j.getBlasWrapper().level2().trmv
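For anyone reading along, here is a minimal plain-Java sketch of what the single-precision trmv computes: x := A * x, with A triangular. It does not use the ND4J API. The class and method names (TrmvSketch, strmvUpper) are made up for illustration, and it only covers the upper-triangular, non-transposed, non-unit-diagonal, row-major case.

```java
import java.util.Arrays;

// Illustration only: hypothetical names, not part of ND4J or OpenBLAS.
public final class TrmvSketch {
    private TrmvSketch() {}

    /**
     * In-place x := A * x for an n-by-n upper-triangular matrix A stored
     * row-major in a flat array. Entries below the diagonal are never read.
     */
    public static void strmvUpper(int n, float[] a, float[] x) {
        for (int i = 0; i < n; i++) {
            float sum = 0.0f;
            // Row i only needs x[i..n-1], which earlier rows have not overwritten.
            for (int j = i; j < n; j++) {
                sum += a[i * n + j] * x[j];
            }
            x[i] = sum;
        }
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3,
                     0, 4, 5,
                     0, 0, 6};
        float[] x = {1, 1, 1};
        strmvUpper(3, a, x);
        System.out.println(Arrays.toString(x)); // prints [6.0, 9.0, 6.0]
    }
}
```

A real BLAS call also takes the uplo, trans and diag flags plus strides, which this sketch leaves out.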

Hi thanks for getting back to me.

If I run this with a float data type I get:

java.lang.UnsupportedOperationException

This is because strmv isn’t linked at a lower level when running with a CPU backend BLAS library:

But dtrmv works fine.
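To make the behaviour concrete, here is a rough sketch of how a data-type-agnostic trmv can end up throwing for floats only. This is an assumption about the general shape of the dispatch, not the actual ND4J source: every class, enum and method name below is hypothetical, and the returned strings just stand in for native calls.

```java
// Illustration only: hypothetical names, not the real ND4J classes.
public final class TrmvDispatchSketch {
    private TrmvDispatchSketch() {}

    public enum DataType { FLOAT, DOUBLE }

    /** Type-agnostic entry point that forwards to a typed routine. */
    public abstract static class Level2Sketch {
        public final String trmv(DataType type) {
            switch (type) {
                case FLOAT:
                    return strmv();
                case DOUBLE:
                    return dtrmv();
                default:
                    throw new IllegalArgumentException("Unsupported type: " + type);
            }
        }

        protected abstract String strmv();

        protected abstract String dtrmv();
    }

    /** CPU backend where only the double routine is wired up. */
    public static final class CpuLevel2Sketch extends Level2Sketch {
        @Override
        protected String strmv() {
            // Not linked to a native routine, so float callers fail here.
            throw new UnsupportedOperationException();
        }

        @Override
        protected String dtrmv() {
            // Stand-in for a call into the native cblas_dtrmv.
            return "cblas_dtrmv";
        }
    }

    public static void main(String[] args) {
        Level2Sketch level2 = new CpuLevel2Sketch();
        System.out.println(level2.trmv(DataType.DOUBLE));
        try {
            level2.trmv(DataType.FLOAT);
        } catch (UnsupportedOperationException e) {
            System.out.println("float trmv is unsupported in this sketch");
        }
    }
}
```

In a layout like this, the fix is to have the float branch forward to the single-precision native routine the same way the double branch does.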

I was able to make strmv work trivially by changing 2 lines.

Would you accept a PR to fill in all the BLAS and LAPACK routines you are missing? It would just be a simple attempt to link to the cblas_ methods.

I believe posv, geev, sygvd… were missing too. Could these be added as well, since OpenBLAS provides them?

Definitely thanks for offering!

I am trying to use a different BLAS shared object library as the CPU backend. I see that an OpenBLAS JavaCPP one is used by default. Could you summarise the steps to get an alternative one used instead? Does it require a recompile of ND4J, or can it be swapped out afterwards?

@cawthorne basically we can already dynamically link against two different implementations (MKL and OpenBLAS). More about that here:

For CUDA, you’ll end up needing to do something else. I wanted to check CUDA after we got the CPU PR right.

I want to get the Arm Performance Libraries working on the CPU. I will have a look at the link, thanks!

@cawthorne we might have to do that separately. We’re still working on optimizing our stack for ARM; see "Libnd4j and Pi ArmCompute additions. Strided and Padded NdArray" by quickwritereader (Pull Request #503, KonduitAI/deeplearning4j on GitHub).

We do have the OpenBLAS jars compiled for ARM. You can see an example of that here:
https://repo1.maven.org/maven2/org/bytedeco/openblas/0.3.10-1.5.4/

That will have to work for now, but I understand the use case.