How to work around missing features?

Hi,

ND4J is really useful in Java, but sometimes certain features that exist in NumPy seem to be missing (e.g. argsort). Is there any practical way to work around such gaps (e.g. a fallback to NumPy or something else)?

It’d be preferable to have a canonical mechanism rather than each user rolling their own ad-hoc solution.

Thanks!

There is no canonical fallback. If it were easy to use NumPy from Java, there wouldn’t be a need for ND4J.

If you are missing functionality and want to share the solution with the community, you are also always welcome to create pull requests for it :slight_smile:

I sometimes roll my own solutions, but I typically use ND4J from Maven rather than building from source - is there a way to contribute them without editing the ND4J code?

It seems that a lot of NumPy’s Python code is a wrapper around a C implementation (as is the case for argsort: https://github.com/numpy/numpy/blob/v1.18.1/numpy/core/fromnumeric.py#L997-L1105 ).
Is there a way to access these from Java (e.g. via JNI)?
It could provide a lot of functionality based on existing code.

If you absolutely want to do that, you actually can:

But be warned, it isn’t quite trivial to interop between Nd4j and numpy arrays.

Note that when @treo says “not trivial”, that doesn’t mean impossible.
Here’s the logic for it:


You’d basically have to have a numpy array in memory already.
You could use some mix of our python runner and this method to accomplish what you want.

Custom solutions are possible, but we generally have everything. Rather than just assuming it doesn’t exist, maybe try asking us? We know the docs need some work and are working on that (automatically generated ones, in fact) for all of the operations.

So if you can work with us a bit we probably can do what you need.
All I ask is that you be receptive if we suggest something. Generally you want to solve a problem a specific way because we don’t support something you have or you assume a feature doesn’t exist. Let us work together on a solution that fits a problem you have rather than a specific way of doing it?

Edit: Wrong operation. Misread.

He is talking about argsort:

numpy.argsort(a, axis=-1, kind=None, order=None)

Returns the indices that would sort an array.

Perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as a that index data along the given axis in sorted order.

And as far as I can tell we really don’t have that.
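
For the one-dimensional case, a plain-Java stand-in is short. The following is a minimal sketch, not ND4J code: it assumes the values have already been copied into a double[], and the names ArgSort and argsort are hypothetical. It is not tuned for large arrays, since it boxes every index.

```java
import java.util.Comparator;
import java.util.stream.IntStream;

// Hypothetical helper, not part of ND4J: a 1-D argsort on a plain double[].
final class ArgSort {

    private ArgSort() {
    }

    // Returns the indices that would sort `values` in ascending order.
    // Ties keep their original relative order, because sorting an ordered
    // stream is stable.
    static int[] argsort(double[] values) {
        return IntStream.range(0, values.length)
                .boxed()
                .sorted(Comparator.comparingDouble(i -> values[i]))
                .mapToInt(Integer::intValue)
                .toArray();
    }
}
```

For example, ArgSort.argsort(new double[]{3.0, 1.0, 2.0}) returns the indices 1, 2, 0. For a nearest-neighbor lookup, the same call on an array of distances gives the neighbors ordered from closest to farthest.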

Ah derp. Sorry misread. Just skimmed and read that as argmax :slight_smile:
I’ll edit my post, rest still stands.

Hi Adam,

Thanks for the informative explanations. I was indeed talking about argsort (specifically, I use it for nearest-neighbor calculations), which I could not find. I come from data science/machine learning, where NumPy is very common, so a lot of the implementations and suggestions we encounter are designed with that in mind. I have used ND4J for the past two years and have had a great experience with it, but sometimes we run into things that are more challenging. I wonder what would be a good way to address such cases: roll our own solution (probably not optimized, since I am not an expert in linear algebra)? Call NumPy?

Such a discussion could be useful for others who happen to run into a corner case.

PS: In our internal benchmarks ND4J was faster than NumPy, and distributing JARs was much easier than distributing NumPy packages when working with enterprise customers with mixed Windows/Linux environments.

Thanks a lot for replying. Sorry I didn’t see this soon enough. I’m glad we’re having this discussion for sure! I’d be happy to collect feedback from your team. Do you mind DMing me? I’d love to learn more about what your team is doing.

Thanks! I saw this just now. I’ll be happy to elaborate on our use case and the various solutions we use. I don’t seem to be able to message on this website, so you are welcome to message/email me: dopicar gmail