How to work around missing features?

tkorach · May 6, 2020, 1:54pm

Hi,

ND4J is really useful in Java, but sometimes certain features that exist in NumPy seem to be missing (e.g. argsort). Is there a practical way to overcome such a deficiency (e.g. a fallback to NumPy or something else)?

It’d be preferable to have a canonical mechanism rather than each user rolling their own ad-hoc solution.

Thanks!

treo · May 6, 2020, 2:06pm

There is no canonical fallback. If it were easy to use NumPy from Java, there wouldn’t be a need for ND4J.

If you are missing functionality and want to share your solution with the community, you are always welcome to create a pull request for it.

tkorach · May 6, 2020, 2:32pm

I sometimes roll my own solutions, but I typically use ND4J from Maven rather than building from source - is there a way to contribute them without editing the ND4J code?

It seems that a lot of NumPy’s Python code is a wrapper around a C implementation (as is the case for argsort: numpy/fromnumeric.py at v1.18.1 · numpy/numpy · GitHub).
Is there a way to access these from Java (e.g. via JNI)?
It could provide a lot of functionality based on existing code.

treo · May 6, 2020, 3:12pm

If you absolutely want to do that, you actually can:

But be warned, it isn’t quite trivial to interop between Nd4j and numpy arrays.

agibsonccc · May 6, 2020, 10:00pm

Note when @treo says “not trivial” it’s not impossible.
Here’s the logic for it:

github.com

eclipse/deeplearning4j/blob/7a203241058451098c7d0c631fa2689f27b6fb58/nd4j/nd4j-backends/nd4j-api-parent/nd4j-native-api/src/main/java/org/nd4j/nativeblas/BaseNativeNDArrayFactory.java

/*******************************************************************************
 * Copyright (c) 2015-2018 Skymind, Inc.
 *
 * This program and the accompanying materials are made available under the
 * terms of the Apache License, Version 2.0 which is available at
 * https://www.apache.org/licenses/LICENSE-2.0.
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations
 * under the License.
 *
 * SPDX-License-Identifier: Apache-2.0
 ******************************************************************************/

package org.nd4j.nativeblas;

import lombok.extern.slf4j.Slf4j;
import lombok.val;


You’d basically have to have a numpy array in memory already.
You could use some mix of our python runner and this method to accomplish what you want.

Solutions are possible, but we generally have everything. Rather than just assuming it doesn’t exist, maybe try asking us? We know the docs need some work and are working on that (automatically generated ones, in fact) for all of the operations.

So if you can work with us a bit we probably can do what you need.
All I ask is that you be receptive if we suggest something. Generally you want to solve a problem a specific way because we don’t support something you have or you assume a feature doesn’t exist. Let us work together on a solution that fits a problem you have rather than a specific way of doing it?

Edit: Wrong operation. Misread.

treo · May 7, 2020, 7:00am

He is talking about argsort:

numpy.argsort(a, axis=-1, kind=None, order=None)

Returns the indices that would sort an array.

Perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as a that index data along the given axis in sorted order.

And as far as I can tell we really don’t have that.
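
For readers who need this in the meantime, here is a minimal sketch of a 1-D argsort in plain Java. It is not ND4J code and not NumPy's implementation: the class name `ArgSortSketch` and the method `argsort` are hypothetical, and it works on a plain `double[]`, so getting the data out of an ND4J array is left to the reader. It only covers the ascending, single-axis case described above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of ND4J: a 1-D, ascending argsort on double[].
final class ArgSortSketch {

    /** Returns the indices that would sort {@code values} in ascending order. */
    static int[] argsort(double[] values) {
        // Sort boxed indices by the value each index points at.
        Integer[] boxed = new Integer[values.length];
        for (int i = 0; i < boxed.length; i++) {
            boxed[i] = i;
        }
        // Arrays.sort on objects is stable, so equal values keep their original order.
        Arrays.sort(boxed, (a, b) -> Double.compare(values[a], values[b]));

        // Unbox back into a primitive index array.
        int[] order = new int[boxed.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = boxed[i];
        }
        return order;
    }

    public static void main(String[] args) {
        double[] data = {3.0, 1.0, 2.0};
        System.out.println(Arrays.toString(argsort(data))); // prints [1, 2, 0]
    }
}
```

Boxing the indices costs some memory and speed, so treat this as a correctness-first fallback rather than an optimized routine.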

agibsonccc · May 7, 2020, 7:18am

Ah derp. Sorry, misread. Just skimmed and read that as argmax.
I’ll edit my post, rest still stands.

tkorach · May 7, 2020, 1:00pm

Hi Adam,

Thanks for the informative explanations. I was indeed talking about argsort (specifically, I use it for nearest-neighbor calculations), which I could not find. I come from data science/machine learning, where NumPy is very common, so a lot of the implementations and suggestions we encounter are designed with that in mind. I have used ND4J for the past two years and had a great experience with it, but sometimes we run into things that are more challenging. I wonder what would be a good way to address such cases: roll our own solution (probably not optimized, since I am not an expert in linear algebra)? Call NumPy?

Such discussion could be useful for others who happen to run into a corner case.

P.S. In our internal benchmarks ND4J was faster than NumPy, and distributing JARs was much easier than distributing NumPy packages when working with enterprise customers with mixed Windows/Linux environments.
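
As an illustration of the nearest-neighbor use case mentioned above, the following plain-Java sketch ranks points by squared Euclidean distance to a query and keeps the first `k` indices. Everything here is an assumption made for illustration: the class `NearestNeighborSketch`, the methods `argsort` and `kNearest`, and the choice of squared Euclidean distance are not taken from ND4J or from the poster's actual code.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical sketch, not part of ND4J: brute-force k-nearest neighbors via argsort.
final class NearestNeighborSketch {

    /** Indices that would sort {@code values} ascending (stable for ties). */
    static int[] argsort(double[] values) {
        return IntStream.range(0, values.length)
                .boxed()
                .sorted((a, b) -> Double.compare(values[a], values[b]))
                .mapToInt(Integer::intValue)
                .toArray();
    }

    /** Returns the indices of the {@code k} points closest to {@code query}. */
    static int[] kNearest(double[][] points, double[] query, int k) {
        double[] dist = new double[points.length];
        for (int i = 0; i < points.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < query.length; j++) {
                double d = points[i][j] - query[j];
                sum += d * d; // squared distance preserves the ranking
            }
            dist[i] = sum;
        }
        int[] order = argsort(dist);
        // Clamp k to the number of points available.
        return Arrays.copyOf(order, Math.min(k, order.length));
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {5, 5}, {1, 1}, {10, 10}};
        double[] query = {0.9, 0.9};
        System.out.println(Arrays.toString(kNearest(points, query, 2))); // prints [2, 0]
    }
}
```

This sorts all distances, which is O(n log n) per query; for large data sets a partial selection or a spatial index would likely be a better fit.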

agibsonccc · May 11, 2020, 9:37am

Thanks a lot for replying. Sorry I didn’t see this soon enough. I’m glad we’re having this discussion for sure! I’d be happy to collect feedback from your team, do you mind DMing me? I’d love to learn more about what your team is doing.

tkorach · May 18, 2020, 6:17pm

Thanks! Saw this just now. I’ll be happy to elaborate on our use case and the various solutions we use. I don’t seem to be able to message on this website, so you are welcome to message/email me at dopicar gmail
