Linux performance issues in a highly threaded environment

cgrabows · July 29, 2021, 1:45pm

Hi all,

I’m currently seeing performance issues that only arise in highly threaded environments. I’m using the Linux AVX512 library, version 1.0.0-beta7.

Within an individual thread, execution time measured with timers looks fine. However, the machine's overall CPU usage increases drastically while the network is running, and it appears to grow exponentially with workload. (Without the DL4J code snippet, CPU usage stays flat.)

I’m using ParallelInference to run requests through a small model with workers set equal to the number of cores on the server.

Given the above, I suspect this is related to increased GC overhead, or to some execution time added at the start of each thread that somehow escapes my timers. I’m leaning towards a memory / workspace issue, but am unsure how to debug it.
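One way to narrow this down is to compare wall-clock time against CPU and GC counters around the same call. The sketch below uses only the JDK's `java.lang.management` beans. The class name `OverheadProbe`, its helpers, and the dummy allocation loop are hypothetical and stand in for the real inference call. It assumes a HotSpot-style JVM where `com.sun.management.OperatingSystemMXBean` is available, and reports -1 where a counter is not.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;

public class OverheadProbe {

    /** Deltas observed while one piece of work ran. A value of -1 means "not available". */
    public static final class Sample {
        public final long wallNs;       // elapsed wall-clock time
        public final long threadCpuNs;  // CPU time of the calling Java thread only
        public final long processCpuNs; // CPU time of the whole JVM process, all threads
        public final long gcMs;         // GC time accumulated by all collectors
        public final long gcCount;      // number of collections

        Sample(long wallNs, long threadCpuNs, long processCpuNs, long gcMs, long gcCount) {
            this.wallNs = wallNs;
            this.threadCpuNs = threadCpuNs;
            this.processCpuNs = processCpuNs;
            this.gcMs = gcMs;
            this.gcCount = gcCount;
        }
    }

    /** Sum of GC time over all collectors, skipping collectors that report -1. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    /** Sum of collection counts over all collectors, skipping collectors that report -1. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Process-wide CPU time in nanoseconds, or -1 if this JVM does not expose it. */
    static long processCpuNanos() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuTime();
        }
        return -1L;
    }

    /** Runs the work on the calling thread and returns the counter deltas. */
    public static Sample measure(Runnable work) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuOk = threads.isCurrentThreadCpuTimeSupported();

        long gcMs0 = totalGcMillis();
        long gcN0 = totalGcCount();
        long proc0 = processCpuNanos();
        long cpu0 = cpuOk ? threads.getCurrentThreadCpuTime() : -1L;
        long wall0 = System.nanoTime();

        work.run();

        long wallNs = System.nanoTime() - wall0;
        long cpu1 = cpuOk ? threads.getCurrentThreadCpuTime() : -1L;
        long proc1 = processCpuNanos();

        long threadCpuNs = (cpu0 < 0 || cpu1 < 0) ? -1L : cpu1 - cpu0;
        long processCpuNs = (proc0 < 0 || proc1 < 0) ? -1L : proc1 - proc0;
        return new Sample(wallNs, threadCpuNs, processCpuNs,
                totalGcMillis() - gcMs0, totalGcCount() - gcN0);
    }

    public static void main(String[] args) {
        // Placeholder workload that only allocates garbage; replace with the real call.
        Sample s = measure(() -> {
            long checksum = 0;
            for (int i = 0; i < 200_000; i++) {
                byte[] junk = new byte[1024];
                junk[0] = (byte) i;
                checksum += junk[0];
            }
            if (checksum == Long.MIN_VALUE) System.out.println(checksum); // keep the loop live
        });
        System.out.printf("wall=%dns threadCpu=%dns processCpu=%dns gc=%dms gcCount=%d%n",
                s.wallNs, s.threadCpuNs, s.processCpuNs, s.gcMs, s.gcCount);
    }
}
```

If `processCpuNs` is far larger than `threadCpuNs` for a single request, the extra CPU is being spent on threads other than the caller, which a per-thread timer would not capture. If `gcMs` and `gcCount` climb with load, that points towards allocation pressure. Both counters are process-wide, so concurrent requests will overlap in them; the comparison is cleanest when run with one request in flight and then again under load.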

Please let me know what other information would help debug this. Thanks in advance!

agibsonccc · July 29, 2021, 1:55pm

@cgrabows could you let us know your numbers with the newest 1.0.0-M1.1? There are more classifiers to choose from as well. See release notes:

You can find available classifiers here:
https://repo1.maven.org/maven2/org/nd4j/nd4j-native/1.0.0-M1.1/

You can also find new docs here:
