Can't visualize training process

Hello! I’m trying to visualize model training, but on http://localhost:9000/train/overview I don’t see any changes, just blank sections for graphs without any useful info. All I was able to see is the diagram with layers on /train/model, but again without any graphs.
Here’s how I set up the listener:

UIServer uiServer = UIServer.getInstance();
StatsStorage statsStorage = new InMemoryStatsStorage();
uiServer.attach(statsStorage);
MultiLayerNetwork model = ..... //create model, LSTM+RNN output layers
model.setListeners(new StatsListener(statsStorage), new ScoreIterationListener(10));
SequenceRecordReaderDataSetIterator trainDatasetIterator = .... // create iterator, regression=true
model.fit(trainDatasetIterator, 5);

And this dependency is in my pom.xml:

<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-ui</artifactId>
    <version>1.0.0-beta7</version>
</dependency>

What could be the reason that I can’t see any output in the graphs?

my question on Gitter

uiServer.start();

org.deeplearning4j.ui.api.UIServer doesn’t have a start() method.

VertxUIServer uiServer = VertxUIServer.getInstance();
uiServer.start();

This throws a NullPointerException because VertxUIServer.getInstance() returns null. I tried this:

VertxUIServer uiServer = new VertxUIServer();
uiServer.start();

instead of

UIServer uiServer = UIServer.getInstance();

but the UI server won’t start at all; I can’t access it at http://localhost:9000
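
For anyone trying to narrow this down: one quick check is whether anything is listening on the port at all, independent of the browser. Below is a minimal sketch using only the JDK. The names `PortProbe` and `isListening` are made up for illustration and are not part of DL4J.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable
            return false;
        }
    }

    public static void main(String[] args) {
        // 9000 is the port used throughout this thread
        System.out.println("localhost:9000 listening: " + isListening("localhost", 9000, 1000));
    }
}
```

If this reports that the port is open while the page still loads forever, the server is accepting connections but not answering requests, which is a different problem from the server never starting.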

@miterial Can you try to run this? deeplearning4j-examples/BasicUIExample.java at master · eclipse/deeplearning4j-examples · GitHub

Just see what it does out of the box. If our defaults are broken, then we can take a closer look.

Thank you for suggesting the official example, but I think it doesn’t work for me either. I downloaded the full repository, opened it in IntelliJ, ran BasicUIExample, opened http://localhost:9000, waited for a while, and saw only infinite page loading while the model was training; then the app closed.

I also tried UIStorageExample (with collectStats=true and then false), with the same result.

@miterial OK, if the default example isn’t working we can have a discussion. Could you clarify a bit about your environment and how you’re running it? Browser and stack traces on both the front end and the backend would really help.

Are you seeing anything in the Chrome dev tools?

This is the log when I run BasicUIExample (I start it using the Run button in IntelliJ IDEA):

"D:\Program Files\JetBrains\IntelliJ IDEA 2019.2.4\jbr\bin\java.exe" "-javaagent:D:\Program Files\JetBrains\IntelliJ IDEA 2019.2.4\lib\idea_rt.jar=54347:D:\Program Files\JetBrains\IntelliJ IDEA 2019.2.4\bin" -Dfile.encoding=UTF-8 -classpath C:\Users\XXX\AppData\Local\Temp\classpath1744542958.jar org.deeplearning4j.examples.quickstart.features.userinterface.BasicUIExample
o.n.l.f.Nd4jBackend - Loaded [CpuBackend] backend
o.n.n.NativeOpsHolder - Number of threads used for linear algebra: 4
o.n.l.c.n.CpuNDArrayFactory - *********************************** CPU Feature Check Warning ***********************************
o.n.l.c.n.CpuNDArrayFactory - Warning: Initializing ND4J with Generic x86 binary on a CPU with AVX/AVX2 support
o.n.l.c.n.CpuNDArrayFactory - Using ND4J with AVX/AVX2 will improve performance. See deeplearning4j.org/cpu for more details
o.n.l.c.n.CpuNDArrayFactory - Or set environment variable ND4J_IGNORE_AVX=true to suppress this warning
o.n.l.c.n.CpuNDArrayFactory - *************************************************************************************************
o.n.n.Nd4jBlas - Number of threads used for OpenMP BLAS: 4
o.n.l.a.o.e.DefaultOpExecutioner - Backend used: [CPU]; OS: [Windows 10]
o.n.l.a.o.e.DefaultOpExecutioner - Cores: [8]; Memory: [2.0GB];
o.n.l.a.o.e.DefaultOpExecutioner - Blas vendor: [OPENBLAS]
o.d.n.m.MultiLayerNetwork - Starting MultiLayerNetwork with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
o.d.u.VertxUIServer - Deeplearning4j UI server started at: http://localhost:9000
o.d.u.VertxUIServer - StatsStorage instance attached to UI: FileStatsStorage(C:\Users\XXX\AppData\Local\Temp\ui-stats.dl4j)
o.d.u.VertxUIServer - Deeplearning4j UI server is auto-stopping after thread (name: main) died.
o.d.u.VertxUIServer - Deeplearning4j UI server stopped.

Process finished with exit code 0

After the log line “StatsStorage instance attached to UI” I open http://localhost:9000; there is nothing in the browser console, and the Network tab shows this:

  VertxUIServer uiServer = VertxUIServer.getInstance(9000, false, null);
  uiServer.start();

This works for me.

@SidneyLann thanks for chipping in!

@miterial
This stands out for me:

o.d.u.VertxUIServer - Deeplearning4j UI server is auto-stopping after thread (name: main) died.

I wonder why the thread dies. We need to dig in to that a bit.

The main thread died, so the UI thread stopped! The file path may not exist.
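
To make the “auto-stopping after thread (name: main) died” log line more concrete, here is a rough sketch of the general pattern: a watcher thread joins another thread and shuts something down once that thread ends. This is not DL4J’s actual code. `AutoStopSketch`, `watch`, and the `running` flag are hypothetical stand-ins for illustration only.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class AutoStopSketch {
    /** Starts a daemon watcher that sets 'running' to false once 'watched' has terminated. */
    static Thread watch(Thread watched, AtomicBoolean running) {
        Thread watcher = new Thread(() -> {
            try {
                watched.join();               // blocks until the watched thread has died
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;                       // interrupted: leave the flag untouched
            }
            running.set(false);               // stand-in for "stop the server"
        }, "auto-stop-watcher");
        watcher.setDaemon(true);
        watcher.start();
        return watcher;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean running = new AtomicBoolean(true);
        Thread worker = new Thread(() -> { /* pretend training finishes right away */ }, "worker");
        worker.start();                       // start first: join() on an unstarted thread returns at once
        Thread watcher = watch(worker, running);
        watcher.join();
        System.out.println("server running: " + running.get());
    }
}
```

With a pattern like this, a watched thread that finishes normally triggers the same shutdown as one that crashes, so the log line on its own does not say which of the two happened.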

If I put Scanner sc = new Scanner(System.in); sc.nextLine(); after the training is done, it prevents the thread from dying but the page is still not loaded.

@SidneyLann VertxUIServer.getInstance(9000, false, null) gives the same result as UIServer.getInstance()

  VertxUIServer uiServer = VertxUIServer.getInstance(Integer.valueOf(port), false, null);
  uiServer.start();
  StatsStorage statsStorage = new InMemoryStatsStorage();
  uiServer.attach(statsStorage);
  net.setListeners(new StatsListener(statsStorage));

Try it in memory first.

  VertxUIServer uiServer = VertxUIServer.getInstance(9000, false, null);
  uiServer.start();
  StatsStorage statsStorage = new InMemoryStatsStorage();
  uiServer.attach(statsStorage);
  net.setListeners(new StatsListener(statsStorage));

Try it in memory.

Thanks for the full example, but the result is an unresponsive page anyway.

And if I go back to my initial example, the page loads but nothing is displayed:

Maybe something is wrong with my training data? What should I take a look at to check why values are not plotted?

@miterial No, the main thread shouldn’t be dying. Something is triggering it and I’m not sure what. Out of memory? I would need more information or something I can use to reproduce it. As @SidneyLann is showing, the UI works fine.
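
One way to gather that information is to log the JVM’s max heap and install a default uncaught-exception handler before training starts, so that an OutOfMemoryError or any other throwable that kills a thread ends up in the console. This is a JDK-only sketch. `MainThreadDiagnostics`, `install`, and `toMegabytes` are hypothetical names, and the training code itself is left out.

```java
class MainThreadDiagnostics {
    /** Formats a byte count as whole megabytes for logging. */
    static String toMegabytes(long bytes) {
        return (bytes / (1024 * 1024)) + " MB";
    }

    /** Prints the max heap and logs any throwable that terminates a thread uncaught. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            System.err.println("Thread '" + thread.getName() + "' died with: " + error);
            error.printStackTrace();
        });
        System.err.println("Max heap: " + toMegabytes(Runtime.getRuntime().maxMemory()));
    }

    public static void main(String[] args) {
        install();
        // ... build the model, attach the UI, and call fit() here ...
        System.err.println("main finished normally");
    }
}
```

If “main finished normally” is printed and the handler never fires, the main thread simply reached the end of `main` rather than crashing.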

What other information do I need to provide so that you could reproduce it?

P.S. I’m not the only one who has this problem: DL4J UI can fail to launch under some circumstances · Issue #9011 · eclipse/deeplearning4j · GitHub

@miterial I don’t understand how you got from the unresponsive UI (instant auto-stop) to the loaded page. You can upload your code as a GitHub Gist.
As you said, instant auto-stop is a known issue in the 1.0.0-beta7 release.
However, it would be interesting to see whether the code that produced the unresponsive UI gives you an exception different from the one I got in the above issue when you use the 1.0.0-SNAPSHOT version of DL4J in the pom.xml. The UI server would stop a bit later, in a shutdown hook, and may produce more useful info this way.