"hello world 2020": muzero like training

Suppose you have three networks n1, n2, n3, each defined to your needs with a ComputationGraphConfiguration such that "feedForward" and "fit" work on each network by itself.

If you now want to do a MuZero-like (training) forward pass with inputs (i1, …, iK) producing outputs (o1, …, oK)

o1 = n2(s1); s1 = n1(i1)
o2 = n2(s2); s2 = n3(i2, s1)
…
oK = n2(sK); sK = n3(iK, s(K-1))

and then train the combined network against the targets (t1, …, tK).
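To make the chaining above concrete, here is a minimal plain-Java sketch of the unrolled forward pass. The linear functions standing in for n1 (representation), n2 (prediction), and n3 (dynamics) are arbitrary placeholders, not DL4J networks, and the input values are made up for illustration:

```java
// Structural sketch of the unrolled MuZero-like forward pass.
// Plain doubles stand in for tensors; n1/n2/n3 are toy placeholder functions.
class UnrolledForward {
    static double n1(double i)           { return 2.0 * i;       } // representation (toy)
    static double n2(double s)           { return s + 1.0;       } // prediction (toy)
    static double n3(double i, double s) { return i + 0.5 * s;   } // dynamics (toy)

    static double[] forward(double[] inputs) {
        double[] out = new double[inputs.length];
        double s = n1(inputs[0]);          // s1 = n1(i1)
        out[0] = n2(s);                    // o1 = n2(s1)
        for (int k = 1; k < inputs.length; k++) {
            s = n3(inputs[k], s);          // sk = n3(ik, s(k-1))
            out[k] = n2(s);                // ok = n2(sk)
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [3.0, 4.0, 5.5] for these toy functions and inputs
        System.out.println(java.util.Arrays.toString(forward(new double[]{1.0, 2.0, 3.0})));
    }
}
```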

How to do this “hello world 2020” with DL4J?

Side condition: it should be optimized to run on a high-end CUDA device
(MuZero … with Google's 16 training TPUs in mind).

I wouldn't call this a "hello world": what you are trying to do requires quite a bit of understanding in any case, and "hello world" is usually used just to test whether your environment is set up correctly.

Anyway, there are multiple ways you could approach that.

  1. You can use DL4J with external errors, i.e. you perform the chaining yourself and use the external-errors approach to pass the gradients between the networks.

  2. You use SameDiff and define everything in one computation graph.

Both approaches have their benefits and problems. I would probably start out with the external-errors way, as SameDiff isn't documented as well as it deserves at the moment. But if you are feeling adventurous and want to try SameDiff, I suggest you start from this example: SameDiffMNISTTrainingExample
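A minimal sketch of what option 1 means mechanically, again with toy scalar stand-ins rather than real DL4J networks (the parameter values, inputs, and targets below are invented for illustration): you run the forward chain yourself, then walk backwards through the steps, where the gradient of each state sk collects the error coming from the n2 branch plus the error handed back from step k+1 through n3. In DL4J the same reverse walk would feed these per-step epsilons into each network as external errors; here the chaining logic itself is shown with hand-derived gradients and checked numerically:

```java
// External-errors chaining sketch: n1(i)=a*i, n2(s)=b*s, n3(i,s)=c*i+d*s,
// loss L = sum_k (ok - tk)^2. All values are toy placeholders.
class ExternalErrorsSketch {
    static double a = 0.7, b = 1.3, c = 0.4, d = 0.9;   // toy "weights"
    static double[] I = {1.0, -2.0, 0.5};               // toy inputs i1..iK
    static double[] T = {0.5,  0.0, 1.0};               // toy targets t1..tK

    static double loss() {
        double s = a * I[0], L = 0;
        for (int k = 0; k < I.length; k++) {
            if (k > 0) s = c * I[k] + d * s;            // sk = n3(ik, s(k-1))
            double e = b * s - T[k];                    // ok - tk, with ok = n2(sk)
            L += e * e;
        }
        return L;
    }

    // dL/dd via reverse chaining -- the "external errors" walk.
    static double gradD() {
        int K = I.length;
        double[] s = new double[K];
        s[0] = a * I[0];
        for (int k = 1; k < K; k++) s[k] = c * I[k] + d * s[k - 1];

        double gd = 0, ds = 0;                          // ds accumulates dL/dsk
        for (int k = K - 1; k >= 0; k--) {
            ds += 2 * (b * s[k] - T[k]) * b;            // epsilon from the n2 branch
            if (k > 0) {
                gd += ds * s[k - 1];                    // parameter gradient inside n3
                ds *= d;                                // epsilon passed back to s(k-1)
            }
        }
        return gd;
    }

    public static void main(String[] args) {
        // Sanity check against a central finite difference.
        double d0 = d, h = 1e-6, g = gradD();
        d = d0 + h; double lp = loss();
        d = d0 - h; double lm = loss();
        d = d0;
        System.out.printf("analytic=%.6f numeric=%.6f%n", g, (lp - lm) / (2 * h));
    }
}
```

The point of the sketch is the loop order: each state gradient is complete only after both the prediction branch and the following dynamics step have contributed, which is exactly what you would have to orchestrate by hand when wiring three separate ComputationGraphs together with external errors.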

Thanks a lot for your instant reply with the helpful starting points … I will have a look at the examples and the source code …

"hello world" was a little provocative, but meant sincerely in the sense that in some years we might see the simple but powerful MuZero algorithm as a "hello" to the world of planning agents. In my perception it is therefore not a corner case, and worth asking for a state-of-the-art solution with DL4J.

Looking at the examples for SameDiff, I found the following note in a README dated 07/09/2019:

"Currently, most ops in SameDiff execute on CPU only - GPU support for all ops is in the process of being implemented and will be available in a future release."

For me, high-performance CUDA support is mandatory. That seems to rule out SameDiff for me for now.

That README is out of date. As of beta5, most of the SameDiff ops should support CUDA; see the release log.

Thanks a lot for that info. Very cool. I will dive into SameDiff.