"hello world 2020": muzero like training

Suppose you have three networks n1, n2, n3, each defined to your needs with a ComputationGraphConfiguration such that "feedForward" and "fit" work on each network by itself.

If you now want to do a MuZero-like (training) forward pass with inputs (i1, …, iK) producing outputs (o1, …, oK)

o1 = n2(s1); s1 = n1(i1)
o2 = n2(s2); s2 = n3(i2, s1)
…
oK = n2(sK); sK = n3(iK, s(K-1))

and then train the combined network against the targets (t1, …, tK).
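To make the chaining above concrete, here is a minimal plain-Java sketch of the unrolled forward pass. The linear functions standing in for n1 (representation), n2 (prediction), and n3 (dynamics) are arbitrary placeholders, not DL4J networks, and the input values are made up for illustration:

```java
// Structural sketch of the unrolled MuZero-like forward pass.
// Plain doubles stand in for tensors; n1/n2/n3 are toy placeholder functions.
class UnrolledForward {
    static double n1(double i)           { return 2.0 * i;       } // representation (toy)
    static double n2(double s)           { return s + 1.0;       } // prediction (toy)
    static double n3(double i, double s) { return i + 0.5 * s;   } // dynamics (toy)

    static double[] forward(double[] inputs) {
        double[] out = new double[inputs.length];
        double s = n1(inputs[0]);          // s1 = n1(i1)
        out[0] = n2(s);                    // o1 = n2(s1)
        for (int k = 1; k < inputs.length; k++) {
            s = n3(inputs[k], s);          // sk = n3(ik, s(k-1))
            out[k] = n2(s);                // ok = n2(sk)
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [3.0, 4.0, 5.5] for these toy functions and inputs
        System.out.println(java.util.Arrays.toString(forward(new double[]{1.0, 2.0, 3.0})));
    }
}
```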

How to do this “hello world 2020” with DL4J?

Side condition: it should be optimized to run on a high-end CUDA device
(MuZero … with Google's 16 training TPUs in mind).

I wouldn't call this a "hello world": what you are trying to do requires quite a bit of understanding in any case, and "hello world" is usually used just to test whether your environment is set up correctly.

Anyway, there are multiple ways you could approach that.

  1. You can use DL4J with external errors, i.e. you perform the chaining yourself and use the external-errors approach to pass the gradients between the networks.

  2. You use SameDiff and define everything in one computation graph.

Both approaches have their benefits and problems. I would probably start out with the external-errors way, as SameDiff isn't documented as well as it deserves at the moment. But if you are feeling adventurous and want to try SameDiff, I suggest you start from this example: SameDiffMNISTTrainingExample
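A minimal sketch of what option 1 means mechanically, again with toy scalar stand-ins rather than real DL4J networks (the parameter values, inputs, and targets below are invented for illustration): you run the forward chain yourself, then walk backwards through the steps, where the gradient of each state sk collects the error coming from the n2 branch plus the error handed back from step k+1 through n3. In DL4J the same reverse walk would feed these per-step epsilons into each network as external errors; here the chaining logic itself is shown with hand-derived gradients and checked numerically:

```java
// External-errors chaining sketch: n1(i)=a*i, n2(s)=b*s, n3(i,s)=c*i+d*s,
// loss L = sum_k (ok - tk)^2. All values are toy placeholders.
class ExternalErrorsSketch {
    static double a = 0.7, b = 1.3, c = 0.4, d = 0.9;   // toy "weights"
    static double[] I = {1.0, -2.0, 0.5};               // toy inputs i1..iK
    static double[] T = {0.5,  0.0, 1.0};               // toy targets t1..tK

    static double loss() {
        double s = a * I[0], L = 0;
        for (int k = 0; k < I.length; k++) {
            if (k > 0) s = c * I[k] + d * s;            // sk = n3(ik, s(k-1))
            double e = b * s - T[k];                    // ok - tk, with ok = n2(sk)
            L += e * e;
        }
        return L;
    }

    // dL/dd via reverse chaining -- the "external errors" walk.
    static double gradD() {
        int K = I.length;
        double[] s = new double[K];
        s[0] = a * I[0];
        for (int k = 1; k < K; k++) s[k] = c * I[k] + d * s[k - 1];

        double gd = 0, ds = 0;                          // ds accumulates dL/dsk
        for (int k = K - 1; k >= 0; k--) {
            ds += 2 * (b * s[k] - T[k]) * b;            // epsilon from the n2 branch
            if (k > 0) {
                gd += ds * s[k - 1];                    // parameter gradient inside n3
                ds *= d;                                // epsilon passed back to s(k-1)
            }
        }
        return gd;
    }

    public static void main(String[] args) {
        // Sanity check against a central finite difference.
        double d0 = d, h = 1e-6, g = gradD();
        d = d0 + h; double lp = loss();
        d = d0 - h; double lm = loss();
        d = d0;
        System.out.printf("analytic=%.6f numeric=%.6f%n", g, (lp - lm) / (2 * h));
    }
}
```

The point of the sketch is the loop order: each state gradient is complete only after both the prediction branch and the following dynamics step have contributed, which is exactly what you would have to orchestrate by hand when wiring three separate ComputationGraphs together with external errors.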

Thanks a lot for your instant reply with the helpful starting points … I will have a look at the examples and the source code …

"hello world" was a little provocative, but meant sincerely in the sense that in some years we might see the simple but powerful MuZero algorithm as a "hello" to the world of planning agents. In my perception it is therefore not a corner case, and worth asking for a state-of-the-art solution with DL4J.

Looking at the examples for SameDiff, I found the following note in a README dated 07/09/2019:

"Currently, most ops in SameDiff execute on CPU only - GPU support for all ops is in the process of being implemented and will be available in a future release."

For me, high-performance CUDA support is mandatory. That seems to rule out SameDiff for me for now.

That README is out of date. As of beta5, most of the SameDiff ops should support CUDA; see the release log.

Thanks a lot for that info. Very cool. I will dive into SameDiff.