Suppose you have three networks n1, n2, n3, each defined to your needs with a ComputationGraphConfiguration, such that `feedForward` and `fit` work on each network individually.
If you now want to run a MuZero-like (training) forward pass with inputs (i1, …, iK) producing outputs (o1, …, oK):

s1 = n1(i1);         o1 = n2(s1)
s2 = n3(i2, s1);     o2 = n2(s2)
…
sK = n3(iK, s(K-1)); oK = n2(sK)

and then train the combined network against the targets (t1, …, tK).
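To make the recursion concrete, here is a minimal sketch of the unrolled forward pass, with the three networks replaced by toy scalar functions (hypothetical placeholders, not DL4J calls): n1 plays the role of the representation network, n3 the dynamics network, and n2 the prediction network.

```java
// Sketch of the unroll structure only; the real n1/n2/n3 would be
// DL4J ComputationGraphs operating on INDArrays, not scalar functions.
public class UnrollSketch {
    // n1: initial latent state from the first input (toy stand-in)
    static double n1(double i) { return 2 * i; }
    // n3: dynamics — next latent state from current input and previous state
    static double n3(double i, double s) { return i + s; }
    // n2: prediction — output from a latent state
    static double n2(double s) { return s + 1; }

    public static void main(String[] args) {
        double[] inputs = {1.0, 2.0, 3.0};   // (i1, ..., iK)
        int K = inputs.length;
        double[] outputs = new double[K];    // (o1, ..., oK)

        double s = n1(inputs[0]);            // s1 = n1(i1)
        outputs[0] = n2(s);                  // o1 = n2(s1)
        for (int k = 1; k < K; k++) {
            s = n3(inputs[k], s);            // sk = n3(ik, s(k-1))
            outputs[k] = n2(s);              // ok = n2(sk)
        }
        for (double o : outputs) {
            System.out.println(o);
        }
    }
}
```

The open question is how to express exactly this unroll in DL4J so that gradients flow back through all K steps into the shared parameters of n1, n2, and n3.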
How to do this “hello world 2020” with DL4J?
Side condition: it should be optimized to run on a high-end CUDA device
(MuZero … with Google’s 16 training TPUs in mind).
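For the CUDA side, the usual DL4J route (as I understand it; please correct me if the coordinates below are wrong for the current release) is to swap the ND4J backend dependency from `nd4j-native-platform` to the CUDA artifact matching the installed toolkit, e.g. for DL4J 1.0.0-beta7 with CUDA 10.2:

```xml
<!-- Assumed coordinates; check the ND4J release notes for the artifact
     matching your CUDA toolkit version and DL4J release. -->
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-cuda-10.2-platform</artifactId>
    <version>1.0.0-beta7</version>
</dependency>
```

Is anything beyond the backend swap (workspaces, cuDNN, batching strategy) needed to make the unrolled training loop above efficient on a single high-end GPU?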