Recreating a network with "Residual Blocks"


I’m recreating a network based on a paper (viewable here) to help solve a Rubik’s cube. The network takes a series of one-hot encoded inputs (one set for each tile) and estimates the amount of work still needed to solve the cube.

I can’t find the relevant source code, but the description is:

The first two hidden layers of the DNNs have sizes
of 5,000 and 1,000, respectively, with full connectivity. These are then followed
by four residual blocks, where each residual block has two hidden layers of size
1,000. Finally, the output layer consists of a single linear unit representing the
cost-to-go estimate (Supplementary Fig. 3). We used batch normalization [40] and rectified
linear activation functions in all hidden layers. The DNN was trained with a batch
size of 10,000, optimized with ADAM [42], and did not use any regularization. The
maximum number of random moves applied to any training state K was set to 30.
The error threshold ε was set to 0.05.

I get the fully connected layers, but can’t see any way to create “residual blocks” (based on the descriptions I’ve found on the internet). Will I have to use a ComputationGraph rather than a MultiLayerNetwork?



The paper links to this codeocean environment:

There, in code/ml_utils/, lines 238 to 325 contain the definition of the neural network (or rather lines 238 to 271, if you disregard everything else that TensorFlow needs to set up for training).

They use their own layer definitions, though, so you will have to look at tensorflow_utils/, lines 290 to 305, to see how they define their residual blocks.
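
For orientation, here is a rough sketch of what a residual block of the kind the paper describes computes: two hidden layers of the same width, with the block's input added back in through a skip connection. This is not the authors' code. The helper names (dense, residual_block, random_layer) are made up for illustration, batch normalization is left out to keep it short, the width is shrunk from 1,000 to 8, and applying the final ReLU after the addition is an assumption that should be checked against their definition.

```python
import random


def relu(v):
    """Rectified linear activation, applied element-wise."""
    return [max(0.0, x) for x in v]


def dense(v, weights, bias):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def residual_block(v, w1, b1, w2, b2):
    """Two same-width hidden layers plus a skip connection (no batch norm)."""
    hidden = relu(dense(v, w1, b1))
    out = dense(hidden, w2, b2)
    # Skip connection: add the block's input to its output.
    # Assumption: the final ReLU comes after the addition.
    return relu([o + x for o, x in zip(out, v)])


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: small random weights and a zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
width = 8  # stand-in for the paper's hidden size of 1,000
x = [rng.uniform(0.0, 1.0) for _ in range(width)]

# Four residual blocks in a row, as in the paper's description.
for _ in range(4):
    w1, b1 = random_layer(width, width, rng)
    w2, b2 = random_layer(width, width, rng)
    x = residual_block(x, w1, b1, w2, b2)

print(len(x))  # the width is unchanged, which is what makes the addition possible
```

The addition is the part a purely sequential stack of layers cannot express, since the block's input has to be routed around the two hidden layers. In DL4J that kind of wiring is typically done in a ComputationGraph, with an ElementWiseVertex using the Add op to merge the two paths.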

Thanks - I’d downloaded the environment and spent some time looking through it. Can’t believe I scrolled straight past it.