Recreating a network with "Residual Blocks"

AISandbox.dev · May 23, 2020, 11:53am

Hi,

I’m recreating a network from a paper (Solving the Rubik’s cube with deep reinforcement learning and search, Nature Machine Intelligence) to help solve a Rubik’s Cube. The network takes a series of one-hot encoded inputs (one set for each tile) and estimates the amount of work still needed to solve the cube.

I can’t find the relevant source code, but the description is:

The first two hidden layers of the DNNs have sizes
of 5,000 and 1,000, respectively, with full connectivity. These are then followed
by four residual blocks, where each residual block has two hidden layers of size
1,000. Finally, the output layer consists of a single linear unit representing the cost-to-go
estimate (Supplementary Fig. 3). We used batch normalization and rectified
linear activation functions in all hidden layers. The DNN was trained with a batch
size of 10,000, optimized with ADAM, and did not use any regularization. The
maximum number of random moves applied to any training state K was set to 30.
The error threshold ε was set to 0.05.

I understand the fully connected layers, but I can’t see any way to create “residual blocks” (based on the descriptions I’ve found online). Will I have to use a ComputationGraph rather than a MultiLayerNetwork?

Thanks

Graham

treo · May 24, 2020, 8:09pm

The paper links to this codeocean environment:
https://codeocean.com/capsule/5723040/tree/v1

There, in code/ml_utils/nnet_utils.py, lines 238 to 325 contain the definition of the neural network (or really just up to line 271, if you disregard the rest of the setup that TensorFlow needs for training).

They use their own layer definitions, though, so you will have to look at tensorflow_utils/layers.py, lines 290 to 305, to see how they define their residual blocks.
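
For orientation, here is a minimal plain-Java sketch of what a residual block with two hidden layers computes. It is not taken from the paper's code: the ordering (dense, ReLU, dense, add the block input, ReLU) is an assumption, batch normalization is left out to keep it short, and the class and method names (ResidualBlockSketch, dense, relu, residualBlock) are made up for illustration. The width is 3 here rather than 1,000. Because the block's output depends on both its input and the output of its second layer, a strictly sequential stack cannot express it; in DL4J that usually means a ComputationGraph, for example with an ElementWiseVertex using Op.Add for the skip connection.

```java
import java.util.Arrays;

/** Illustrative residual block forward pass in plain Java (no DL4J, no batch norm). */
public class ResidualBlockSketch {

    /** Fully connected layer: out[j] = b[j] + sum over i of x[i] * w[i][j]. */
    public static double[] dense(double[] x, double[][] w, double[] b) {
        double[] out = b.clone();
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < out.length; j++) {
                out[j] += x[i] * w[i][j];
            }
        }
        return out;
    }

    /** Element-wise rectified linear activation. */
    public static double[] relu(double[] x) {
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = Math.max(0.0, x[i]);
        }
        return out;
    }

    /**
     * Assumed ordering: dense -> ReLU -> dense -> add block input -> ReLU.
     * Both layers must keep the input width so the addition is well defined.
     */
    public static double[] residualBlock(double[] x,
                                         double[][] w1, double[] b1,
                                         double[][] w2, double[] b2) {
        double[] hidden = relu(dense(x, w1, b1));
        double[] branch = dense(hidden, w2, b2);
        double[] sum = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            sum[i] = x[i] + branch[i]; // the skip connection
        }
        return relu(sum);
    }

    public static void main(String[] args) {
        double[] x = {1.0, -2.0, 3.0};
        double[][] zeroWeights = new double[3][3];
        double[] zeroBias = new double[3];
        // With all-zero weights the branch contributes nothing,
        // so the block reduces to relu(x).
        double[] y = residualBlock(x, zeroWeights, zeroBias, zeroWeights, zeroBias);
        System.out.println(Arrays.toString(y)); // prints [1.0, 0.0, 3.0]
    }
}
```

The zero-weight case in main shows why these blocks are easy to stack: when the two layers contribute nothing, the input passes through the skip connection unchanged apart from the final ReLU. Check the exact ordering and the placement of batch normalization against layers.py before porting.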

AISandbox.dev · May 24, 2020, 8:50pm

Thanks - I’d downloaded the environment and spent some time looking through it. I can’t believe I scrolled straight past it.

Graham
