The input to my convolution layer has shape BATCH_SIZE x 11 x 9 x 9, i.e. 11 feature planes over a 9x9 chessboard. I’d like to describe how the data is represented in my simulation (in heap memory) and ask for advice on how best to pass it to the neural net for inference.

The first feature plane is derived from values bit-packed into a Java array of 9 ints: each int uses 18 bits to hold 9 two-bit values in the range 0-3, so the 9-int array holds 81 values in total. In the feature plane, I want 0 where the simulation has 0, and 1 where the simulation has a nonzero value.
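A minimal sketch of unpacking this first plane in plain Java. The bit layout is an assumption, since it isn't fixed above: `rows[r]` is taken to hold board row `r`, with column `c` in bits `2c` and `2c+1`. The class name `OccupancyPlane` is hypothetical.

```java
final class OccupancyPlane {
    /**
     * Unpacks 9 ints (2 bits per square) into a flat 9x9 plane of 0s and 1s.
     * ASSUMPTION: rows[r] is board row r, column c occupies bits 2c and 2c+1.
     */
    static float[] unpack(int[] rows) {
        float[] plane = new float[81];
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                int value = (rows[r] >>> (2 * c)) & 0b11; // value in 0-3
                plane[r * 9 + c] = (value != 0) ? 1f : 0f; // collapse to 0/1
            }
        }
        return plane;
    }
}
```

The plane is stored row-major in a flat array so it can later be copied into a larger buffer without reshaping.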

The next seven features represent pieces on the board; there are seven different pieces, and I don’t think I need another plane to represent empty squares. In the simulation, again it’s an array of 9 ints, this time each using 27 bits, 3 per square, to represent empty or one of the seven pieces. In the neural net, these should become seven separate feature planes of 0s and 1s.
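A minimal sketch of the one-hot expansion for the piece planes, again in plain Java. It assumes column `c` of row `r` sits in bits `3c` to `3c+2` of `rows[r]`, that code 0 means empty, and that codes 1-7 map to planes 0-6. None of that is specified above, and `PiecePlanes` is a hypothetical name.

```java
final class PiecePlanes {
    /**
     * Expands 9 ints (3 bits per square) into seven flat 9x9 planes of 0s and 1s.
     * ASSUMPTION: column c occupies bits 3c..3c+2; code 0 = empty,
     * code k (1-7) = piece k, written to planes[k - 1].
     */
    static float[][] unpack(int[] rows) {
        float[][] planes = new float[7][81];
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                int code = (rows[r] >>> (3 * c)) & 0b111;
                if (code != 0) {
                    planes[code - 1][r * 9 + c] = 1f; // empty squares stay 0 in every plane
                }
            }
        }
        return planes;
    }
}
```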

The last three features are each derived from a single Java int, in which 18 bits hold 9 two-bit values, each in the range 0-2. In the first of these three planes, each of the 9 values fills an entire row, like:

```
aaaaaaaaa
bbbbbbbbb
ccccccccc
ddddddddd
eeeeeeeee
fffffffff
ggggggggg
hhhhhhhhh
iiiiiiiii
```

In the second, each value fills an entire column, like:

```
jklmnopqr
jklmnopqr
jklmnopqr
...
jklmnopqr
```

In the third, each value fills one 3x3 block of the board, like:

```
ssstttuuu
ssstttuuu
ssstttuuu
vvvwwwxxx
vvvwwwxxx
vvvwwwxxx
yyyzzzAAA
yyyzzzAAA
yyyzzzAAA
```
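The three layouts above can be sketched in plain Java as follows. The assumptions here are that value `i` of a packed int sits in bits `2i` and `2i+1`, and that the raw values 0-2 are kept as-is rather than scaled or one-hot encoded. `BroadcastPlanes` and its method names are hypothetical.

```java
final class BroadcastPlanes {
    /** ASSUMPTION: value i occupies bits 2i and 2i+1 of the packed int. */
    static int value(int packed, int i) {
        return (packed >>> (2 * i)) & 0b11;
    }

    /** Value r fills the whole of row r (the a..i picture). */
    static float[] byRow(int packed) {
        float[] plane = new float[81];
        for (int r = 0; r < 9; r++)
            for (int c = 0; c < 9; c++)
                plane[r * 9 + c] = value(packed, r);
        return plane;
    }

    /** Value c fills the whole of column c (the j..r picture). */
    static float[] byColumn(int packed) {
        float[] plane = new float[81];
        for (int r = 0; r < 9; r++)
            for (int c = 0; c < 9; c++)
                plane[r * 9 + c] = value(packed, c);
        return plane;
    }

    /** Value (r/3)*3 + c/3 fills one 3x3 block (the s..A picture). */
    static float[] byBox(int packed) {
        float[] plane = new float[81];
        for (int r = 0; r < 9; r++)
            for (int c = 0; c < 9; c++)
                plane[r * 9 + c] = value(packed, (r / 3) * 3 + c / 3);
        return plane;
    }
}
```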

My guess is that, for each inference, I should convert these into Java int arrays (or byte arrays, given that the values would fit?), call Nd4j.create (or should I use Nd4j.createFromArray?), and do the rest of the transformation inside the ComputationGraph. That is, the ComputationGraph would be responsible for expanding the 9x9 piece array into seven separate 9x9 planes of 0s and 1s, expanding the last three features into 9x9 planes, and concatenating the 11 planes together. Is this the most performant way?
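For comparison, one possible alternative is to do the whole expansion on the Java side and write the finished 0/1 planes into a single flat float buffer in [batch, plane, row, column] order, so the graph receives its input in final form. The sketch below shows only that buffer; `BatchBuffer` is a hypothetical helper, not part of ND4J. The flat array and shape could then be handed to an array factory such as `Nd4j.create(float[], int[])`, though the exact overload should be checked against the ND4J version in use. Whether this beats expanding inside the graph would need measuring.

```java
final class BatchBuffer {
    static final int PLANES = 11;
    static final int CELLS = 81; // 9x9 board, row-major

    private final int batchSize;
    private final float[] data;

    BatchBuffer(int batchSize) {
        this.batchSize = batchSize;
        this.data = new float[batchSize * PLANES * CELLS];
    }

    /** Copies one already-expanded 9x9 plane into its slot for the given sample. */
    void setPlane(int sample, int plane, float[] values) {
        if (values.length != CELLS) {
            throw new IllegalArgumentException("expected 81 values, got " + values.length);
        }
        System.arraycopy(values, 0, data, (sample * PLANES + plane) * CELLS, CELLS);
    }

    /** Backing array, laid out as [batch][plane][row][column]. */
    float[] data() {
        return data;
    }

    /** Shape matching the layout of data(). */
    int[] shape() {
        return new int[] {batchSize, PLANES, 9, 9};
    }
}
```

Because every slot is overwritten by `setPlane`, the same buffer can be reused across inference calls instead of being reallocated each time.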