Encodable and ObservationSpace

Hi,

I am fairly new to RL4J and reinforcement learning in general. I tried to implement a very simple MDP (Snake) to gain more experience. However, since the documentation is still a work in progress, I have trouble understanding how exactly the ObservationSpace works.

I created an Encodable to represent the snake's playing field:

import org.deeplearning4j.rl4j.space.Encodable;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class SnakeObservationSpace implements Encodable {

    private final Snake snake;

    public SnakeObservationSpace(Snake snake) {
        this.snake = snake;
    }

    @Override
    public double[] toArray() {
        // Deprecated method, not implemented
        return null;
    }

    @Override
    public boolean isSkipped() {
        return false;
    }

    @Override
    public INDArray getData() {
        return Nd4j.createFromArray(snake.getField());
    }

    @Override
    public Encodable dup() {
        return new SnakeObservationSpace(snake);
    }
}

snake.getField() basically returns a two-dimensional array representing the field. But now I am stuck, since the MDP interface also wants me to return an ObservationSpace. In the examples I have seen the ArrayObservationSpace being used, but I am unsure what exactly I am supposed to do.

Can someone explain to me how these two elements work with each other and what exactly I am supposed to do? Thanks a lot in advance!

For a snake game that somewhat works, you can take a look at GitHub - treo/rl4j-snake: An *unfinished* RL4J snake playing example

I started that as an example for RL4J, but didn’t have the time to actually finish it.

Anyway, the ObservationSpace tells RL4J what shape the observations it is going to receive will have.

In that example I’m using a simple one-dimensional array for the observation, as I’m feeding a very specialized observation into a simple MLP-type neural network.

In principle the ObservationSpace is all about the shape of the possible observations. The Encodable is the observation, and therefore what you’ve implemented is not the ObservationSpace.

Your observation space would still be an ArrayObservationSpace, and the array specifies the shape, i.e. the size in each dimension, of your snake field.
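Concretely, the two meet in your MDP implementation. A rough sketch, with several assumptions: SnakeObservation stands in for your Encodable class, the 10x10 field size and four actions are placeholders, and the exact package names (e.g. for StepReply) can differ between RL4J versions:

```java
// Packages as of the RL4J betas; they may differ in your version.
import org.deeplearning4j.gym.StepReply;
import org.deeplearning4j.rl4j.mdp.MDP;
import org.deeplearning4j.rl4j.space.ArrayObservationSpace;
import org.deeplearning4j.rl4j.space.DiscreteSpace;
import org.deeplearning4j.rl4j.space.ObservationSpace;

public class SnakeMDP implements MDP<SnakeObservation, Integer, DiscreteSpace> {

    // The ObservationSpace only describes the shape of every observation:
    // here a (hypothetical) 10x10 field.
    private final ObservationSpace<SnakeObservation> observationSpace =
            new ArrayObservationSpace<>(new int[]{10, 10});

    // Four snake actions: up, down, left, right.
    private final DiscreteSpace actionSpace = new DiscreteSpace(4);

    @Override
    public ObservationSpace<SnakeObservation> getObservationSpace() {
        return observationSpace;
    }

    @Override
    public DiscreteSpace getActionSpace() {
        return actionSpace;
    }

    // The Encodable, by contrast, is an actual observation, returned on
    // every reset and step.
    @Override
    public SnakeObservation reset() {
        return null; // TODO: restart the game, return the first observation
    }

    @Override
    public StepReply<SnakeObservation> step(Integer action) {
        return null; // TODO: apply the action, return observation + reward
    }

    @Override
    public boolean isDone() {
        return false; // TODO: true once the snake dies
    }

    @Override
    public void close() { }

    @Override
    public MDP<SnakeObservation, Integer, DiscreteSpace> newInstance() {
        return new SnakeMDP();
    }
}
```

So the space is created once and just describes shapes, while your Encodable instances carry the actual field data every step.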

First of all, thank you for your fast response! Fun fact: I had already looked at your project before, but I wanted to try implementing it with the 2D matrix.

To create the Observation Space I tried this:

observationSpace = new ArrayObservationSpace<>(new int[]{10, 10});

But it gives me this exception:

Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidInputException: Input that is not a matrix; expected matrix (rank 2), got rank 3 array with shape [1, 10, 10]. Missing preprocessor or wrong input type? (layer name: layer0, layer index: 0, layer type: DenseLayer)

I also wondered if there is an advantage to encoding it as a 2D matrix, since I could also just transform it into a one-dimensional array?

You get that because the neural network you are training isn’t expecting a matrix. You are probably using the Dense builder there. In order for it to be able to deal with a matrix you will need the CNN variant.
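For illustration, a convolution-first network in plain DL4J looks roughly like this. All layer sizes and hyperparameters here are arbitrary placeholders, assuming a single-channel 10x10 field and 4 actions, so treat it as a sketch rather than a working snake network:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class SnakeNetSketch {
    // A network whose first layer is convolutional, so it can accept the
    // 10x10 field as a 2D input instead of a flat vector.
    public static MultiLayerConfiguration convConfig() {
        return new NeuralNetConfiguration.Builder()
                .list()
                .layer(new ConvolutionLayer.Builder(3, 3) // 3x3 kernel
                        .nIn(1).nOut(16)                  // 1 input channel
                        .activation(Activation.RELU).build())
                .layer(new DenseLayer.Builder().nOut(64)
                        .activation(Activation.RELU).build())
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                        .nOut(4)                          // one Q-value per action
                        .activation(Activation.IDENTITY).build())
                // tells DL4J to insert the right preprocessors between the
                // convolutional and dense parts for a 10x10x1 input
                .setInputType(InputType.convolutional(10, 10, 1))
                .build();
    }
}
```

The setInputType call is what spares you from wiring up the convolutional-to-dense preprocessors by hand.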

Is there an advantage to looking at an image as a rectangle over looking at the pixels as a single line?

Your matrix essentially represents a picture, so yes there can be an advantage.
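That said, if you'd rather stay with the MLP, flattening the field row by row is trivial. A minimal sketch (the int[][] field type and its contents here are assumptions about your Snake class):

```java
import java.util.Arrays;

public class FieldFlattener {
    // Flatten a 2D field row by row into the 1D double[] that an
    // MLP-style input layer expects.
    public static double[] flatten(int[][] field) {
        int rows = field.length;
        int cols = field[0].length;
        double[] out = new double[rows * cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                out[r * cols + c] = field[r][c];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] field = {{0, 1}, {2, 3}};
        System.out.println(Arrays.toString(flatten(field)));
        // prints [0.0, 1.0, 2.0, 3.0]
    }
}
```

The trade-off is exactly the one above: the flat vector is simpler to feed in, but the network then has to rediscover the spatial neighborhoods that a CNN gets for free.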

Yes, you are absolutely right. I will have a closer look at CNNs to fully understand how to use them.

I have one last question about the ObservationSpace: what exactly are high and low for? In the ArrayObservationSpace they are basically just an array with one column:

low = Nd4j.create(1);
high = Nd4j.create(1);

If I remember correctly, they are about the highest and lowest values that are valid for the observation space, but I may be wrong on this.

I’m not entirely sure if they are actually used somewhere or not.