Policy-based algorithm

A question about RL4J: to my knowledge, RL4J currently supports value-based RL algorithms such as DQN (Double DQN). Is there any plan to develop policy-based algorithms?

It already supports A3C, which is an actor-critic (policy-based) method. For example, see: https://github.com/eclipse/deeplearning4j-examples/blob/master/rl4j-examples/src/main/java/org/deeplearning4j/examples/rl4j/A3CALE.java
