Activation function in batch norm layer (builders/configuration)

I configured a computation graph in which I wanted to apply a ReLU activation after batch normalization. I did this by simply setting the activation function on the batch norm layer when building it, like so:

        new BatchNormalization.Builder().activation(Activation.RELU).build()

However, I have since realized that a batch norm layer will, if I am not mistaken, never apply its activation function:


        public INDArray activate(boolean training, LayerWorkspaceMgr workspaceMgr) {
            // Returns the normalized pre-output directly; the configured activation is never applied.
            return preOutput(input, training ? TrainingMode.TRAIN : TrainingMode.TEST, workspaceMgr);
        }

There is of course no problem with simply adding a separate activation layer (surely that is the intended way to do it). One could also argue that setting an activation here is a stupid thing to do, since a batch norm layer is not like a normal feed-forward layer and hence should not have an activation function. But programmers are lazy (at least I am), so if I see that I can include the activation in the batch norm layer and thus skip a few lines of code, I'm inclined to do so. There might also be other layer configurations that "fail" silently in the same way.

Regardless, I think the API would be improved if this configuration were not possible in the first place, so I would like to suggest implementing checks that reject such "stupid" configurations instead of silently ignoring them.
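To make the suggestion concrete, here is a minimal sketch of the kind of check I have in mind. It is plain Java and does not use any DL4J classes: `BatchNormConfigSketch`, `Act`, `BatchNormConf` and the nested `Builder` are all hypothetical names invented for illustration, and the identity default is an assumption of the sketch, not a statement about what DL4J does. The idea is only that `build()` fails loudly when an activation was set that the layer would never apply.

```java
// Hypothetical stand-ins for illustration only; none of these are DL4J classes.
final class BatchNormConfigSketch {

    /** Hypothetical activation enum, loosely mirroring names such as Activation.RELU. */
    enum Act { IDENTITY, RELU, TANH }

    /** Immutable result of a successful build. */
    static final class BatchNormConf {
        final Act activation;

        BatchNormConf(Act activation) {
            this.activation = activation;
        }
    }

    /** Builder that rejects settings the layer would silently ignore. */
    static final class Builder {
        // Assumption of this sketch: "no activation" is represented as IDENTITY.
        private Act activation = Act.IDENTITY;

        Builder activation(Act activation) {
            this.activation = activation;
            return this;
        }

        BatchNormConf build() {
            // Fail at configuration time instead of ignoring the setting at run time.
            if (activation != Act.IDENTITY) {
                throw new IllegalStateException(
                        "Batch norm sketch does not apply an activation (got " + activation
                                + "); add a separate activation layer instead.");
            }
            return new BatchNormConf(activation);
        }
    }
}
```

With a check like this, `new BatchNormConfigSketch.Builder().activation(BatchNormConfigSketch.Act.RELU).build()` throws an `IllegalStateException`, while a builder with no activation set builds normally. A logged warning would be a gentler alternative if a hard failure is considered too disruptive for existing configurations.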