Mismatched Shapes

Hello there.

I am trying to create an autoencoder for a CSV file with 27 float columns. I wrote my own DataSetIterator, based on the AnomalyDataSet example project. The result is a nice DataSet with 27 float columns and a mask of all 1s.

===========INPUT===================
[[  832.2833,  907.5333,   75.2635,  ...   67.4916,   32.9680,   32.9513], 
 [   29.0333,  346.5667,  317.5359,  ...   37.4298,   52.6219,   52.6225], 
 [  599.8500,  643.9667,   44.1232,  ...   57.8035,   49.0082,   48.9850], 
  ..., 
 [ 1370.0500,  504.4167,  574.3795,  ...   37.5940,   28.8473,   28.8397], 
 [ 1293.5834,  295.0833,  441.5164,  ...   39.3701,   17.5364,   17.5363], 
 [ 1327.3000,  544.4000,  657.1083,  ...   38.1679,   20.4905,   20.4902]]
=================OUTPUT==================
[[  832.2833,  907.5333,   75.2635,  ...   67.4916,   32.9680,   32.9513], 
 [   29.0333,  346.5667,  317.5359,  ...   37.4298,   52.6219,   52.6225], 
 [  599.8500,  643.9667,   44.1232,  ...   57.8035,   49.0082,   48.9850], 
  ..., 
 [ 1370.0500,  504.4167,  574.3795,  ...   37.5940,   28.8473,   28.8397], 
 [ 1293.5834,  295.0833,  441.5164,  ...   39.3701,   17.5364,   17.5363], 
 [ 1327.3000,  544.4000,  657.1083,  ...   38.1679,   20.4905,   20.4902]]
===========INPUT MASK===================
[[    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
  ..., 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000]]
===========OUTPUT MASK===================
[[    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
  ..., 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000], 
 [    1.0000,    1.0000,    1.0000,  ...    1.0000,    1.0000,    1.0000]]

The output is the same as the input, because an autoencoder is trained to reconstruct its own input.

Now I have made this network configuration:

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .weightInit(WeightInit.XAVIER)
        .updater(new AdaGrad(0.05))
        .activation(Activation.RELU)
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .l2(0.0001)
        .list()
        .layer(0, new DenseLayer.Builder().nIn(27).nOut(50)
                .build())
        .layer(1, new DenseLayer.Builder().nIn(50).nOut(10)
                .build())
        .layer(2, new DenseLayer.Builder().nIn(10).nOut(50)
                .build())
        .layer(3, new OutputLayer.Builder().nIn(50).nOut(27)
                .lossFunction(LossFunctions.LossFunction.MSE)
                .build())
        .build();

MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

I am not sure about all the hyperparameters, but that is a worry for once the code runs properly.

Now when the fitting starts, I get a very deep exception:

Exception in thread "main" java.lang.IllegalStateException: Mismatched shapes (shape = [64, 50], column vector shape =[64, 27])
	at org.nd4j.linalg.api.ndarray.BaseNDArray.doColumnWise(BaseNDArray.java:2368)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.muliColumnVector(BaseNDArray.java:2788)
	at org.deeplearning4j.nn.layers.AbstractLayer.applyMask(AbstractLayer.java:252)
	at org.deeplearning4j.nn.layers.BaseLayer.preOutputWithPreNorm(BaseLayer.java:331)
	at org.deeplearning4j.nn.layers.BaseLayer.preOutput(BaseLayer.java:291)
	at org.deeplearning4j.nn.layers.BaseLayer.activate(BaseLayer.java:339)
	at org.deeplearning4j.nn.layers.AbstractLayer.activate(AbstractLayer.java:258)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.ffToLayerActivationsInWs(MultiLayerNetwork.java:1134)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2746)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2704)
	at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:170)
	at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63)
	at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fitHelper(MultiLayerNetwork.java:1715)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:1636)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:1623)
	at CAMSTAMUnsupervised.main(CAMSTAMUnsupervised.java:91)

The 27 is my number of columns, so that makes sense. The 50 corresponds to the first layer of the network, specifically this line of code: .layer(0, new DenseLayer.Builder().nIn(inputNum).nOut(50).
I am not sure what the 64 is, maybe a batch size.

So my question is: what have I misconfigured that causes this exception? I tried simply giving the first layer 27 nodes, but that produces the message: Mismatched shapes (shape = [64, 27], column vector shape =[64, 27]), which is even more confusing.

This is probably a beginner question, so thanks for your help.
Maarten

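A mask is applied per example, not per input feature. As the muliColumnVector call in your stack trace suggests, the layer expects the mask to be a column vector with one value per example, not a [64, 27] array.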

As you aren’t actually masking anything out, you might want to not set any masks at all.
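
To illustrate what "per example" means here, below is a minimal plain-Java sketch. It is not DL4J's actual implementation; the class MaskSketch and its methods are hypothetical names made up for this illustration, and the batch size of 4 is arbitrary. It assumes the mask must have shape [batch, 1], so each row of activations is scaled by a single value regardless of the layer width. Under that assumption, a [batch, 27] mask is rejected even when the activations are also [batch, 27], which would explain the second, more confusing message.

```java
// Plain-Java illustration only; hypothetical names, not DL4J code.
class MaskSketch {

    // Formats the shape of a rectangular 2D array as "[rows, cols]".
    static String shape(double[][] a) {
        return "[" + a.length + ", " + (a.length == 0 ? 0 : a[0].length) + "]";
    }

    // Multiplies every row of the activations by that row's single mask value.
    // Assumption: the mask must be a column vector of shape [batch, 1].
    static double[][] applyPerExampleMask(double[][] activations, double[][] mask) {
        boolean sameBatch = mask.length == activations.length;
        boolean singleColumn = true;
        for (double[] row : mask) {
            if (row.length != 1) {
                singleColumn = false;
            }
        }
        if (!sameBatch || !singleColumn) {
            throw new IllegalStateException("Mismatched shapes (shape = "
                    + shape(activations) + ", column vector shape = " + shape(mask) + ")");
        }
        double[][] out = new double[activations.length][];
        for (int i = 0; i < activations.length; i++) {
            out[i] = new double[activations[i].length];
            for (int j = 0; j < activations[i].length; j++) {
                out[i][j] = activations[i][j] * mask[i][0];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] hidden = new double[4][50];        // activations of a 50-unit layer
        double[][] perExampleMask = new double[4][1]; // one mask value per example
        double[][] perFeatureMask = new double[4][27]; // one mask value per input column

        // Works for any layer width, because the mask has exactly one column.
        System.out.println("OK: " + shape(applyPerExampleMask(hidden, perExampleMask)));

        // Fails: the mask has 27 columns, so it is not a column vector.
        try {
            applyPerExampleMask(hidden, perFeatureMask);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a mask of all 1s the multiplication changes nothing either way, which is why leaving the masks out entirely is the simpler option.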

Removing the masks indeed fixed the problem. Thanks!