Stacking c copies of a (1,m,n) layer into a (c,m,n) tensor, inside network

Hello,
is there a way to stack c copies of a (1,m,n) layer (2D) into a (c,m,n) tensor, inside the network? I have a (1,m,n) layer L1 and I want to do an element-wise addition with the output of a convolution layer, L2, which is (c,m,n). The shapes are clearly not compatible, so I'd like to produce a tensor of L1 copies with the right shape. This happens in the hidden layers of the network, so I can't do it manually on the input.
Thank you in advance

@EquanimeAugello you’ll want a merge vertex. The output of each layer should be the same dimension.
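As a minimal sketch of what that could look like (the vertex and layer names here are hypothetical, not from your code): a MergeVertex concatenates its inputs along the channel dimension, and since addVertex takes a varargs list of input names, the same call also handles a variable number of inputs:

    // Hypothetical sketch: merge c existing layers, each of shape [minibatch, 1, m, n],
    // into one [minibatch, c, m, n] activation by channel concatenation.
    // Assumed import: org.deeplearning4j.nn.conf.graph.MergeVertex
    String[] inputNames = new String[c];
    for (int j = 0; j < c; j++) {
        inputNames[j] = "copy_" + j; // each "copy_j" layer must already exist in the graph
    }
    graphBuilder.addVertex("stacked", new MergeVertex(), inputNames);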

Note that CNN output is actually going to be 4d, not 3d. Do you mean you want a 1d CNN?

Thank you very much. Ah yes, you are right: my CNN is 2D, so the output is 4d, my mistake. I do have a doubt, then: I need to merge those c instances of the same layer (with c a variable that can change between different instances of the network), but so far I only know how to merge an explicit number of layers manually. Could you kindly suggest a reference for how to do it?

For clarity, here is the code that configures the network I'm working on:

 public ResNetBrain(int boardSize, int nResidualBlocks){
        int arrayArea= (int) Math.pow(boardSize*3,2);
        ComputationGraphConfiguration conf = customResBlocks(
                new NeuralNetConfiguration.Builder()
                    .weightInit(WeightInit.XAVIER)
                    .updater(new Sgd(0.01))
                    //.updater(new Adam(0.01))
                    .graphBuilder()
                    .setInputTypes(InputType.convolutional(boardSize*3, boardSize*3, 1))
                    .addInputs("input"), 
                boardSize, 
                nResidualBlocks // number of repetitions of the residual block
            )
            .addLayer("penultimo", new DenseLayer.Builder().nOut(arrayArea).build(), "A_"+nResidualBlocks+"_3")
            .addLayer("BN_penultimo", new BatchNormalization.Builder().nOut(boardSize).build(), "penultimo")
            .addLayer("ultimo", new DenseLayer.Builder().nOut(1)
                    //.gradientNormalization(GradientNormalization.RenormalizeL2PerLayer)
                    .build(),"BN_penultimo")
            .addLayer("output", new OutputLayer.Builder()
                    .lossFunction(new PessimisticLossFunction())
                    .activation(Activation.TANH)
                    .nIn(4).nOut(1).build(), "ultimo")
            .setOutputs("output")
            .build();
        this.net=new ComputationGraph(conf);
        this.net.init();
        
        Nd4j.getExecutioner().setProfilingConfig(ProfilerConfig.builder()
                .checkForINF(true)
                .checkElapsedTime(true)
                .checkLocality(true)
                .checkWorkspaces(true)
                .build());
    }
    public GraphBuilder customResBlocks(GraphBuilder previousArchitecture, int boardSize, int nResidualBlocks){ // the input vertex/layer must be named "input" for this to work
        GraphBuilder resBlocks=previousArchitecture;
        resBlocks=resBlocks
            .addLayer("A_0_3", new BatchNormalization.Builder().build(), "input");
        for(int i=1;i<=nResidualBlocks; i++){
            resBlocks=resBlocks
            .addLayer(i+"_1", new ConvolutionLayer.Builder()
                        .kernelSize(3,3)
                        .padding(1,1)
                        .nIn(1)
                        //Note that nIn need not be specified in later layers
                        .stride(1,1)
                        .nOut(boardSize)
                        //.activation(Activation.LEAKYRELU)
                        .build(),"A_"+(i-1)+"_3" )
            .addLayer("BN_"+i+"_1", new BatchNormalization.Builder().nOut(boardSize).build(), i+"_1")
            .addLayer("A_"+i+"_1", new ActivationLayer(Activation.TANH), "BN_"+i+"_1" )
            .addLayer(i+"_2", new ConvolutionLayer.Builder()
                        .kernelSize(3,3)
                        .padding(1,1)
                        .nIn(boardSize)
                        //Note that nIn need not be specified in later layers
                        .stride(1,1)
                        .nOut(boardSize)
                        //.activation(Activation.LEAKYRELU)
                        .build(),"A_"+i+"_1" )
            .addLayer("BN_"+i+"_2", new BatchNormalization.Builder().nOut(boardSize).build(), i+"_2")
            
            .addLayer("A_tiled_"+(i-1)+"_3", new ConvolutionLayer.Builder() //This tiles the 2D input into a boardsize deep tensor, with different weights (of the (1,1) filters) that acts as different weight in the ElementWise sum between each channel with the residual 2d input
                        .kernelSize(1,1)
                        .nIn(1)
                        //Note that nIn need not be specified in later layers
                        .stride(1,1)
                        .nOut(boardSize)
                        //.activation(Activation.LEAKYRELU)
                        .build(),"A_"+(i-1)+"_3" )
            .addLayer("BN_A_tiled_"+(i-1)+"_3", new BatchNormalization.Builder().nOut(boardSize).build(), "A_tiled_"+(i-1)+"_3")
            
            .addVertex(i+"_3", new ElementWiseVertex(ElementWiseVertex.Op.Add), "BN_"+i+"_2", "BN_A_tiled_"+(i-1)+"_3")
            //.addVertex(i+"_3", new MergeVertex(), "BN_"+i+"_2", "A_"+(i-1)+"_3")
            .addLayer("A_"+i+"_3", new ActivationLayer(Activation.TANH), i+"_3" );
        }
        return resBlocks;
    }

Please note that right now I have a convolutional layer with 1x1 kernels in place of the merge layer I should implement; I thought that, as a temporary measure, it would produce an output of the right shape, although multiplied by arbitrary weights. I did that only for preliminary testing of the rest of the network.

Thank you sincerely,
Equanime

@EquanimeAugello could you clarify your question a bit? Are you not sure how to make everything the same shape and are looking for a path to that? That’s the only way a valid merge can happen.

Hello,
thank you for the reply,
my main doubt is how to go from one layer to multiple (c) copies of it and then stack (merge) them all, since so far I only know how to merge two layers in a ComputationGraph. Thank you very much in advance,
Equanime

@EquanimeAugello sorry, I forgot to follow up here.

Could you clarify what you mean by multiple copies? Repeat might do what you want, but I’m not sure. Do you mean n copies of the same ndarray followed by a merge?

The RepeatVector layer should do what you’re looking for, without all the complicated steps.
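For reference, a rough sketch of that idea (the layer names are made up, and the builder setter is assumed to be repetitionFactor, so please check the javadoc of your DL4J version): RepeatVector takes a 2d (minibatch, size) input and repeats it n times, giving a 3d (minibatch, size, n) output.

    // Hypothetical sketch: repeat a dense activation c times along a new dimension.
    // Assumed import: org.deeplearning4j.nn.conf.layers.misc.RepeatVector (verify for your version)
    graphBuilder.addLayer("repeated",
            new RepeatVector.Builder().repetitionFactor(c).build(),
            "someDenseLayer");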