I’m working with a ComputationGraph and a MultiLayerNetwork, and I need to obtain each network’s output and the loss for each entry of the batch. Below are the configurations for both.
MultiLayerConfiguration conf_MLN = new NeuralNetConfiguration.Builder()
        .weightInit(WeightInit.XAVIER_UNIFORM)
        .updater(new RmsProp())
        .list()
        .layer(new DenseLayer.Builder().nIn(n_feat).nOut(1).activation(Activation.RELU).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                .nIn(1).nOut(1).activation(Activation.IDENTITY)
                .build())
        .build();
ComputationGraphConfiguration conf_CG = new NeuralNetConfiguration.Builder()
        .weightInit(WeightInit.XAVIER_UNIFORM)
        .updater(new RmsProp())
        .graphBuilder()
        .addInputs("state", "omega")
        .addVertex("merge", new MergeVertex(), "state", "omega")
        // nIn = feature length + objective weights length
        .addLayer("L1", new DenseLayer.Builder().nIn(n_feat + 2).nOut(2).activation(Activation.RELU).build(), "merge")
        .addLayer("out", new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                .nIn(2).nOut(2).activation(Activation.IDENTITY)
                .build(), "L1")
        .setOutputs("out")
        .build();
I have the following questions about the output shape of each method and how the score is computed:
- If I supply a (batch_size, n_feat) input to model_MLN.output(), is the output shape (batch_size, 1) or (batch_size)?
- If I supply two inputs of shape (batch_size, n_feat) and (batch_size, 2) to model_CG.outputSingle(), is the output shape (batch_size, 2)?
- When I use .score(DataSet) on the MultiLayerNetwork, do I get the sum of the MSE over all entries or the average?
- When I use .score(MultiDataSet) on the above ComputationGraph, how is the score calculated across the two outputs?
- I want to obtain the MSE loss for each “entry” of the batch (shape [batch_size]). Should I use scoreExamples to get it for the above ComputationGraph and MultiLayerNetwork? I asked some LLM chat models, and they said that scoreExamples calculates the loss for each output of each entry separately rather than combining them, resulting in a score of shape [batch_size, 2]. Is this true? If I want the MSE loss for each entry, which method should I use?
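
To make the last question concrete, here is a minimal plain-Java sketch of the quantity I am after: one loss value per entry, with the per-output errors combined. It does not call any DL4J API. The names PerExampleMse, perExampleMse and batchAverage are hypothetical, and averaging the squared errors over the output columns is my own assumption about how a per-entry MSE would be defined, not a statement about what scoreExamples or score actually return.

```java
import java.util.Arrays;

public class PerExampleMse {

    /**
     * Returns one value per row (shape [batch_size]):
     * the mean over the output columns of (prediction - label)^2.
     * Assumption: outputs are averaged, not summed, within each entry.
     */
    public static double[] perExampleMse(double[][] predictions, double[][] labels) {
        if (predictions.length != labels.length) {
            throw new IllegalArgumentException("Batch sizes differ");
        }
        double[] scores = new double[predictions.length];
        for (int i = 0; i < predictions.length; i++) {
            if (predictions[i].length != labels[i].length) {
                throw new IllegalArgumentException("Output sizes differ in row " + i);
            }
            double sumSquared = 0.0;
            for (int j = 0; j < predictions[i].length; j++) {
                double diff = predictions[i][j] - labels[i][j];
                sumSquared += diff * diff;
            }
            scores[i] = sumSquared / predictions[i].length;
        }
        return scores;
    }

    /** Averages the per-entry values into a single scalar for the whole batch. */
    public static double batchAverage(double[] perExample) {
        double sum = 0.0;
        for (double s : perExample) {
            sum += s;
        }
        return sum / perExample.length;
    }

    public static void main(String[] args) {
        // Hypothetical batch of 3 entries with 2 outputs each, like the "out" layer above.
        double[][] predictions = {{1.0, 2.0}, {0.0, 0.0}, {3.0, 1.0}};
        double[][] labels = {{1.0, 0.0}, {0.0, 0.0}, {1.0, 1.0}};

        double[] perExample = perExampleMse(predictions, labels);
        System.out.println(Arrays.toString(perExample)); // prints [2.0, 0.0, 2.0]
        System.out.println(batchAverage(perExample));    // single scalar for the batch
    }
}
```

What I would like to know is which DL4J method gives me the equivalent of perExampleMse here, and whether .score() corresponds to batchAverage or to a plain sum.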