Hello there.
I am new to DL4J. I am trying to build an autoencoder and have looked at several examples. My dataset has 11 doubles as input, and I would like an autoencoder that reduces the 11 features to a code of size 2 or 3 for nice visualisations. Before doing that, I wanted to run a sanity check using a code of size 11: a dense multi-layer network with 5 layers, each having 11 nodes. In that setup the input can simply be passed straight through, so a perfect score should be obtainable. I have tried many different settings, and nothing made DL4J learn this solution. I am looking for help and ideas to get this sanity check working; after that I want to reduce the code size.
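For reference, this is the property I expect the sanity check to satisfy (a rough sketch; net and ds are the network and test DataSet from the full code at the end of this post):

INDArray input = ds.getFeatures();     // one row of 11 normalized doubles
INDArray output = net.output(input);   // the autoencoder's reconstruction
// With 11 nodes in every layer, this distance should train towards ~0:
System.out.println("reconstruction error: " + output.distance2(input));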
Here are some things I have tried; a sketch of how these map onto the configuration builder follows the list.
Preprocessing
- None
- NormalizerStandardize
- NormalizerMinMaxScaler
Batch size
- 64
Epochs
- 30+ (but there is no further improvement after the first couple of epochs)
Weight Initializations
- Identity
- RELU
- Xavier
Updater
- AdaGrad(0.05)
- Adam(0.05)
Activation functions
- Identity
- Relu
- Sigmoid
Optimization
- Stochastic gradient descent
- Line gradient descent
L2
- 0.0001
- disabled
Layers
- 0-3 hidden layers
Loss function
- MSE
- MAE
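To make the list above concrete, these are roughly the builder lines I have been toggling between runs (a sketch of the toggled lines only, using the DL4J enum names I swapped in and out):

// Sketch: configuration variants I have tried, one alternative per comment
.weightInit(WeightInit.XAVIER)          // or WeightInit.IDENTITY, WeightInit.RELU
.updater(new Adam(0.05))                // or new AdaGrad(0.05)
.activation(Activation.IDENTITY)        // or Activation.RELU, Activation.SIGMOID
.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                                        // or OptimizationAlgorithm.LINE_GRADIENT_DESCENT
.l2(0.0001)                             // or left out entirely
// and in the output layer:
.lossFunction(LossFunctions.LossFunction.MSE)  // or MEAN_ABSOLUTE_ERROR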
I have tried many combinations of the above, but the network just will not converge to anything. The error stays pretty much the same as it was right after weight initialization. When feeding inputs through the network, I see that the response is somewhat, but not very, similar to the original input. It should be nearly identical, since no real encoding should be occurring. I also see that some internal nodes have an activation of 0 (depending on the settings).
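To show what I mean by zero activations, this is roughly how I inspect the layers (a sketch, using the same net and ds as in the code below):

List<INDArray> activations = net.feedForward(ds.getFeatures());
for (int i = 0; i < activations.size(); i++) {
    // index 0 is the input itself; index i >= 1 is the output of layer i-1
    System.out.println("layer " + i + ": " + activations.get(i));
}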
I have no idea what I am doing wrong, and it seems I have tried enough variants. Is there maybe something wrong with my custom data iterator? When inspecting the data set, I do see the same numbers as features and labels, as expected.
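The check I ran on the iterator looks roughly like this (sketch):

// Features and labels should be identical for an autoencoder target
DataSet first = trainIterator.next();
System.out.println("features == labels? "
        + first.getFeatures().equalsWithEps(first.getLabels(), 1e-6));
trainIterator.reset();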
Please help me solve this mystery. Thank you very much.
Here is the code, for reference.
// Imports as used with a 1.0.0-beta DL4J setup; adjust package paths to your version.
import java.io.File;
import java.util.List;

import org.deeplearning4j.api.storage.StatsStorage;
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.deeplearning4j.ui.api.UIServer;
import org.deeplearning4j.ui.stats.StatsListener;
import org.deeplearning4j.ui.storage.InMemoryStatsStorage;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerMinMaxScaler;
import org.nd4j.linalg.dataset.api.preprocessor.serializer.NormalizerSerializer;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class CAMSTAMUnsupervised {

    private static int trainBatchSize = 64;
    private static int testBatchSize = 1;
    private static int numEpochs = 30;
    public static String dataLocalPath;

    public static void main(String[] args) throws Exception {
        File modelFile = new File(dataLocalPath, "camstam.gz");
        DataSetIterator trainIterator = new CAMSTAMDataSetIterator(
                new File(dataLocalPath, "camstam_with_hr_features.csv").getAbsolutePath(), trainBatchSize);
        DataSetIterator testIterator = new CAMSTAMDataSetIterator(
                new File(dataLocalPath, "camstam_with_hr_features.csv").getAbsolutePath(), testBatchSize);

        System.out.println("Input Columns: " + trainIterator.inputColumns());
        System.out.println("Output Columns: " + trainIterator.totalOutcomes());

        MultiLayerNetwork net = createModel(trainIterator.inputColumns(), trainIterator.totalOutcomes());

        // Training UI so I can watch the score during fitting
        UIServer uiServer = UIServer.getInstance();
        StatsStorage statsStorage = new InMemoryStatsStorage();
        uiServer.attach(statsStorage);

        DataSet dst = trainIterator.next(1);
        System.out.println(dst.getFeatures());

        // Fit the normalizer on the training data, then apply it to both iterators
        //DataNormalization normalizer = new NormalizerStandardize();
        DataNormalization normalizer = new NormalizerMinMaxScaler();
        normalizer.fit(trainIterator); // collect training data statistics
        trainIterator.reset();
        trainIterator.setPreProcessor(normalizer);
        testIterator.setPreProcessor(normalizer); // note: using training normalization statistics
        NormalizerSerializer.getDefault().write(normalizer,
                new File(dataLocalPath, "anomalyDetectionNormalizer.ty").getAbsolutePath());

        // Training
        net.setListeners(new StatsListener(statsStorage), new ScoreIterationListener(10));
        net.fit(trainIterator, numEpochs);

        // Sanity check: the reconstruction should closely match the input
        while (testIterator.hasNext()) {
            DataSet ds = testIterator.next();
            System.out.println(ds.getFeatures());
            List<INDArray> result = net.feedForward(ds.getFeatures());
            System.out.println(result.get(4)); // index 4 = output layer (index 0 is the input)
            System.out.println("-----------");
        }
    }

    public static MultiLayerNetwork createModel(int inputNum, int outputNum) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(12345)
                .weightInit(WeightInit.XAVIER) // also tried IDENTITY and RELU
                //.updater(new AdaGrad(0.05))
                .updater(new Adam(0.05))
                .activation(Activation.IDENTITY)
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT) // also tried LINE_GRADIENT_DESCENT
                //.l2(0.0001)
                .list()
                .layer(0, new DenseLayer.Builder().nIn(inputNum).nOut(11).build())
                .layer(1, new DenseLayer.Builder().nIn(11).nOut(11).build())
                .layer(2, new DenseLayer.Builder().nIn(11).nOut(11).build())
                .layer(3, new OutputLayer.Builder().nIn(11).nOut(outputNum)
                        .lossFunction(LossFunctions.LossFunction.MSE)
                        .build())
                .build();

        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        return net;
    }
}