Low accuracy compared to model trained with Keras

I have a simple model in Keras. I tried building a similar model in Java using DL4J, but its accuracy is extremely low compared to the one trained in Keras. I am not sure why this is happening, as the models look quite similar to me. Can someone please help me figure out whether I am doing something wrong in the Java code?

Python model:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split

def create_model():
    model = tf.keras.models.Sequential([
        keras.layers.Dense(600, activation='relu', input_shape=(512,)),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(300, activation='relu'),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(1, activation='sigmoid')
    ])

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model


def get_data_from_csv():
    my_data = np.genfromtxt('spam.csv', delimiter=',')
    X = my_data[:, :512]
    X = X.astype('float32')
    Y = my_data[:, -1].astype('int')
    print(type(X), X.dtype, type(Y), Y.dtype)
    print(X.shape, Y.shape)
    return train_test_split(X, Y, test_size=0.25)

model = create_model()
X_train, X_test, y_train, y_test = get_data_from_csv()
model.fit(X_train, y_train, epochs=20, batch_size=100)
_, accuracy = model.evaluate(X_test, y_test)
print('Accuracy: %.2f' % (accuracy * 100))

Python output:

Epoch 1/20
1/1 [==============================] - 0s 313us/step - loss: 0.0299 - accuracy: 0.9938
Epoch 2/20
1/1 [==============================] - 0s 308us/step - loss: 0.0194 - accuracy: 0.9943
Epoch 3/20
1/1 [==============================] - 0s 318us/step - loss: 0.0146 - accuracy: 0.9959
Epoch 4/20
1/1 [==============================] - 0s 305us/step - loss: 0.0104 - accuracy: 0.9969
Epoch 5/20
1/1 [==============================] - 0s 312us/step - loss: 0.0067 - accuracy: 0.9982
Epoch 6/20
1/1 [==============================] - 0s 321us/step - loss: 0.0047 - accuracy: 0.9987
Epoch 7/20
1/1 [==============================] - 0s 301us/step - loss: 0.0031 - accuracy: 0.9992
Epoch 8/20
1/1 [==============================] - 0s 301us/step - loss: 0.0024 - accuracy: 0.9995
Epoch 9/20
1/1 [==============================] - 0s 303us/step - loss: 0.0021 - accuracy: 0.9997
Epoch 10/20
1/1 [==============================] - 0s 309us/step - loss: 0.0020 - accuracy: 0.9997
Epoch 11/20
1/1 [==============================] - 0s 296us/step - loss: 0.0015 - accuracy: 0.9997
Epoch 12/20
1/1 [==============================] - 0s 315us/step - loss: 0.0013 - accuracy: 1.0000
Epoch 13/20
1/1 [==============================] - 0s 332us/step - loss: 0.0013 - accuracy: 1.0000
Epoch 14/20
1/1 [==============================] - 0s 303us/step - loss: 0.0016 - accuracy: 1.0000
Epoch 15/20
1/1 [==============================] - 0s 308us/step - loss: 0.0012 - accuracy: 1.0000
Epoch 16/20
1/1 [==============================] - 0s 305us/step - loss: 0.0012 - accuracy: 1.0000
Epoch 17/20
1/1 [==============================] - 0s 306us/step - loss: 8.8349e-04 - accuracy: 1.0000
Epoch 18/20
1/1 [==============================] - 0s 299us/step - loss: 0.0012 - accuracy: 1.0000
Epoch 19/20
1/1 [==============================] - 0s 322us/step - loss: 0.0011 - accuracy: 1.0000
Epoch 20/20
1/1 [==============================] - 0s 305us/step - loss: 9.3288e-04 - accuracy: 1.0000
41/41 [==============================] - 0s 625us/step - loss: 0.0436 - accuracy: 0.9915
Accuracy: 99.15

Java code:

public class SpamClassifier {
  private static MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .seed(123)
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
    .updater(new Adam(0.0001))
    .list()
    .layer(0, new DenseLayer.Builder()
      .nIn(512).nOut(600)
      .weightInit(WeightInit.XAVIER)
      .activation(Activation.RELU).build())
    .layer(1, new DropoutLayer.Builder().dropOut(0.2).build())
    .layer(2, new DenseLayer.Builder()
      .nIn(600).nOut(300)
      .weightInit(WeightInit.XAVIER)
      .activation(Activation.RELU).build())
    .layer(3, new DropoutLayer.Builder().dropOut(0.2).build())
    .layer(4, new OutputLayer.Builder(LossFunctions.LossFunction.XENT)
      .nIn(300).nOut(2)
      .weightInit(WeightInit.XAVIER)
      .activation(Activation.SIGMOID).build())
    .build();

  private static MultiLayerNetwork model = new MultiLayerNetwork(conf);

  public static void run() throws IOException, InterruptedException {
    RecordReader recordReader = new CSVRecordReader(0, ',');
    recordReader.initialize(new FileSplit(new ClassPathResource("spam.csv").getFile()));
    DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader, 5171, 512, 2);
    DataSet allData = iterator.next();
    allData.shuffle(42);
    SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.70);
    DataSet trainingData = testAndTrain.getTrain();
    DataSet testData = testAndTrain.getTest();
    model.init();
    model.setListeners(new ScoreIterationListener(1));
    model.setEpochCount(100);
    model.fit(trainingData);
    INDArray output = model.output(testData.getFeatures());
    Evaluation eval = new Evaluation(2);
    eval.eval(testData.getLabels(), output);
    System.out.println(eval.stats());
  }

  public static void main(String[] args) throws IOException, InterruptedException {
    run();
  }
}

The dataset I am using has around 5,000 rows and 512 input features; the same dataset is used in Python as well. The output class is the last column of the CSV and contains only 1s and 0s.

Java output:

========================Evaluation Metrics========================
 # of classes:    2
 Accuracy:        0.7126
 Precision:       0.7500
 Recall:          0.0133
 F1 Score:        0.0262
Precision, recall & F1: reported for positive class (class 1 - "1") only


=========================Confusion Matrix=========================
    0    1
-----------
 1100    2 | 0 = 0
  444    6 | 1 = 1

Confusion matrix format: Actual (rowClass) predicted as (columnClass) N times
==================================================================

Process finished with exit code 0

A small difference in accuracy would be understandable, but here the difference is far too large.

Follow this; the trick generally lies in the defaults:
https://deeplearning4j.konduit.ai/getting-started/tutorials/hyperparameter-optimization
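
For instance, one default that differs here: the 'adam' string in Keras compiles with a learning rate of 0.001, while the config above passes Adam(0.0001). A sketch of matching Keras' documented defaults explicitly (the beta/epsilon values are Keras' defaults, not something from this thread):

import org.nd4j.linalg.learning.config.Adam;

// Keras 'adam' defaults: lr=0.001, beta1=0.9, beta2=0.999, epsilon=1e-7
Adam kerasLikeAdam = new Adam(0.001, 0.9, 0.999, 1e-7);
// then in the NeuralNetConfiguration.Builder: .updater(kerasLikeAdam)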
If you want, you can save the model in Keras and import it into DL4J instead:
https://deeplearning4j.konduit.ai/keras-import/overview

Maybe compare the models with model.summary() to see what each one prints.
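
For reference, a minimal sketch of that comparison (the file name "model.h5" is an assumption, saved on the Keras side with model.save('model.h5')):

import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

// Import the Keras-saved model; the import maps each Keras layer to its
// DL4J equivalent, so its summary shows the hyperparameters DL4J would use.
MultiLayerNetwork imported =
    KerasModelImport.importKerasSequentialModelAndWeights("model.h5");
System.out.println(imported.summary());
System.out.println(model.summary()); // the hand-built network, for comparison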

I have already tried a lot of tuning based on the documentation above. I also removed the dropout layers from the code to simplify the model. After removing them, Java and Python still produce the same results as before, i.e. Python reaching 98+% accuracy and Java around 71%.

Python model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 600)               307800    
_________________________________________________________________
dense_1 (Dense)              (None, 300)               180300    
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 301       
=================================================================
Total params: 488,401
Trainable params: 488,401
Non-trainable params: 0

Java model summary:

=======================================================================
LayerName (LayerType)   nIn,nOut   TotalParams   ParamsShape           
=======================================================================
layer0 (DenseLayer)     512,600    307,800       W:{512,600}, b:{1,600}
layer1 (DenseLayer)     600,300    180,300       W:{600,300}, b:{1,300}
layer2 (OutputLayer)    300,2      602           W:{300,2}, b:{1,2}    
-----------------------------------------------------------------------
            Total Parameters:  488,702
        Trainable Parameters:  488,702
           Frozen Parameters:  0
=======================================================================

Yes, I can try saving the Keras model and loading it in Java, but I want to understand the issue here. We plan to use this library in our production environment, so I should know what I am using and why it is producing such a drastic difference.

@luvk1412 No, you misunderstood me. Use the summary on the imported model to see if there are any differences in the import, like the hyperparameters etc. The import will automatically tell you what the equivalent parameters should be.

As for using the lib in production, that's the point of the model import. In your case, for inference, all you have to do is make sure the results are equivalent.

Edit: And yeah, to be fair, I also gave you a link for the hyperparameter tuning in DL4J, so you can take a look. I told you where to look; please don't disregard that advice. If you can be more specific about your concerns/problems I can give more pointed advice, but otherwise start with the references first to see where they lead. I'm helping you explore the problem so you can understand things for yourself.

@agibsonccc Thanks a lot for your help. I have tried loading the Keras model into DL4J and compared the summaries as well. The only difference I can see is that Keras has only one node in the output layer, while DL4J shows 2 nodes in its output layer. The Keras one also has an extra loss layer.
In DL4J I have used

new OutputLayer.Builder(LossFunctions.LossFunction.XENT)
    .nIn(300).nOut(2)

as I read somewhere online that nOut needs to be the number of classes we are predicting. In Keras there is only one output node, which yields either 1 or 0. But if I try to keep one node in the output layer in DL4J, I get this error:

Exception in thread "main" java.lang.IllegalArgumentException: Labels and preOutput must have equal shapes: got shapes [3619, 2] vs [3619, 1]

I can understand why it occurs (in my understanding, DL4J loads the labels as [1,0] for 1 and [0,1] for 0, hence requiring 2 neurons), but I am not sure how to get this working with only one neuron in the output layer in DL4J, the same as in Keras.
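
For what it's worth, one way to keep a single sigmoid output in DL4J appears to be reading the label column in regression mode, so it stays a single 0/1 column instead of being one-hot encoded into two. A sketch, assuming the label is column 512 of the 513-column CSV:

// (recordReader, batchSize, labelIndexFrom, labelIndexTo, regression):
// regression=true keeps the label as one column of raw 0/1 values
DataSetIterator iterator =
    new RecordReaderDataSetIterator(recordReader, 5171, 512, 512, true);

// The output layer can then mirror Keras' single sigmoid unit:
.layer(4, new OutputLayer.Builder(LossFunctions.LossFunction.XENT)
    .nIn(300).nOut(1)
    .weightInit(WeightInit.XAVIER)
    .activation(Activation.SIGMOID).build())

// For evaluating a single-column sigmoid output, Evaluation has a
// binary-decision-threshold constructor:
Evaluation eval = new Evaluation(0.5);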
Just in case, for information:

Java:
=======================================================================
LayerName (LayerType)   nIn,nOut   TotalParams   ParamsShape           
=======================================================================
layer0 (DenseLayer)     512,600    307,800       W:{512,600}, b:{1,600}
layer1 (DropoutLayer)   -,-        0             -                     
layer2 (DenseLayer)     600,300    180,300       W:{600,300}, b:{1,300}
layer3 (DropoutLayer)   -,-        0             -                     
layer4 (OutputLayer)    300,2      602           W:{300,2}, b:{1,2}    
-----------------------------------------------------------------------
            Total Parameters:  488,702
        Trainable Parameters:  488,702
           Frozen Parameters:  0
=======================================================================
Loaded from Keras-saved h5:
==========================================================================
LayerName (LayerType)      nIn,nOut   TotalParams   ParamsShape           
==========================================================================
dense (DenseLayer)         512,600    307,800       W:{512,600}, b:{1,600}
dropout (DropoutLayer)     -,-        0             -                     
dense_1 (DenseLayer)       600,300    180,300       W:{600,300}, b:{1,300}
dropout_1 (DropoutLayer)   -,-        0             -                     
dense_2 (DenseLayer)       300,1      301           W:{300,1}, b:{1,1}    
dense_2_loss (LossLayer)   -,-        0             -                     
--------------------------------------------------------------------------
            Total Parameters:  488,401
        Trainable Parameters:  488,401
           Frozen Parameters:  0
==========================================================================

Could you show how you're loading your input? Are you directly loading the same npy files in both Keras and DL4J?

@agibsonccc
The code for loading data in both Python and Java is in my post.
Here is the link to the CSV used by both: [spam.csv - Google Drive](http://gdrive link)
The same CSV is used in both codes.

@luvk1412
I imported your data etc. and reproduced the F1 scores:

I tweaked your Keras script to make it easier to reproduce in DL4J:

It turns out the datasets are different. I haven't compared them yet; I just took your dataset as-is and directly loaded the numpy arrays.

I also added some code for batching the datasets, similar to what you would find in Keras.

I don't have time to look at your dataset (it's getting late here), but at least I found the difference. Feel free to ping me later.
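
In case it helps anyone reproduce this, a rough sketch of "directly loading the numpy arrays" plus Keras-style batching in DL4J (the .npy file names are assumptions):

import java.io.File;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.deeplearning4j.datasets.iterator.impl.ListDataSetIterator;

// Load the exact arrays numpy saved (np.save), so both frameworks train
// on identical data:
DataSet train = new DataSet(
    Nd4j.readNpy(new File("X_train.npy")),
    Nd4j.readNpy(new File("y_train.npy")));

// Mini-batches of 100 for 20 epochs, mirroring Keras'
// model.fit(X_train, y_train, epochs=20, batch_size=100):
DataSetIterator trainIter = new ListDataSetIterator<>(train.batchBy(100));
model.fit(trainIter, 20);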

Also, note the dropout config:
https://deeplearning4j.org/api/latest/org/deeplearning4j/nn/conf/dropout/Dropout.html
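
If I'm reading that javadoc right, dropOut(x) in DL4J is the probability of retaining an activation, while Keras' Dropout(rate) is the probability of dropping one. So the Keras equivalent of Dropout(0.2) would be (a sketch):

// Keras Dropout(0.2) drops 20% and keeps 80%; DL4J's dropOut takes the
// retain probability, so the matching value is 0.8, not 0.2:
.layer(1, new DropoutLayer.Builder().dropOut(0.8).build())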

@luvk1412 take a look at this:

I got rid of the shuffle with the test/train split and it seems to work fine.
I'll have to look into why that is…
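
In other words, roughly (a sketch of the change described; I'm assuming it just means dropping the in-place shuffle before the split):

// Before: allData.shuffle(42) followed by splitTestAndTrain(0.70)
// After: split directly, without the in-place shuffle
DataSet allData = iterator.next();
SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.70);
DataSet trainingData = testAndTrain.getTrain();
DataSet testData = testAndTrain.getTest();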