
CNNs overfitting on FER2013


I want to train a few models (MobileNet, Xception, and ResNet50) for facial emotion recognition. I am using the FER2013 dataset, but I don't need to recognize all of the included emotions, only sad, angry, fearful, neutral, and happy, so there are 5 labels in total. Because the dataset is imbalanced, I applied class weights. I'm training the models from scratch with Keras and TensorFlow.
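For reference, one common way to derive class weights is the "balanced" heuristic, n_samples / (n_classes * count). The sketch below is only an illustration of that heuristic under assumed inputs: the helper name `balanced_class_weights` and the toy label list are hypothetical, and this is not necessarily how the weights in my runs were computed.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class_index: weight} with weight = n_samples / (n_classes * count).

    Rarer classes get larger weights, so every class contributes the same
    total weight to the loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in sorted(counts.items())}


# Hypothetical toy labels (NOT the real FER2013 distribution):
# class 0 is twice as frequent as classes 1 and 2.
toy_labels = [0] * 50 + [1] * 25 + [2] * 25
weights = balanced_class_weights(toy_labels)
```

A dictionary of this shape can be passed to Keras through the `class_weight` argument of `model.fit`.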

Based on Papers with Code (~70% for Inception, for example), I would expect to reach an accuracy of around 70% or even higher, since those results are for the full 7-class dataset while I am only using 5 classes.

Unfortunately, the highest accuracy the models reach before they start to overfit is ~65% (Xception), ~62% (ResNet50), and ~63% (MobileNet).

For data augmentation I'm using the following transformations:

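import tensorflow as tf

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,           # scale pixel values to [0, 1]
    width_shift_range=0.1,    # random horizontal shift, up to 10% of the width
    height_shift_range=0.1,   # random vertical shift, up to 10% of the height
    zoom_range=0.1,           # random zoom in the range [0.9, 1.1]
    fill_mode='constant',     # fill pixels outside the input with a constant...
    cval=0,                   # ...of value 0 (black)
    horizontal_flip=True,     # random left-right flips
)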

I'm using the SGD optimizer with an initial learning rate of 1e-3, momentum of 0.9, and weight decay of 1e-4 (I also tried 1e-6 and 1e-2, with no improvement). The learning rate is halved whenever training stagnates for 10 epochs. The batch size is 16; a batch size of 8 gave no improvement and only made training longer.
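To make the schedule concrete, here is a minimal pure-Python sketch of the halve-on-stagnation rule. It assumes the monitored quantity is the validation loss and that "stagnation" means 10 consecutive epochs without a new best value; the function name `halve_on_plateau` and the loss values are hypothetical. In Keras, similar behaviour is available through `tf.keras.callbacks.ReduceLROnPlateau` with `factor=0.5` and `patience=10`, although that callback has extra options (such as `min_delta` and `cooldown`) that this sketch ignores.

```python
def halve_on_plateau(val_losses, initial_lr=1e-3, patience=10, factor=0.5):
    """Return the learning rate in effect at each epoch.

    The rate is multiplied by `factor` once `patience` consecutive epochs
    pass without a new best validation loss.
    """
    lr = initial_lr
    best = float("inf")
    wait = 0  # epochs since the last improvement
    schedule = []
    for loss in val_losses:
        schedule.append(lr)  # rate used during this epoch
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor  # applies from the next epoch on
                wait = 0
    return schedule


# Hypothetical validation losses: three improving epochs, then a plateau.
losses = [1.5, 1.3, 1.2] + [1.25] * 12
lrs = halve_on_plateau(losses)
```

With these made-up losses, the rate stays at 1e-3 through the tenth stagnating epoch and drops to 5e-4 for the epochs after it.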

As an example, here are the metrics from training Xception (batch size = 16, initial learning rate = 0.001, momentum = 0.9, weight decay = 1e-4):

[Plots of training accuracy, testing accuracy, training loss, and testing loss; images not preserved.]

The best accuracy for this model was 65.64%.

What could be improved in my training method? Is there any way to achieve better results?

