I'm training a CNN for a classification task and I split the dataset into three parts: 70% for training, 15% for validation, and the final 15% for testing. I use the training set to train the network and the validation set to choose the hyper-parameters. After all that work, I evaluated my model on the test set, and it turned out the model performed better on the test set than on the validation set (85% accuracy versus 80%).
Is this possible, or did I do something wrong? I put a lot of effort into improving the model's performance on the validation set, while the test set was never seen during the whole training process.
3 Answers
It is possible when your test set happens to be a better representation of your training data than the validation set is. Usually it means there is some problem in the way you split the data.
Try repeating the experiment after randomly shuffling the data and creating the splits again, to make sure the result is not due to a lucky split. Did you use stratified sampling when you created the splits?
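A minimal sketch of what re-shuffling and re-splitting could look like, assuming scikit-learn and NumPy arrays `X` and `y` (the random data below is only a stand-in for your own dataset):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data; replace with your own features and labels.
X = np.random.rand(1000, 32 * 32 * 3)
y = np.random.randint(0, 10, size=1000)

for seed in range(3):  # repeat the experiment with different shuffles
    # 70% train, 30% temporary hold-out, stratified on the labels
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.30, stratify=y, shuffle=True, random_state=seed
    )
    # split the hold-out in half: 15% validation, 15% test
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=seed
    )
    print(seed, X_train.shape, X_val.shape, X_test.shape)
```

If the gap between validation and test accuracy changes a lot from one seed to another, the original result was probably just a fortunate split rather than a modelling problem.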
Looks a bit strange. Just to be 100% sure:
1. Increase the validation set to 20-25%.
2. Use StratifiedKFold if you have not already.
3. Run the model several times and collect several scores (see the sketch below).
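A hedged sketch of points 2 and 3, using a simple scikit-learn classifier and synthetic data as stand-ins for the asker's CNN and dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic data for illustration only.
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=8, random_state=0)

run_means = []
for seed in range(5):  # point 3: several runs, several scores
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)  # point 2
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
    run_means.append(scores.mean())

print(f"accuracy {np.mean(run_means):.3f} +/- {np.std(run_means):.3f}")
```

The spread across runs tells you whether a 5-point difference between two single evaluations is even meaningful for your dataset size.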
If you use cross-validation in your training phase, the chance of underfitting is minimized. For example, the averaged performance of the model across the training folds should be only slightly better than the result on the test data.
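A minimal illustration of that comparison, again with scikit-learn and a synthetic dataset assumed in place of the asker's CNN setup:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000)
# Averaged performance on the training data via cross-validation ...
cv_scores = cross_val_score(
    model, X_train, y_train,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
# ... compared with the score on the held-out test set.
model.fit(X_train, y_train)
print(f"mean CV accuracy: {cv_scores.mean():.3f}")
print(f"test accuracy:    {model.score(X_test, y_test):.3f}")
```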