Training Accuracy 100%, Validation Accuracy 0%, and Testing Accuracy 92% in Chapter 6 challenge 2

coder_rgv7 · October 22, 2020, 4:11pm

I finished chapter 6 and was doing the optional challenges at the end. I tried challenge 2, where you have to use resnet instead of squeezenet, and got some very strange metrics which I’m not understanding.
I looked at my training session and saw the stats as this:
(the last column is validation accuracy and the second last column is training accuracy).
Screenshot 2020-10-22 at 9.34.11 PM
I thought this was a classic case of overfitting, but out of curiosity I decided to see what would happen when I evaluated it on the test data. Surprisingly I got very good test stats:
Screenshot 2020-10-22 at 9.39.35 PM
I’m not sure why this is happening. Why is my model getting 100% training accuracy and 0% validation accuracy, but good test accuracy? Have I done something wrong with my code?
Or, since this is actually the second time I had to run the model since it was buggy in the first run, did the test data being exposed twice help the overfitted model?

hollance · October 22, 2020, 5:17pm

You can see that the validation loss is very high, and that it becomes higher over time (in your screenshot it starts out at 24 and goes up to 46) while the training loss is much smaller.

Turi Create chooses the validation set randomly on 5% of the training data. It’s possible that it just chose very badly this time. Since your model gives very good results on the test set, this seems to be very likely.