Overfitting vs Underfitting

These terms describe two opposing failure modes, both of which result in poor performance on new data.

Overfitting refers to a model that has learned the particulars of the training data too closely, including the noise in the dataset. A model that is overfit will not perform well on new, unseen data. Overfitting is arguably the most common problem in applied machine learning, and it is especially troublesome because a model that appears highly accurate during training will actually perform poorly in the wild.
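
A minimal sketch of overfitting in action, assuming an unconstrained decision tree on deliberately noisy synthetic data (the dataset and model choice are illustrative, not from the original text):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy binary classification problem: flip_y injects label noise.
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# With no depth limit, the tree can grow until it fits every training
# point, noise included -- the definition of overfitting.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

print(f"train accuracy: {tree.score(X_train, y_train):.2f}")  # near 1.00
print(f"test accuracy:  {tree.score(X_test, y_test):.2f}")    # noticeably lower
```

The large gap between training and test accuracy is the telltale sign: the model looks highly accurate on the data it memorized but performs much worse on unseen examples.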

Underfitting typically refers to a model that has not been trained sufficiently, whether due to too little training time or a model too simple to capture the underlying structure of the data. A model that is underfit will perform poorly on the training data and on new, unseen data alike.
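
One way to see both failure modes side by side is to fit polynomials of increasing degree to the same noisy data. The sine-wave dataset and the specific degrees below are assumptions for the sake of example:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, size=60)
X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]

# degree 1 underfits (a line cannot capture the sine shape),
# degree 4 fits reasonably, degree 15 chases the noise.
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

The underfit model has high error everywhere; the overfit model has low training error but high test error; the middle model does best on the held-out data.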

Both underfitting and overfitting yield poor performance; the sweet spot lies between these two extremes. As the number of training iterations increases, the parameters of the model are updated and the model typically moves from underfitting, through an optimal fit, to overfitting.
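
That progression over training iterations can be observed directly by tracking held-out error at each stage of an iteratively trained model. A sketch, assuming a gradient-boosted regressor on synthetic data (both are illustrative choices):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=10, noise=20.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = GradientBoostingRegressor(n_estimators=500, learning_rate=0.1,
                                  max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Validation error at each boosting stage: it falls while the model is
# still underfit, bottoms out near the sweet spot, and often creeps back
# up as the model begins to overfit.
val_errors = [mean_squared_error(y_val, pred)
              for pred in model.staged_predict(X_val)]
best_stage = int(np.argmin(val_errors))
print(f"best iteration: {best_stage + 1} of {len(val_errors)}")
```

Stopping training near the minimum of the validation-error curve is the intuition behind early stopping.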

The optimal state is referred to as generalization. This is where the model performs well on both the training data and new data not seen during the training process.
