How does a learning curve give insight into whether the model is under- or over-fitting?

An underfit model performs poorly even on the data it was trained on, showing for example a high RMSE or misclassification rate on the training set. An overfit model appears to perform well on the training data, but its metrics deteriorate sharply on a validation set: for example, a low RMSE on the training set but a high RMSE on the validation set.
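To make the comparison concrete, here is a minimal sketch in plain Python. The data set, the two toy models, and all names (`rmse`, `constant_model`, `memorizer_model`) are assumptions chosen for illustration, not part of any particular library. The constant model stands in for an underfit model, and the nearest-point memorizer stands in for an overfit one.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Synthetic data (an assumption for illustration): a linear trend plus a
# deterministic wiggle that stands in for noise.
xs = list(range(40))
ys = [2 * x + 3 * math.sin(1.7 * x) for x in xs]
x_train, y_train = xs[0::2], ys[0::2]  # even x values
x_val, y_val = xs[1::2], ys[1::2]      # odd x values

def constant_model(x):
    """Underfit candidate: ignores x and always predicts the training mean."""
    return sum(y_train) / len(y_train)

def memorizer_model(x):
    """Overfit candidate: returns the target of the nearest training point."""
    nearest = min(range(len(x_train)), key=lambda i: abs(x_train[i] - x))
    return y_train[nearest]

constant_train = rmse(y_train, [constant_model(x) for x in x_train])
constant_val = rmse(y_val, [constant_model(x) for x in x_val])
memorizer_train = rmse(y_train, [memorizer_model(x) for x in x_train])
memorizer_val = rmse(y_val, [memorizer_model(x) for x in x_val])

print(f"constant  model: train RMSE {constant_train:.2f}, validation RMSE {constant_val:.2f}")
print(f"memorizer model: train RMSE {memorizer_train:.2f}, validation RMSE {memorizer_val:.2f}")
```

The constant model should score badly on both sets, which is the underfitting signature. The memorizer reproduces the training targets exactly, so its training RMSE is zero, while its validation RMSE is larger, which is the gap that signals overfitting.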

A learning curve is a diagnostic tool that plots the error metric used to evaluate a machine learning algorithm, for both the training and validation data, at each iteration of the algorithm. In most cases, the training error, or deviance, continues to decrease as training proceeds, while the validation error decreases for a number of iterations and then starts to increase. The point at which the validation error first begins to rise indicates an appropriate number of iterations for balancing the bias/variance tradeoff. A classic learning curve is drawn below, with the optimal stopping point marked.

[Figure: learning curve]
A typical learning curve for training and validation sets
Source: Pocketdentistry
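The stopping point described above can be read off a recorded validation curve with a few lines of Python. This is a minimal sketch: the function name `stopping_iteration`, the `min_delta` tolerance, and the error values are all hypothetical, made up to mimic the shape of a typical curve.

```python
def stopping_iteration(val_errors, min_delta=0.0):
    """Return the index of the last iteration before validation error first rises.

    min_delta is an assumed tolerance: increases no larger than this are
    treated as noise rather than as a genuine rise.
    """
    for i in range(1, len(val_errors)):
        if val_errors[i] > val_errors[i - 1] + min_delta:
            return i - 1
    # Validation error never rose, so the last iteration is the best seen.
    return len(val_errors) - 1

# Illustrative, made-up error values (one per iteration).
train_errors = [0.85, 0.62, 0.46, 0.35, 0.28, 0.23, 0.19, 0.16]
val_errors = [0.90, 0.70, 0.55, 0.48, 0.45, 0.47, 0.52, 0.60]

best = stopping_iteration(val_errors)
print(f"stop after iteration index {best}: "
      f"train error {train_errors[best]}, validation error {val_errors[best]}")
```

Real validation curves are usually noisier than this, so in practice a nonzero `min_delta`, or waiting a few iterations before declaring a rise, may be a safer choice than stopping at the very first uptick.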

If a model is significantly underfit, both the training and validation error will be high and will not improve much over further iterations. If the training error has not largely flattened out by the last few iterations, the number of iterations is probably too small for the algorithm to learn the data adequately. On the other hand, if the training error has been flat for many iterations while the validation error is increasing, the model is overfitting at that stage, and the number of iterations should be reduced to the point at which the validation error first begins to rise.

[Figures: underfitting learning curve; overfitting learning curve]
Understanding the phenomenon of overfitting, underfitting and perfect fit using learning curves
Source: HIStalk. Image further annotated by AIML.com for better visibility
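The rules of thumb above can be sketched as a small Python function. Everything here is an assumption for illustration: the name `diagnose`, the `high_error` threshold (what counts as "high" depends entirely on the metric and the problem), the `flat_tol` tolerance, and the `window` of recent iterations. One simplification to note: the sketch reports overfitting whenever validation error is rising over the window, whether or not the training error has already flattened.

```python
def diagnose(train_err, val_err, high_error, flat_tol=0.01, window=3):
    """Rough reading of a learning curve from its last `window` iterations."""
    # Positive value means training error is still improving.
    train_change = train_err[-window - 1] - train_err[-1]
    # Positive value means validation error is getting worse.
    val_change = val_err[-1] - val_err[-window - 1]

    if val_change > flat_tol:
        return "overfit: reduce iterations to where validation error first rises"
    if train_change > flat_tol:
        return "still learning: increase the number of iterations"
    if train_err[-1] >= high_error and val_err[-1] >= high_error:
        return "underfit: both errors are high and have stopped improving"
    return "reasonable fit"

# Made-up curves, one error value per iteration.
underfit = diagnose([0.90, 0.88, 0.87, 0.87, 0.87, 0.87],
                    [0.95, 0.93, 0.92, 0.92, 0.92, 0.92], high_error=0.5)
too_short = diagnose([0.90, 0.75, 0.62, 0.51, 0.42, 0.35],
                     [0.95, 0.82, 0.70, 0.60, 0.52, 0.46], high_error=0.5)
overfit = diagnose([0.90, 0.50, 0.20, 0.10, 0.10, 0.10, 0.10],
                   [0.95, 0.60, 0.35, 0.30, 0.34, 0.40, 0.47], high_error=0.5)
good = diagnose([0.90, 0.50, 0.25, 0.15, 0.15, 0.15, 0.15],
                [0.95, 0.58, 0.33, 0.22, 0.22, 0.22, 0.22], high_error=0.5)

for label in (underfit, too_short, overfit, good):
    print(label)
```

A check like this is only a first pass. The thresholds need tuning per problem, and plotting the two curves remains the more reliable way to judge the fit.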
