Learn Before
Applying learning curves to diagnose performance
Case context: You are developing a machine learning model, and its current dev-set error is unacceptably high. You want to determine whether spending months gathering more training data will actually reduce the error.
Question: How should you use a learning curve in this situation to inform your decision?
Sample answer: You should plot a learning curve: train the model on progressively larger subsets of your current training data, then compute and graph the dev-set error at each training set size. If the dev-set error is still trending downward as the number of training examples increases, extrapolating the curve suggests that more data will likely help. If the curve has flattened out, more data is unlikely to reduce the error significantly.
Key points:
- Plot dev-set error against training set size
- Observe the trend as training examples increase
- Extrapolate to make a decision about gathering data
Rubric: The answer should identify plotting the dev-set error against the number of training examples to observe the trend and evaluate the benefit of more data.
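The procedure in the sample answer can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the synthetic data, the toy threshold model, and the names `make_data`, `fit_threshold`, `learning_curve`, and `still_improving` are all hypothetical, and the `min_gain` cutoff is an arbitrary assumption. In practice you would substitute your own model and dev set.

```python
import random

def make_data(n, rng):
    """Hypothetical synthetic task: label is 1 when x plus noise is positive."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x + rng.gauss(0.0, 0.3) > 0 else 0
        data.append((x, y))
    return data

def fit_threshold(train):
    """Toy model (an assumption): threshold at the midpoint of the class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    if not xs0 or not xs1:
        return 0.0  # fallback when a tiny subset contains only one class
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def error_rate(threshold, data):
    """Fraction of examples the threshold model misclassifies."""
    wrong = sum(1 for x, y in data if (1 if x > threshold else 0) != y)
    return wrong / len(data)

def learning_curve(train, dev, sizes):
    """Train on the first m examples for each m and record the dev-set error."""
    curve = []
    for m in sizes:
        model = fit_threshold(train[:m])
        curve.append((m, error_rate(model, dev)))
    return curve

def still_improving(curve, min_gain=0.01):
    """Crude heuristic: did dev error drop by at least min_gain over the last step?"""
    (_, prev), (_, last) = curve[-2], curve[-1]
    return prev - last >= min_gain

rng = random.Random(0)
train = make_data(1000, rng)
dev = make_data(500, rng)  # the dev set stays fixed across all training sizes

curve = learning_curve(train, dev, sizes=[10, 30, 100, 300, 1000])
for m, err in curve:
    print(f"{m:5d} examples -> dev error {err:.3f}")
print("more data likely to help:", still_improving(curve))
```

Comparing only the last two points is a deliberately simple stand-in for extrapolating by eye. With small training subsets the curve can be noisy, so it is usually worth inspecting the whole trend before deciding.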
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Machine Learning Yearning @ DeepLearning.AI
Related
Constructing a Learning Curve by Varying Training Set Size
Dev-Set Error Should Decrease as Training Set Size Increases
Desired Error Rate for a Learning Algorithm
Using a Dev-Error Learning Curve to Estimate the Benefit of More Data
Training Error Plot for Estimating the Effect of More Data
Interpreting Learning Curves with Training and Dev Error
Small Training Sets Can Make Learning Curves Noisy
Identifying the axes of a learning curve
Purpose of a learning curve
A learning curve plots your _____ error against the number of training examples.
Components of a learning curve
Steps to construct a learning curve
Analyzing the utility of learning curves
Applying learning curves to diagnose performance
Defining a learning curve
The dependent variable in a learning curve
Informational value of learning curves