Diagnosing Hidden Errors in a Cat Classifier Dev Set
Case context: You are building a cat classifier. Your dev set has 1,000 images, and your algorithm misclassified 50 of them. To improve the quality of your dev set labels, your colleague suggests reviewing only those 50 misclassified images to find and fix any incorrect labels.
Question: What is the flaw in your colleague's proposed label review strategy, and what specific type of error might remain undetected if you follow their advice?
Sample answer: The flaw is that the review covers only the 50 misclassified images and ignores the 950 images counted as correctly classified, which may also contain labeling errors. Reviewing only the misclassified images cannot detect the case where an image carries an incorrect label and the algorithm predicts that same incorrect label. Because the prediction matches the label, the example is counted as correct even though both are actually wrong.
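The hidden-error case described above can be made concrete with a small sketch. The six toy examples, the `true_labels` list (standing in for what a careful re-review would find), and the helper name `review_buckets` are all assumptions made up for illustration; they are not part of the original case.

```python
# Toy data; every value below is invented for illustration.
# 1 = cat, 0 = not cat.
true_labels = [1, 0, 1, 0, 0, 1]  # hypothetical ground truth from a careful re-review
dev_labels = [1, 0, 1, 1, 1, 1]   # labels recorded in the dev set (indices 3 and 4 are wrong)
predictions = [1, 0, 0, 1, 0, 1]  # classifier outputs


def review_buckets(true_labels, dev_labels, predictions):
    """Split label errors by whether a misclassified-only review would see them."""
    visible, hidden = [], []
    for i, (t, y, p) in enumerate(zip(true_labels, dev_labels, predictions)):
        if y == t:
            continue  # the recorded label is correct, nothing to fix
        if p != y:
            visible.append(i)  # counted as misclassified, so it gets reviewed
        else:
            hidden.append(i)  # prediction matches the wrong label, so it looks correct
    return visible, hidden


visible, hidden = review_buckets(true_labels, dev_labels, predictions)
print("label errors a misclassified-only review would find:", visible)
print("label errors hidden among 'correct' examples:", hidden)
```

In this toy set, the label error at index 4 shows up as a misclassification and would be reviewed, while the one at index 3 is counted as correctly classified and would be found only by also checking the correctly classified examples.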
Key points:
- The strategy fails to review correctly classified examples.
- It misses cases where both the label and the prediction are wrong.
- An incorrect prediction matching an incorrect label appears as a correctly classified example.
Rubric: The response must identify that the strategy ignores correctly classified examples. It must also explain the specific undetected error: when an incorrect original label matches an incorrect prediction made by the learning algorithm.
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Machine Learning Yearning @ DeepLearning.AI