Case Study

Diagnose the data issue in a cat classifier's dev set.

Case context: You are developing a cat classifier. During error analysis of the dev set, you notice a picture of a dog that the algorithm correctly predicted as "not cat", but it was marked as an error because the target label was set to "cat".

Question: Diagnose the nature of this error. What is this type of issue called, who caused it, and what specific part of the data point (x, y) is incorrect?

Sample answer: This is a "mislabeled" example. The error was made by the human labeler who assigned the label, before the algorithm ever encountered the picture. In the data point (x, y), the input x (the picture of the dog) is fine; the class label y is the incorrect part, because it was set to "cat" when the correct value is "not cat".
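The diagnosis in the sample answer can be made concrete with a small sketch. It is illustrative only: the `Example` class, the `true_class` field (the result of a hypothetical manual review), and both helper functions are assumptions introduced here and are not part of the case study.

```python
from dataclasses import dataclass


@dataclass
class Example:
    x: str           # stand-in for the input image (here just a text description)
    y: str           # target label stored in the dev set, assigned by a human labeler
    true_class: str  # hypothetical: what a manual review says the image really is


def is_counted_as_error(example: Example, prediction: str) -> bool:
    """The dev-set metric only compares the prediction with the stored label y."""
    return prediction != example.y


def is_mislabeled(example: Example) -> bool:
    """An example is mislabeled when the stored label y disagrees with the true class."""
    return example.y != example.true_class


# The data point from the case: x is a dog picture, but y was set to "cat".
dog_photo = Example(x="picture of a dog", y="cat", true_class="not cat")
prediction = "not cat"  # the classifier's output, which is actually right

counted_as_error = is_counted_as_error(dog_photo, prediction)
mislabeled = is_mislabeled(dog_photo)

print(f"counted as error: {counted_as_error}, mislabeled: {mislabeled}")
```

In this sketch the prediction matches `true_class` but not `y`, so the example is counted as an error even though the fault lies in the label rather than in the classifier.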

Key points:

  • Identify the issue as a "mislabeled" example.
  • The error was introduced by a human labeler before the algorithm encountered the data.
  • The class label y in (x, y) has an incorrect value.

Rubric: The candidate must identify the issue as a "mislabeled" example. They must state that the error was made by a human labeler before the algorithm encountered the data. They must specify that the class label y is the part of the data point (x, y) that has the incorrect value.

Updated 2026-06-18

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI