Case Study

Evaluating an End-to-End Approach for a Dictation Application

Case context: You are a machine learning engineer tasked with building a voice dictation feature. You have access to a large dataset of audio clips paired with their accurate text transcripts.

Question: Based on the principles outlined in Machine Learning Yearning, why might an end-to-end learning architecture be a suitable choice for this project?

Sample answer: An end-to-end architecture is suitable because the project has the labeled (input, output) pairs that this approach requires: a large set of audio clips with accurate transcripts. The text specifically highlights speech recognition as a task where end-to-end learning works well, with the system taking the audio clip as input and directly outputting the transcript.

Key points:

  • End-to-end speech recognition is known to work well.
  • The project has the necessary (input, output) labeled pairs.
  • The system can directly output the transcript from the audio input.

Rubric: The response must recommend the end-to-end approach, justify it by citing the availability of labeled pairs, and note the proven success of end-to-end speech recognition.
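
To make the architectural contrast concrete, here is a minimal Python sketch, not an implementation from the text. It assumes a hypothetical three-stage pipeline (feature extraction, phoneme recognition, word decoding) as the alternative to end-to-end learning. All function names and the toy stand-in stages are illustrative assumptions. The point it shows is that the end-to-end transcriber needs only (audio, transcript) pairs, while each pipeline stage would need its own design or intermediate labels.

```python
from typing import Callable, List, Sequence

Audio = Sequence[float]  # stand-in for raw waveform samples (assumption)


def make_pipeline(
    extract_features: Callable[[Audio], List[float]],
    recognize_phonemes: Callable[[List[float]], List[str]],
    decode_words: Callable[[List[str]], str],
) -> Callable[[Audio], str]:
    """Chain three separately built stages into one transcriber."""
    def transcribe(audio: Audio) -> str:
        return decode_words(recognize_phonemes(extract_features(audio)))
    return transcribe


def make_end_to_end(model: Callable[[Audio], str]) -> Callable[[Audio], str]:
    """Wrap a single model that maps audio directly to a transcript."""
    def transcribe(audio: Audio) -> str:
        return model(audio)
    return transcribe


# Toy stand-ins so the sketch runs; real stages would be learned models.
def toy_features(audio: Audio) -> List[float]:
    return [abs(x) for x in audio]


def toy_phonemes(features: List[float]) -> List[str]:
    return ["HH", "AY"] if features else []


def toy_decoder(phonemes: List[str]) -> str:
    return "hi" if phonemes == ["HH", "AY"] else ""


def toy_end_to_end_model(audio: Audio) -> str:
    return "hi" if audio else ""


pipeline_transcribe = make_pipeline(toy_features, toy_phonemes, toy_decoder)
end_to_end_transcribe = make_end_to_end(toy_end_to_end_model)

clip = [0.1, -0.2, 0.3]
print(pipeline_transcribe(clip))
print(end_to_end_transcribe(clip))
```

Both transcribers expose the same audio-to-text interface; the difference lies in what must be built and labeled to obtain them, which is why the availability of paired audio and transcripts matters for the dictation project.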

Updated 2026-05-27

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI