Concept

Automatic Speech Recognition

Automatic speech recognition (ASR) is a sequence-to-sequence learning task in which the input sequence is an audio recording of speech and the output sequence is a text transcript of the spoken words. A significant challenge in this domain is that audio and text do not align one-to-one: thousands of audio samples may correspond to a single word, so the input sequence is much longer than the output sequence.
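To make the length mismatch concrete, the following minimal sketch compares the size of the audio input with the size of its transcript. The function name `length_mismatch`, the 16 kHz sampling rate, and the 25 ms window with a 10 ms hop are illustrative assumptions, not values taken from the text above.

```python
def length_mismatch(duration_s, transcript, sample_rate=16_000,
                    win_ms=25, hop_ms=10):
    """Compare input length (audio) with output length (text).

    All defaults are assumed, illustrative values: 16 kHz audio,
    25 ms analysis windows, 10 ms hop between windows.
    """
    num_samples = round(duration_s * sample_rate)
    win = sample_rate * win_ms // 1000   # samples per analysis window
    hop = sample_rate * hop_ms // 1000   # samples between window starts
    # Number of full windows that fit in the recording (no padding).
    num_frames = 1 + (num_samples - win) // hop if num_samples >= win else 0
    words = transcript.split()
    return {
        "samples": num_samples,
        "frames": num_frames,
        "words": len(words),
        "characters": len(transcript),
        "samples_per_word": num_samples / len(words) if words else None,
    }


if __name__ == "__main__":
    # A hypothetical 2-second recording of the phrase "hello world".
    stats = length_mismatch(2.0, "hello world")
    for name, value in stats.items():
        print(f"{name}: {value}")
```

Under these assumptions, a two-second recording contains tens of thousands of samples and still a couple of hundred frames after windowing, while the transcript has only two words. A model therefore has to map many input steps onto each output token.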

Updated 2026-05-01

Tags

Data Science

D2L

Dive into Deep Learning @ D2L

Related