1Cademy - Automatic Speech Recognition

Learn Before

Frame-based Dialogue System
Natural language processing
Sequence-to-Sequence Learning

Concept

Automatic Speech Recognition

Automatic speech recognition (ASR) is a sequence-to-sequence learning task in which the input sequence is an audio recording of a speaker and the output sequence is a text transcript of the spoken words. A significant challenge in this domain is that audio and text do not align one-to-one: thousands of audio samples may correspond to a single word, so the input sequence is much longer than the output sequence.
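
To make the length mismatch concrete, the following minimal sketch counts the input and output lengths for a single spoken word. The 16 kHz sample rate, 25 ms window, 10 ms hop, the half-second duration, and the helper name `num_frames` are illustrative assumptions, not values taken from this page.

```python
def num_frames(num_samples: int, frame_len: int, hop_len: int) -> int:
    """Count the full frames produced by sliding a fixed-size window over the samples."""
    if num_samples < frame_len:
        return 0
    return 1 + (num_samples - frame_len) // hop_len


# Assumed front-end settings (common choices, but hypothetical here).
SAMPLE_RATE = 16_000                   # samples per second
FRAME_LEN = SAMPLE_RATE * 25 // 1000   # 25 ms window -> 400 samples
HOP_LEN = SAMPLE_RATE * 10 // 1000     # 10 ms hop    -> 160 samples

# Hypothetical utterance: one word spoken over half a second.
duration_ms = 500
transcript = "hello"

num_samples = SAMPLE_RATE * duration_ms // 1000      # 8000 raw audio samples
frames = num_frames(num_samples, FRAME_LEN, HOP_LEN)  # 48 feature frames

print(f"audio samples:     {num_samples}")
print(f"feature frames:    {frames}")
print(f"output characters: {len(transcript)}")
```

Under these assumptions, one five-character word corresponds to 8,000 raw samples, and still to 48 frames after windowing, so the model has to map a long input sequence onto a much shorter output sequence without a fixed alignment between the two.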