Speech recognition and graph transformer nets

This lecture introduces the problem of speech recognition with neural models, emphasizing the CTC loss for training and inference when the input and output sequences have different lengths. It also covers beam search for decoding at inference time, and how that procedure can be modeled at training time with a Graph Transformer Network. It is part of the Deep Learning course at CDS, a course covering the latest techniques in deep learning and representation learning, focusing on supervised and unsupervised deep learning, embedding methods, metric learning, convolutional and recurrent nets, with applications to computer vision, natural language understanding, and speech recognition. Prerequisites for this module: Modules 1 through 5 of this course, and Introduction to Data Science or a graduate-level machine learning course.
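To make the CTC idea concrete: CTC scores a target label sequence by summing over all frame-level alignments that collapse to it, where an alignment may insert blanks and repeat labels. The sum is computed with a forward (alpha) recursion over a blank-extended target. The sketch below is a minimal, illustrative pure-Python implementation of that recursion, not the lecture's own code; the function name and the convention that index 0 is the blank are assumptions for the example.

```python
import math

def ctc_log_prob(log_probs, target, blank=0):
    """Log-probability of `target` under CTC, given per-frame
    log_probs[t][k] over the vocabulary (index `blank` is the blank
    symbol). Uses the standard forward recursion over the
    blank-extended target sequence."""
    # Extended target: a blank before, between, and after every label.
    ext = [blank]
    for c in target:
        ext += [c, blank]
    S, T = len(ext), len(log_probs)
    NEG = float("-inf")

    def logadd(a, b):
        # Numerically stable log(exp(a) + exp(b)).
        if a == NEG:
            return b
        if b == NEG:
            return a
        m = max(a, b)
        return m + math.log(math.exp(a - m) + math.exp(b - m))

    # alpha[s]: log-prob of emitting ext[0..s] with the frames seen so far.
    alpha = [NEG] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]
    for t in range(1, T):
        new = [NEG] * S
        for s in range(S):
            a = alpha[s]                      # stay on the same symbol
            if s > 0:
                a = logadd(a, alpha[s - 1])   # advance by one symbol
            # Skip over a blank when consecutive labels differ.
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a = logadd(a, alpha[s - 2])
            new[s] = a + log_probs[t][ext[s]]
        alpha = new
    # The target is complete if we end on the last label or the final blank.
    return logadd(alpha[S - 1], alpha[S - 2] if S > 1 else NEG)
```

For example, with two frames of uniform probability over {blank, "a"}, three alignments ("a a", "a -", "- a") collapse to "a", so the CTC probability is 3 * 0.25 = 0.75. In practice one would use an optimized implementation such as PyTorch's `torch.nn.CTCLoss` rather than this didactic version.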

Topics covered in this lesson


00:00 – Guest lecturer introduction
01:10 – Outline
02:36 – Modern speech recognition
09:26 – Connectionist temporal classification
54:44 – Decoding with beam search (inference)
1:11:09 – Graph Transformer Networks