Algorithms for speech and natural language processing
E. DUPOUX, B. SAGOT

Prerequisites

Basic linear algebra, calculus, probability theory

Course objectives

Speech and natural language processing is a subfield of artificial intelligence used in an increasing number of applications; yet, while some aspects are on a par with human performance, others lag behind. This course presents the full stack of speech and language technology, from automatic speech recognition to parsing and semantic processing. At each level, it introduces the key principles, algorithms and mathematical foundations behind the state of the art, and confronts them with what is known about human speech and language processing. Students will acquire detailed knowledge of the scientific issues and computational techniques in automatic speech and language processing, and will gain hands-on experience in implementing and evaluating the most important algorithms.

Topics:

– speech features & signal processing

– hidden Markov & finite-state modeling

– probabilistic parsing

– continuous embeddings

– deep learning for language-related tasks (DNNs, RNNs)

– linguistics and psycholinguistics

– comparing human and machine performance

Organization of sessions

Eight lectures (2h each) and six practical assignments (1h Q&A sessions each) built around the implementation of key algorithms. For the assignments, students are provided with the necessary data and Python code, and hand in their source code together with a report of at most two pages detailing their work, the difficulties encountered and the results.

1. Speech and language processing: Principles and applications (2h)

Presentation of the speech and language processing stack, its main computational challenges, and its main practical applications.

No assignment.

2. Speech I: Acoustic modeling (2h + 1h)

Algorithms: signal processing, speech features, source separation, GMMs, DNNs

Human processing: auditory neuroscience, psychoacoustics, speech perception, scene analysis

Assignment: Evaluating Speech Features

Given a set of modules (filterbank, compressor, expander, spectrogram), perform an ABX discrimination test on different feature combinations and evaluate classification performance.
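To make the evaluation concrete, here is a minimal sketch of an ABX discrimination test over precomputed feature matrices. The DTW-based distance and the (A, B, X) triple format are illustrative assumptions; the actual interface is defined by the code provided with the assignment.

```python
import numpy as np

def dtw_distance(x, y):
    """Dynamic-time-warping distance between two feature matrices
    (frames x dims), using frame-wise Euclidean cost."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m] / (n + m)  # length-normalized

def abx_score(triples):
    """Fraction of (A, B, X) triples where X (same category as A)
    is closer to A than to B -- chance level is 0.5."""
    correct = sum(dtw_distance(a, x) < dtw_distance(b, x)
                  for a, b, x in triples)
    return correct / len(triples)
```

Length normalization keeps the DTW distance comparable across tokens of different durations; better features push the ABX score from 0.5 toward 1.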

Readings

J&M2 7.1-7.4, 9.3

3. Speech II: Language modeling (2h+1h)

Algorithms: HMM decoding (forward, Viterbi), RNNs, LSTMs, end-to-end CTC, embeddings

Human processing: speaker and accent adaptation, L2 perception

Speech application: orthographic transcription

Assignment: HMM Decoder

Given a trained GMM-HMM, build a decoder and evaluate the phone error rate on a test set.
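For orientation, here is a minimal log-space Viterbi sketch over precomputed per-frame emission log-likelihoods (as the trained GMM-HMM would supply); the array shapes and names are assumptions, not the assignment's actual interface.

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """Most likely state sequence for an HMM.
    log_init: (S,) initial log-probs; log_trans: (S, S) transition
    log-probs; log_emit: (T, S) per-frame emission log-likelihoods."""
    T, S = log_emit.shape
    delta = log_init + log_emit[0]       # best score ending in each state
    back = np.zeros((T, S), dtype=int)   # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_trans   # (S, S): prev -> cur
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
    # Trace back the best path from the best final state
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

The phone error rate is then the Levenshtein distance between the decoded and reference phone sequences, normalized by the reference length.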

Readings

J&M2 9.1-9.8

4. Language I: Formal Grammars and Syntax (2h+1h)

Algorithms: the Chomsky hierarchy, finite-state transducers, context-free grammars, mildly context-sensitive formalisms

Human processing: linguistic typology

Assignment: POS Tagger

Given training data, build a hidden Markov model part-of-speech tagger and evaluate its accuracy on a test set.
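As a starting point, here is a sketch of the counting step, assuming the training data comes as lists of (word, tag) pairs; add-alpha smoothing is one simple choice among many. Decoding then reuses the Viterbi algorithm from session 3.

```python
from collections import Counter, defaultdict

def train_hmm(tagged_sents, alpha=1.0):
    """Estimate HMM transition and emission probabilities from
    tagged sentences [[(word, tag), ...], ...] with add-alpha smoothing."""
    trans, emit = defaultdict(Counter), defaultdict(Counter)
    for sent in tagged_sents:
        prev = "<s>"
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            prev = tag
        trans[prev]["</s>"] += 1
    tags = set(emit)

    def p_trans(prev, tag):
        c = trans[prev]
        return (c[tag] + alpha) / (sum(c.values()) + alpha * (len(tags) + 1))

    def p_emit(tag, word):
        # Crude smoothing so unseen words get nonzero mass
        c = emit[tag]
        return (c[word.lower()] + alpha) / (sum(c.values()) + alpha * (len(c) + 1))

    return tags, p_trans, p_emit
```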

Readings

J&M3, ch. 9 & 10

5. Language II: Parsing (2h+1h)

Algorithms: PCFGs, treebanks, estimation, chart & dependency parsing

Human processing: syntax, garden paths, acquisition, dependency, hierarchy

Assignment: Probabilistic Parsing

Given a treebank, extract a PCFG, implement a CKY parser and evaluate the parsing F1 score on a test set.
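A minimal probabilistic CKY sketch, assuming the extracted PCFG has been binarized into Chomsky normal form and stored in the two dictionaries below; backpointers for recovering the tree are omitted for brevity.

```python
from collections import defaultdict

def cky(words, lexical, binary):
    """Probabilistic CKY over a PCFG in Chomsky normal form.
    lexical: {word: [(lhs, logprob), ...]}
    binary:  {(B, C): [(lhs, logprob), ...]}
    Returns chart[(i, j)] = {label: best logprob for span words[i:j]}."""
    n = len(words)
    chart = defaultdict(dict)
    for i, w in enumerate(words):                 # terminal spans
        for lhs, lp in lexical.get(w, []):
            chart[(i, i + 1)][lhs] = lp
    for span in range(2, n + 1):                  # longer spans, bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):             # all split points
                for B, lp_b in chart[(i, k)].items():
                    for C, lp_c in chart[(k, j)].items():
                        for lhs, lp in binary.get((B, C), []):
                            cand = lp + lp_b + lp_c
                            if cand > chart[(i, j)].get(lhs, float("-inf")):
                                chart[(i, j)][lhs] = cand
    return chart
```

The score of the best full parse is chart[(0, n)] under the start symbol, if the sentence is covered at all.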

Readings

M. Covington (2001). A Fundamental Algorithm for Dependency Parsing. http://www.stanford.edu/~mjkay/covington.pdf

J&M3, ch. 12 & 13

6. Language III: Language Processing in the wild (2h+1h)

Algorithms: text normalization, coreference, distributional semantics, word embeddings

Human processing: conversational & casual language

Assignment: Evaluating Topic Models

Given a dataset of documents and human topic annotations, correlate different topic models with human judgements.
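A rank correlation is the usual way to quantify such agreement; the sketch below uses SciPy's spearmanr, with made-up numbers standing in for per-topic model scores and human interpretability ratings.

```python
from scipy.stats import spearmanr

def human_model_agreement(model_scores, human_scores):
    """Spearman rank correlation between per-topic model scores
    (e.g. average word coherence) and human interpretability ratings."""
    rho, pval = spearmanr(model_scores, human_scores)
    return rho, pval

# Illustrative numbers only:
rho, p = human_model_agreement([0.42, 0.17, 0.35, 0.08],
                               [4.5, 2.0, 3.8, 1.5])
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```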

Readings

J. Chang, J. Boyd-Graber, C. Wang, S. Gerrish, and D. Blei (2009). Reading Tea Leaves: How Humans Interpret Topic Models. Neural Information Processing Systems. http://www.umiacs.umd.edu/~jbg/docs/nips2009-rtl.pdf

7. Language IV: Lost in translation (2h+1h)

Algorithms: CNNs, sequence-to-sequence RNNs, attentional models, graph models

Assignment: Translation Errors

Find systematic patterns of errors in Google Translate.
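One cheap probing strategy is round-trip translation. In the sketch below, translate is passed in as a function because the backend is whatever the student has access to; it is a hypothetical placeholder, not a specific real API.

```python
def round_trip_gaps(sentences, translate, src="en", pivot="fr"):
    """Yield sentences that change after a src -> pivot -> src round-trip,
    a cheap way to surface systematic error patterns (agreement, gender,
    idioms). `translate(text, src, tgt)` is assumed to return a string;
    it stands in for whatever translation backend is available."""
    for s in sentences:
        back = translate(translate(s, src, pivot), pivot, src)
        if back.strip().lower() != s.strip().lower():
            yield s, back
```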

8. Open questions and hot topics (2h)

– Linguistic and non-linguistic context

– Unsupervised learning

– Domain adaptation & zero-shot learning

References

The recommended, but not mandatory, textbook for the course is D. Jurafsky & J. Martin, Speech and Language Processing: the 3rd (online) edition for the chapters already available [J&M3], the 2nd edition otherwise [J&M2]. Readings for each session will be provided by the instructors.

Assessment

Grading is based on the six homework assignments; the final grade is computed from the five best.

Instructors

Emmanuel Dupoux

(INRIA CoML)

Benoît Sagot

(INRIA ALMANACH)
