Algorithms for speech and natural language processing
C. CLAVEL, D. SEDDAH, R. BAWDEN, G. WISNIEWSKI, B. SAGOT
Modelling, Natural Language Processing

Prerequisites

Basic linear algebra, calculus, probability theory

Course objective

Speech and natural language processing is a subfield of artificial intelligence used in an increasing number of applications. This course provides an overview of the techniques and tasks involved in the automatic processing of text and speech, and goes into their details. It covers some of the history of the field, the representation of textual and speech data, language modelling, machine translation, sentiment analysis and other labelling tasks, chatbots, and speech synthesis and recognition. The aim is to present the key principles, algorithms and mathematical foundations behind the state of the art, and to confront them with the reality of processing real data.

 

Learn more: https://github.com/rbawden/MVA_2024_SL/

Session organisation

The course consists of 7 three-hour slots.

Each three-hour slot will have a lecture lasting approximately two hours, followed by a quiz and Q&As.

Assessment

Evaluation consists of 2 parts:

  • Quizzes (30% of the total grade): You will be given a link to an online questionnaire (Google Form), which opens at exactly 6:00pm and closes at a time decided on the fly by the professors, generally 6:30pm, giving you about 30 minutes to complete it. Any form submitted after the deadline is automatically rejected and graded as zero. The quizzes contain comprehension questions, and the best 5 grades out of the 6 quizzes are used for the average. Between 6:30pm and 7:00pm there is a Q&A period in which you can ask questions about the course and the quiz.
  • Final exam (70% of the total grade): This year (due to time constraints), there will be a final written exam with theory questions on topics covered in the lectures.
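
As an illustration of how the two parts combine, here is a minimal sketch of the weighting described above. The function name `final_grade` is hypothetical, and the grading scale and any rounding applied by the instructors are not specified in this description; the sketch only assumes that quiz and exam grades are on the same scale.

```python
def final_grade(quiz_grades, exam_grade, n_best=5,
                quiz_weight=0.30, exam_weight=0.70):
    """Combine quiz and exam grades using the weights stated in the syllabus.

    Assumptions (not stated in the course description): all grades share
    one scale, and no rounding is applied. A late or missing quiz is
    passed in as 0.
    """
    if len(quiz_grades) < n_best:
        raise ValueError(f"expected at least {n_best} quiz grades")
    # Keep only the n_best highest quiz grades (best 5 out of 6 here).
    best = sorted(quiz_grades, reverse=True)[:n_best]
    quiz_average = sum(best) / n_best
    return quiz_weight * quiz_average + exam_weight * exam_grade


# Hypothetical student: one quiz submitted late (graded 0), exam grade 15.
example = final_grade([10, 12, 14, 16, 18, 0], 15)
```

In this example the late quiz is the one dropped, so the quiz average is 14 and the total is 0.3 × 14 + 0.7 × 15 = 14.7.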

References

The recommended (but not obligatory) textbook for the course is D. Jurafsky & J. Martin, Speech and Language Processing: the 3rd (online) edition for chapters that are already available [J&M3], and the 2nd edition otherwise [J&M2]. Readings for each session will be provided by the instructors.

Topics covered

Topics:

  • speech features & signal processing
  • hidden Markov & finite-state modelling
  • word embeddings
  • deep learning for NLP (RNNs, transformers)
  • neural language modelling, including large language models (LLMs)
  • machine translation
  • sentiment analysis
  • sequence labelling tasks
  • chatbots
  • evaluation: comparing human and machine performance
  • speech synthesis and speech recognition

Instructors

Chloé Clavel

(INRIA)

Djamé Seddah

(INRIA)

Rachel Bawden

(INRIA)

Guillaume Wisniewski

(Université Paris)

Benoît Sagot

(INRIA ALMANACH)
