Reinforcement learning
M. PIROTTA
Learning, Machine Learning

Prerequisites

Basics of probability and statistics (level: third-year undergraduate (L3) in mathematics, or GE)

Course objective

Introduction to the models and mathematical tools used to formalize the problem of learning and decision-making under uncertainty. In particular, we will focus on the frameworks of reinforcement learning and multi-armed bandits.

Session organization

  • 8 two-hour lectures
  • 3 three-hour tutorial sessions

Assessment

Assessment is by project: reading papers of interest, and implementing or theoretically analyzing reinforcement learning algorithms. The project is evaluated on the basis of a short report and an oral presentation.

References

  • Processus decisionnels de Markov et Intelligence Artificielle, 2008. Edited by O. Sigaud and O. Buffet.
  • Neuro-Dynamic Programming, Bertsekas and Tsitsiklis, 1996.

Topics covered

  • Historical multi-disciplinary basis of reinforcement learning
  • Markov decision processes and dynamic programming
  • Stochastic approximation and Monte-Carlo methods
  • Function approximation and statistical learning theory
  • Approximate dynamic programming
  • Introduction to stochastic and adversarial multi-armed bandits
  • Learning rates and finite-sample analysis

Instructors

Matteo PIROTTA
