Prerequisites
Basics of probability and statistics (L3, i.e. third-year undergraduate, level in mathematics, or GE level)
Course objective
An introduction to the models and mathematical tools used to formalize the problem of learning and decision-making under uncertainty. In particular, we will focus on the frameworks of reinforcement learning and multi-armed bandits.
Session organization
- 8 theory lectures of 2 hours each
- 3 tutorial sessions (travaux dirigés) of 3 hours each
Assessment
A project: reading papers of interest, implementation, or theoretical analysis of reinforcement learning algorithms. The project will be evaluated on the basis of a short report and an oral presentation.
References
- Sutton, R. and Barto, A. Reinforcement Learning: An Introduction.
- Sigaud, O. and Buffet, O. (eds.). Processus décisionnels de Markov en intelligence artificielle. 2008.
- Bertsekas, D. and Tsitsiklis, J. Neuro-Dynamic Programming. 1996.
- Szepesvári, Cs. Algorithms for Reinforcement Learning. 2009.
Topics covered
- Historical and multidisciplinary foundations of reinforcement learning
- Markov decision processes and dynamic programming
- Stochastic approximation and Monte-Carlo methods
- Function approximation and statistical learning theory
- Approximate dynamic programming
- Introduction to stochastic and adversarial multi-armed bandits
- Learning rates and finite-sample analysis
Instructors