I am a third-year PhD student at SequeL, INRIA Lille – Nord Europe, under the supervision of Alessandro Lazaric and Daniil Ryabko. My research focuses on designing algorithms with provably good performance for Reinforcement Learning (RL), a subfield of Machine Learning. I am particularly interested in the exploration-exploitation trade-off in online RL, studied from a theoretical perspective. In this context, the performance of an algorithm is usually measured in terms of sample complexity or regret.
Current state-of-the-art RL methods relying on Deep Learning are now able to learn to play Atari games or Go at a superhuman level. However, they require a huge amount of training data. In contrast, most humans manage to play relatively well with very limited experience. The lack of sample efficiency of Deep Reinforcement Learning (DRL) algorithms is a clear obstacle to their deployment in real-world applications. I believe that understanding the theoretical properties of RL is essential to overcoming the challenge of sample-efficient RL. This is what motivates my interest in algorithms that can be analysed theoretically and for which near-optimal performance (in terms of sample complexity or regret) can be proved. Although such algorithms do not yet scale to high-dimensional tasks (e.g., Atari games or robotics), they can provide good intuitions on how to improve DRL approaches.
During my PhD, I attended two summer schools on Machine Learning: MLSS 2017 and the first edition of DS3 (2017). Before starting my PhD, I completed an MSc at CentraleSupélec (a French engineering school that is part of Paris-Saclay University), with a major in Applied Mathematics and Machine Learning. Many of the lectures I attended were part of the renowned MVA Master's program organized by ENS Paris-Saclay. In parallel with my BSc and MSc, I worked for three years as an apprentice software engineer in an R&D team at Airbus Group. I developed algorithms for Signal Processing (Kalman filtering, Interactive Multiple Models, etc.), Multi-Agent Task Assignment (centralized and decentralized protocols, computation of the Pareto frontier, etc.), and Game Theory (two-player extensive-form games with incomplete and imperfect information, computation of Nash equilibria and their refinements, etc.). My work consisted of: 1) finding an appropriate mathematical framework to model challenging practical problems encountered by the company, 2) searching the literature for algorithms to solve these problems (adapting them to the specific problem at hand when necessary), and 3) implementing and empirically validating the resulting approach.
Picture: © Ralitza Soultanova – http://photo-pro.be/