About me

Γνῶθι σεαυτόν (know thyself)

I recently joined Decathlon to work as a Data Scientist on various Machine Learning topics.

Before that, I was a Data Scientist at Vekia, a start-up applying AI algorithms to challenges arising in Supply Chain. I worked mostly (but not exclusively) on time series forecasting and inventory problems. I was particularly interested in forecast-free approaches to replenishment problems (i.e., predicting the quantity to order or produce directly, without forecasting the demand). I was (and still am) curious about what active learning, and more broadly on-line Reinforcement Learning, could bring to the field through carefully designed exploration strategies, e.g., to handle cold-start issues or to actively infer the demand when shortages occur.

Before that, I was a PhD student at SequeL (now Scool), INRIA Lille – Nord Europe, under the supervision of Alessandro Lazaric and Daniil Ryabko. My research interests lay in designing Reinforcement Learning algorithms with provably good performance. Reinforcement Learning (RL for short) is an area of Machine Learning concerned with sequential decision making in an unknown environment. I was particularly interested in the exploration-exploitation trade-off in on-line RL (from a theoretical perspective). In this context, the performance of an algorithm is usually measured in terms of sample complexity or regret. Current state-of-the-art RL methods relying on Deep Learning are now able to learn how to play Atari games or Go at a super-human level. However, they require a huge amount of training data. In contrast, most humans manage to play relatively well with very limited experience. The lack of sample efficiency of Deep Reinforcement Learning (DRL) algorithms is a clear obstacle to their deployment in real-world applications. I believe that understanding the theoretical properties of RL is essential to overcoming the challenge of sample-efficient RL. This is what motivated my interest in algorithms that can be analysed theoretically and for which we can prove near-optimal performance (in terms of sample complexity or regret). Although such algorithms do not yet scale to high-dimensional tasks (e.g., Atari games or robotics), they can provide good intuitions on how to improve DRL approaches.

In January 2019, I joined Facebook AI Research in Montreal for a 4-month internship with Joelle Pineau. I worked on off-policy methods for policy gradient algorithms like Actor-Critic.

During my PhD, I had the opportunity to collaborate with many other awesome researchers, including Matteo Pirotta, Emma Brunskill, Ronald Ortner, and Mohammad Ghavamzadeh. I attended two summer schools on Machine Learning: MLSS 2017 and the first edition of DS3 (2017).

Before starting my PhD, I completed an MSc at CentraleSupélec (a French engineering school that is part of Paris-Saclay University), with a major in Applied Mathematics and Machine Learning. Many of the lectures I attended were part of the renowned MVA Master organized by ENS Paris-Saclay. In parallel with my MSc and BSc, I worked for 3 years in an R&D team of Airbus Group as an apprentice software engineer. My work focused on developing algorithms in the fields of Signal Processing (Kalman filtering, Interactive Multiple Models, etc.), Multi-Agent Task Assignment (centralized/decentralized protocols, computation of the Pareto frontier, etc.), and Game Theory (2-player extensive-form games with incomplete and imperfect information, computation of Nash equilibria and refinements, etc.). It consisted of: 1) finding the appropriate mathematical framework to model challenging practical problems encountered by the company, 2) looking for algorithms in the literature to solve these problems (adapting them to the specific problems at hand when necessary), and 3) implementing and empirically validating the approach.

For more about me, you can download my resume or visit the rest of this website.

Picture: © Ralitza Soultanova – http://photo-pro.be/