Modelling behaviour and brain activity using Multi-agent Reinforcement Learning


Background:

Recent advancements in reinforcement learning (RL) have underscored the potential of framing RL and control problems as probabilistic inference tasks. Levine (2018) provides a comprehensive tutorial and review on this perspective, elucidating how RL can be interpreted through the lens of probabilistic inference. Building upon this foundation, Ghugare et al. (2022) propose a unified objective that jointly optimizes representations, latent-space models, and policies, simplifying the model-based RL paradigm. In parallel, neuroscientific studies have shed light on the neural circuits underlying action suppression and avoidance behaviors. Cruz et al. (2022) reveal opponent parallel control mechanisms via striatal circuits during action suppression, while Ehret et al. (2024) investigate population-level coding of avoidance learning in the medial prefrontal cortex. Integrating these insights from machine learning and neuroscience, we aim to build a biologically plausible RL model that captures the diversity of neural dynamics observed across different tasks.
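
As a brief orientation (our own summary, not a quote from the papers), the control-as-inference framing of Levine (2018) attaches a binary optimality variable O_t to each time step, with p(O_t = 1 | s_t, a_t) ∝ exp(r(s_t, a_t)); inferring trajectories conditioned on optimality then recovers a maximum-entropy RL objective of the form

    J(π) = E_{τ ∼ π} [ Σ_t r(s_t, a_t) + α H(π(· | s_t)) ],

where H is the policy entropy and α a temperature (α = 1 in the exact inference formulation). Ghugare et al. (2022) build on this view by lower-bounding such an objective jointly over an encoder, a latent-space model, and the policy.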

 

Master Thesis / Semester Project:

This project aims to develop a biologically realistic multi-agent reinforcement learning (MARL) framework that models neural dynamics observed during avoidance learning tasks. By extending the probabilistic inference approach to RL and incorporating findings from recent neuroscientific studies, we seek to create models that not only perform effectively but also mirror the underlying neural processes.

In this project you will be able to focus on one (or several) of the following problems:

  1. Framework Development: Utilize the probabilistic inference perspective on RL as detailed by Levine (2018) and Ghugare et al. (2022) to construct a MARL framework that can accommodate the complexities of avoidance learning behaviors in a neurally compatible and interpretable architecture.
  2. Model Alignment: Align the MARL framework with neuroscientific data by integrating mechanisms that reflect the opponent parallel control via striatal circuits during action suppression shown in Cruz, Guiomar et al. (2022); a toy sketch of this idea follows this list.
  3. Validation and Data Analysis: Validate the MARL framework against empirical data on population-level coding of avoidance learning in cortex and striatum, following Ehret et al. (2024).
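
For concreteness, here is a minimal toy sketch (our own illustration, not code from the cited papers) of the kind of opponent control targeted by problems 1 and 2: two tabular learners, a "Go" and a "NoGo" channel, share control of a one-state avoidance task through a softmax over their value difference. All names, rewards, and parameters are illustrative assumptions.

    # Toy sketch: opponent "Go"/"NoGo" learners sharing control of an
    # avoidance task, loosely inspired by direct/indirect striatal pathways.
    # Everything below (rewards, rates, contingencies) is an illustrative assumption.
    import numpy as np

    rng = np.random.default_rng(0)

    n_trials = 2000
    alpha = 0.1      # learning rate
    beta = 3.0       # inverse temperature (soft, entropy-like policy)
    p_shock = 0.8    # probability of an aversive outcome if the agent fails to avoid

    q_go, q_nogo = 0.0, 0.0   # opponent value estimates for "act" vs. "withhold"

    for t in range(n_trials):
        # Softmax over the opponent value difference: the Go and NoGo channels
        # push the action probability in opposite directions.
        p_act = 1.0 / (1.0 + np.exp(-beta * (q_go - q_nogo)))
        act = rng.random() < p_act

        # Toy avoidance contingency: acting avoids the shock at a small cost,
        # withholding risks a shock with probability p_shock.
        if act:
            r = -0.1
        else:
            r = -1.0 if rng.random() < p_shock else 0.0

        # Each opponent learner updates only when "its" action was taken,
        # mirroring parallel (rather than shared) control of the two channels.
        if act:
            q_go += alpha * (r - q_go)
        else:
            q_nogo += alpha * (r - q_nogo)

    p_act = 1.0 / (1.0 + np.exp(-beta * (q_go - q_nogo)))
    print(f"P(act) after learning: {p_act:.2f}  (q_go={q_go:.2f}, q_nogo={q_nogo:.2f})")

A full implementation would replace the tabular values with function approximators (e.g., PyTorch modules) and embed the opponent channels in the latent-variable, inference-based objective sketched in the Background section.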

 

Your benefits:

Engaging in this project positions you at the confluence of reinforcement learning, neuroscience, and model optimisation, offering the opportunity to contribute to the development of biologically informed AI models. You will gain experience in probabilistic modeling, multi-agent systems, and the integration of computational models with neuroscientific data, potentially culminating in a high-impact publication.

 

Related works / Preliminary readings:

  - Levine, S. (2018). Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. arXiv preprint arXiv:1805.00909.
  - Ghugare, R., Bharadhwaj, H., Eysenbach, B., Levine, S., & Salakhutdinov, R. (2022). Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective. arXiv preprint arXiv:2209.08466.
  - Cruz, B. F., Guiomar, G., Soares, S., Motiwala, A., Machens, C. K., & Paton, J. J. (2022). Action suppression reveals opponent parallel control via striatal circuits. Nature, 607(7919), 521–526.
  - Ehret, B., Boehringer, R., Amadei, E. A., Cervera, M. R., Henning, C., Galgali, A. R., Mante, V., & Grewe, B. F. (2024). Population-level coding of avoidance learning in medial prefrontal cortex. Nature Neuroscience, 27(9), 1805–1815.

 

Your profile:

We seek a motivated student with a background in machine learning, neuroscience, or a related field. Proficiency in Python and PyTorch, as well as some experience with reinforcement learning, is preferred.

 

Supervisors:

  - Dr. Gonçalo Guiomar (ETH, UZH): goncalo (at) ini.uzh.ch
  - Philipp Eugster (UZH)
  - Prof. Valerio Mante (ETH, UZH)
  - Prof. Benjamin Grewe (ETH, UZH)

 

Starting date + Duration:

This project is currently available as a semester project or Master's thesis.