Master Thesis or Semester Project: Decoding speech from intracranial recordings in the human brain


Aphasia, the loss of the ability to communicate through speech after brain injury, is a profoundly debilitating condition. Because the adult brain has only a limited capacity to reorganize after injury, affected individuals often face long-term communication challenges. Recent advances in neurotechnology, however, offer a promising avenue for restoring communication in these patients. In particular, the ability to record brain activity with surgically implanted electrodes – a procedure already performed clinically during epilepsy evaluation – opens the door to decoding speech directly from neural signals. This project focuses on developing and evaluating machine learning algorithms to achieve this goal, paving the way toward future brain-computer interfaces (BCIs) for communication restoration.


Project description

This project aims to decode speech from human brain activity recorded with intracranial electrodes. We leverage a unique clinical opportunity: the invasive evaluation of patients considered for epilepsy surgery, during which electrodes are routinely implanted to localize seizure origins. By analyzing neural recordings obtained while patients listen to audio stories and engage in natural conversations, we will develop algorithms that decode speech information at multiple levels of processing – from basic speech units such as phonemes and syllables, through words, to entire sentences. The goal is to build robust and accurate decoding models that translate neural activity into intelligible speech.


Methodology

In this project, you will:

• Process and analyze large-scale intracranial electrophysiological data.

• Implement and evaluate machine learning models, with a focus on deep learning architectures such as Transformer networks, using libraries like PyTorch.

• Explore different feature extraction techniques to optimize decoding performance at various levels of speech processing.

• Compare and contrast the effectiveness of different decoding strategies and model architectures.

• Validate the decoding models using established metrics and potentially through perceptual evaluation of synthesized speech.
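As a toy illustration of one established metric for the validation step above, phoneme error rate can be computed as a length-normalized edit (Levenshtein) distance between a reference and a decoded phoneme sequence. The phoneme labels below are purely illustrative:

```python
def phoneme_error_rate(reference, hypothesis):
    """Edit distance between phoneme sequences, normalized by reference length."""
    m, n = len(reference), len(hypothesis)
    # d[i][j] = edit distance between reference[:i] and hypothesis[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # deletions
    for j in range(n + 1):
        d[0][j] = j  # insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n] / max(m, 1)

ref = ["HH", "AH", "L", "OW"]  # "hello", illustrative ARPAbet-style labels
hyp = ["HH", "AH", "L"]        # decoder output with one deletion
print(phoneme_error_rate(ref, hyp))  # 0.25
```

Analogous word- and sentence-level error rates follow the same scheme with larger units.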

The project will utilize existing, pre-collected datasets of paired neural and voice recordings, allowing you to focus on algorithm development and evaluation.
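As a starting-point sketch of such a decoding model (not a prescribed architecture), a Transformer encoder in PyTorch could map a sequence of per-electrode neural features to per-time-step phoneme logits. All dimensions here are hypothetical placeholders (64 channels, 40 phoneme classes, 100 time bins):

```python
import torch
import torch.nn as nn

class NeuralSpeechDecoder(nn.Module):
    """Sketch: map intracranial feature sequences to phoneme logits."""

    def __init__(self, n_channels=64, d_model=128, n_phonemes=40,
                 n_heads=4, n_layers=2):
        super().__init__()
        # Project electrode features (e.g. high-gamma power) into model space
        self.input_proj = nn.Linear(n_channels, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_phonemes)

    def forward(self, x):
        # x: (batch, time, channels) neural features
        h = self.encoder(self.input_proj(x))
        return self.classifier(h)  # (batch, time, n_phonemes) logits

model = NeuralSpeechDecoder()
x = torch.randn(2, 100, 64)  # 2 trials, 100 time bins, 64 electrodes
logits = model(x)
print(logits.shape)  # torch.Size([2, 100, 40])
```

In practice such a model could be trained with a sequence loss (e.g. CTC or cross-entropy against aligned labels); choosing and comparing these strategies is part of the project.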


Requirements

- Proficiency in Python is essential.

- A solid foundation in machine learning and data analysis is required.


Contact

- Prof. Timothée Proix: proix@ini.ethz.ch


Starting date and duration

This project is currently available as a semester project or a master's thesis. The start date is flexible.


Related literature

- Makin et al. (2020). Machine translation of cortical activity to text with an encoder–decoder framework. Nature Neuroscience, 23(4). DOI: 10.1038/s41593-020-0608-8

- Tang et al. (2023). Semantic reconstruction of continuous language from non-invasive brain recordings. Nature Neuroscience, 26, 858–866. DOI: 10.1038/s41593-023-01304-9

- Défossez et al. (2023). Decoding speech perception from non-invasive brain recordings. Nature Machine Intelligence, 5(10). DOI: 10.1038/s42256-023-00714-5