Online few-shot learning on small-world graphical hardware

Motivation
While recent machine learning systems outperform humans at various challenging tasks in vision, speech recognition, and games, they require immense amounts of data to train. The data efficiency of humans, who are few-shot learners, is often attributed to neural network architectures in the brain that are encoded in our DNA. Rather than learning everything from scratch while interacting with the environment, which would be highly costly, the brain takes advantage of neural architectures developed over the course of evolution. Such architectures improve data encoding, data flow, and representation, so during its lifetime the brain primarily fine-tunes network parameters on top of a pre-optimized network topology. This project takes inspiration from this mechanism to design an in-memory computing neuromorphic learning engine that performs a similar form of few-shot learning.

Project Description

Background
Recently, we proposed a novel neuromorphic hardware architecture, the Mosaic, which is inspired by the local connectivity structure of the cortex [1]. The Mosaic is a spiking recurrent neural network accelerator built on a scalable emerging memory technology, namely Resistive Random Access Memory (RRAM). Its neural network architecture has a specific connection pattern featuring the small-world property, which makes it highly area- and energy-efficient and brings additional computational benefits. Since the Mosaic has segregated neural cores (tiles) with dedicated learning circuitry, it is capable of local few-shot learning on top of a topology learned offline. In this project, we will train a spiking recurrent neural network using the surrogate gradient method to obtain an architecture that (i) features the small-world property and (ii) has a representation that allows few-shot learning on speech recognition tasks.
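
As a rough illustration of what "small-world" means for a connectivity graph, the sketch below builds a generic Watts-Strogatz graph and measures the two quantities usually used to characterise it: the clustering coefficient (high, as in a local lattice) and the mean shortest-path length (low, as in a random graph). This is a textbook construction chosen for illustration, not the Mosaic's tile layout from [1]; the function names and the parameters n, k, and p are assumptions made for this example.

```python
import random
from collections import deque


def watts_strogatz(n, k, p, seed=0):
    """Ring of n nodes, each linked to its k nearest neighbours (k even);
    every lattice edge is rewired to a random target with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for i in range(n):
        for j in range(1, k // 2 + 1):
            old = (i + j) % n
            if rng.random() < p and old in adj[i]:
                # Candidate targets: no self-loops, no duplicate edges.
                candidates = [c for c in range(n) if c != i and c not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].remove(old)
                    adj[old].remove(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def clustering(adj):
    """Average fraction of a node's neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue
        nb = list(nbrs)
        links = sum(1 for a in range(d) for b in range(a + 1, d) if nb[b] in adj[nb[a]])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def mean_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

Comparing `watts_strogatz(100, 6, 0.0)` with `watts_strogatz(100, 6, 0.1)` shows the typical effect: a small fraction of long-range shortcuts shortens the mean path length considerably while most of the local clustering is retained.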

Approach
Meta-learning has recently emerged as an approach for learning from small amounts of data by training a model on a variety of learning tasks [2, 3, 4]. Specifically, Model-Agnostic Meta-Learning (MAML) [2] uses bi-level optimization to learn a good parameter initialization on a family of tasks, such that after a few parameter updates the network performs well on a specific task sampled from the task distribution. Similarly, Reptile [3] is a simplified, first-order variant that resembles joint training: it learns an initialization for the parameters of a neural network model such that optimizing them at test time is fast. In this project, we will meta-learn the topology of a small-world graph, using MAML or Reptile, on a family of speech tasks. Based on this pre-learned topology, we will then perform few-shot online learning on unseen speech data.
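
To make the Reptile update concrete, here is a minimal sketch on a toy task family. It is not a spiking network and does not meta-learn a topology: each task is assumed to be a quadratic loss 0.5 * ||theta - center||^2 whose optimum `center` is drawn around a shared mean. The function names and all hyperparameter values are assumptions made for this example. The outer loop runs a few SGD steps on a sampled task and then moves the initialization a small step towards the adapted parameters.

```python
import random


def loss(theta, center):
    """Toy task loss: 0.5 * squared distance to the task optimum."""
    return 0.5 * sum((t - c) ** 2 for t, c in zip(theta, center))


def inner_sgd(theta, center, steps, lr):
    """Task-specific adaptation: a few plain SGD steps starting from theta."""
    phi = list(theta)
    for _ in range(steps):
        grad = [p - c for p, c in zip(phi, center)]  # gradient of the toy loss
        phi = [p - lr * g for p, g in zip(phi, grad)]
    return phi


def sample_task(rng, mean, spread):
    """Draw a task optimum from a Gaussian around the family mean."""
    return [m + rng.gauss(0.0, spread) for m in mean]


def reptile(mean, spread=1.0, meta_iters=2000, inner_steps=5,
            inner_lr=0.1, meta_lr=0.05, seed=0):
    """Reptile outer loop: theta <- theta + meta_lr * (phi - theta)."""
    rng = random.Random(seed)
    theta = [0.0] * len(mean)
    for _ in range(meta_iters):
        center = sample_task(rng, mean, spread)
        phi = inner_sgd(theta, center, inner_steps, inner_lr)
        theta = [t + meta_lr * (p - t) for t, p in zip(theta, phi)]
    return theta
```

In this toy setting the learned initialization drifts towards the centre of the task family, so a handful of `inner_sgd` steps on a new task from the same family starts much closer to its optimum than an arbitrary initialization would. MAML differs in that its outer update differentiates through the inner steps, which this first-order sketch deliberately avoids.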

References
[1] Thomas Dalgaty*, Filippo Moro*, Yigit Demirag*, Alessio De Pra, Giacomo Indiveri, Elisa Vianello, and Melika Payvand. The neuromorphic Mosaic: re-configurable in-memory small-world graphs. 2021.
[2] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. arXiv, 2017.
[3] Alex Nichol, Joshua Achiam, and John Schulman. On First-Order Meta-Learning Algorithms. arXiv, 2018.
[4] Kenneth Stewart and Emre Neftci. Meta-learning Spiking Neural Networks with Surrogate Gradient Descent. arXiv, 2022.

Requirements

• Strong programming skills in Python.
• In-depth neural network training, hyperparameter optimization and debugging skills.
• Experience with PyTorch or JAX machine learning frameworks.
• Interest in few-shot learning methods (Reptile, MAML) and general knowledge of memristors are preferred.

Tasks involved in this project
• Study few-shot learning algorithms (Reptile and MAML)
• Develop data loader functions optimized for few-shot learning on common spike-based speech recognition benchmarks, e.g., Spiking Heidelberg Digits and the Google Speech Commands Dataset
• Work on MAML-based training of the small-world spiking neural network implementation developed in our group
• Investigate, in software, the impact of analog non-idealities (limited bit precision, noise, and device non-linearities)
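
For the data loader task in the list above, few-shot training is commonly organised into N-way K-shot episodes: each episode picks N classes, K labelled support examples per class for adaptation, and a few query examples per class for evaluation. The sketch below shows one possible sampler over a plain list of integer labels. It is independent of any particular dataset format; the function names and arguments are hypothetical, not part of an existing code base.

```python
import random
from collections import defaultdict


def build_class_index(labels):
    """Map each class label to the list of sample indices carrying it."""
    index = defaultdict(list)
    for i, y in enumerate(labels):
        index[y].append(i)
    return dict(index)


def sample_episode(class_index, n_way, k_shot, n_query, rng):
    """Return (support, query) lists of (sample_index, episode_label) pairs.

    Episode labels run from 0 to n_way - 1 and are reassigned per episode,
    so the model cannot memorise a fixed class-to-output mapping.
    """
    needed = k_shot + n_query
    eligible = [c for c, idx in class_index.items() if len(idx) >= needed]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        picked = rng.sample(class_index[c], needed)  # sampled without replacement
        support += [(i, episode_label) for i in picked[:k_shot]]
        query += [(i, episode_label) for i in picked[k_shot:]]
    return support, query
```

A call such as `sample_episode(build_class_index(labels), n_way=5, k_shot=1, n_query=3, rng=random.Random(0))` yields index pairs that can then be used to fetch the corresponding spike trains from whichever dataset object is in use.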

Contact

This project will target a high-quality journal publication. If you are interested in this project for a Master's thesis, please contact Melika Payvand and Yigit Demirag at {melika, yigit}@ini.uzh.ch.

© 2022 Institut für Neuroinformatik