Dealing with Device Mismatch in Deep Networks from an Algorithmic Perspective
This Master’s Thesis project is centered on machine learning, with notions of control theory and electronics. Our goal is to compare how different learning algorithms perform when they do not have full knowledge of the parameters and nonlinearities of the network they train.
Deep neural networks are the workhorse of modern machine learning, but their performance comes at extremely high energetic cost, with the associated environmental problems and limitations for embedded applications. A natural solution is to run deep networks on specialized circuits that perform all the operations at the physical level. However, this approach faces a fundamental challenge: analog and even spiking devices suffer from device mismatch, meaning that imperfections and variability in the fabrication process lead to differences between the elements of the designed circuit and its real implementation.
Since device mismatch has multiple sources, and in efficient devices it ultimately comes down to fundamental physical limitations, we ask whether the issue can be solved at the algorithmic level.
The starting point of this thesis is to take well-known machine learning algorithms and apply them to train neural networks in which the neurons' activation functions and the weights contain unknown perturbations. For this, we will model different types of mismatch in both neurons and weights, and apply learning algorithms such as backpropagation, equilibrium propagation, or feedback alignment without giving the algorithm access to the mismatch.
In previous work we developed an algorithm based on ideas from control theory that does not assume a model of the underlying circuit: instead, a controller nudges the neural activities of the circuit towards their desired state while simultaneously adapting its own model of the system. The nudge imposed by the controller can then be used by the circuit to adapt its weights, in a procedure that can be equivalent to Gauss-Newton optimization (Meulemans et al., 2021).
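The following toy sketch conveys the flavour of such control-based learning on a single linear layer: an integral controller nudges the circuit output to the target, and the steady-state control signal, rather than an analytic gradient, drives the weight update. This is a loose illustration under simplifying assumptions, not the Deep Feedback Control algorithm of Meulemans et al.

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.normal(0, 0.5, (1, 3))                  # nominal weights
mismatch = 1 + 0.2 * rng.normal(size=W.shape)   # unknown to the learner

def circuit(x, u):
    """Physical circuit output, nudged by the control signal u."""
    return (W * mismatch) @ x + u

errors = []
for _ in range(500):
    x = rng.uniform(-1, 1, (3, 1))
    y = np.array([[np.sum(x)]])                 # target: sum of the inputs

    # Integral controller: accumulate the control signal until the
    # nudged output settles near the target.
    u = np.zeros((1, 1))
    for _ in range(50):
        u += 0.2 * (y - circuit(x, u))

    # Learning rule: the steady-state control signal plays the role of
    # the error, so the weights move to need less nudging next time.
    W += 0.05 * u @ x.T

    # Track the un-nudged error of the circuit.
    errors.append(np.abs(y - circuit(x, np.zeros((1, 1)))).item())

print(f"mean |error|, first 50 trials: {np.mean(errors[:50]):.3f}")
print(f"mean |error|, last 50 trials:  {np.mean(errors[-50:]):.3f}")
```

Note that the controller only measures the circuit's output, so the weight update works without any knowledge of the mismatch.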
The main tasks of the project are:
• Familiarize yourself with the main algorithms used for training deep networks.
• Create a computer model of a deep neural network with virtual imperfections.
• Test the performance of various learning algorithms on the model.
• In the learning algorithm of Meulemans et al. (2021), study how the controller identifies the model imperfections.
Requirements:
• Interest in deep learning and electronics, and a good grasp of optimization theory or control theory.
• Programming skills (Python preferred).
• Willingness to learn some basics of control theory, robotics, statistical physics and computational neuroscience.
The project will be supervised by Prof. Benjamin Grewe (PI) and Dr. Pau Vilimelis Aceituno (direct supervisor). We will also discuss the work and collect input from Alexander Meulemans, Matilde Tristany Farinha, and Martino Sorbaro.
Meulemans, Alexander, et al. "Credit Assignment in Neural Networks through Deep Feedback Control." arXiv preprint arXiv:2106.07887 (2021).
Contacts: pau(at)ini.uzh.ch and bgrewe(at)ini.uzh.ch