Rahim Entezari: “The role of permutation invariance in linear mode connectivity of neural networks”

Feb 15, 2022 | 17:50–18:20


Rahim Entezari and Hanie Sedghi (senior scientist at Google Brain) will present joint work with Behnam Neyshabur and Olga Saukh at the EPFL Virtual Symposium “Loss Landscape of Neural Networks: theoretical insights and practical implications,” which takes place Feb 15–16, 2022.




In this work, we conjecture that if the permutation invariance of neural networks is taken into account, SGD solutions will likely have no barrier in the linear interpolation between them. Although it is a bold conjecture, we show how extensive empirical attempts fall short of refuting it.


We further provide a preliminary theoretical result to support our conjecture. Our conjecture has implications for the lottery ticket hypothesis, distributed training, and ensemble methods.
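The two notions behind the conjecture, permutation invariance and the barrier along a linear interpolation, can be illustrated with a small sketch. This is not the authors' code: it assumes a one-hidden-layer ReLU network with random weights, a mean-squared-error loss, and one common definition of the barrier (the largest gap between the loss on the interpolation path and the linear interpolation of the endpoint losses). All function names are hypothetical.

```python
import numpy as np

def mlp(params, x):
    """One-hidden-layer ReLU network (an assumed toy architecture)."""
    W1, b1, W2, b2 = params
    h = np.maximum(W1 @ x.T + b1[:, None], 0.0)
    return (W2 @ h + b2[:, None]).T

def permute_hidden(params, perm):
    """Reorder hidden units: rows of W1 and b1, columns of W2."""
    W1, b1, W2, b2 = params
    return (W1[perm], b1[perm], W2[:, perm], b2)

def interpolate(pa, pb, alpha):
    """Point on the straight line between two parameter sets."""
    return tuple((1 - alpha) * a + alpha * b for a, b in zip(pa, pb))

def mse(params, x, y):
    return float(np.mean((mlp(params, x) - y) ** 2))

def barrier(pa, pb, x, y, num=11):
    """Largest gap between the loss on the path and the interpolated endpoint losses."""
    la, lb = mse(pa, x, y), mse(pb, x, y)
    return max(
        mse(interpolate(pa, pb, a), x, y) - ((1 - a) * la + a * lb)
        for a in np.linspace(0.0, 1.0, num)
    )

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 3, 8, 2
params = (
    rng.normal(size=(d_hidden, d_in)),
    rng.normal(size=d_hidden),
    rng.normal(size=(d_out, d_hidden)),
    rng.normal(size=d_out),
)
x = rng.normal(size=(64, d_in))
y = mlp(params, x)  # targets the toy network fits exactly

perm = np.arange(d_hidden)[::-1]           # reverse the order of the hidden units
permuted = permute_hidden(params, perm)    # same function, different point in weight space
realigned = permute_hidden(permuted, np.argsort(perm))  # undo the permutation

same_function = np.allclose(mlp(params, x), mlp(permuted, x))
barrier_naive = barrier(params, permuted, x, y)
barrier_aligned = barrier(params, realigned, x, y)
print(same_function, barrier_naive, barrier_aligned)
```

In this toy setting the permuted network computes the same function, yet interpolating naively between the two weight vectors generally raises the loss; once the hidden units are realigned, the barrier vanishes. The conjecture concerns the much harder case of two independently trained SGD solutions, where a suitable permutation has to be found rather than simply undone.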



About the Symposium:


In practical applications, deep neural networks are typically trained by walking down the loss surface using gradient descent augmented with a bag of tricks. One important practical insight has been that large, overparameterized networks, which have more parameters than necessary, work better; one potential interpretation (but not the only one) is the ‘lottery ticket hypothesis’. The shape of the loss landscape clearly matters when walking down it. In recent years, research on the shape of the loss landscape has addressed questions such as: “Is there one big global minimum or many scattered small ones?”, “Is the loss landscape rough or smooth?”, “Should we worry about saddle points?”, “Are there flat regions in the loss?”, “How many saddle points are there?”.


While these look like questions for theoreticians, their answers might have practical consequences and lead to a better understanding of the role of overparameterization, of pruning, and of why the bag of tricks works. The aim of this workshop is to bring together researchers who have worked on these topics from different points of view and different backgrounds (Computer Science, Math, Physics, Computational Neuroscience), and to build a community around these questions.