Summary by devin132 5 years ago

The paper "Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask" by Zhou et al., 2019 found that just by learning binary masks, one can find subnetworks of a randomly weighted network that do much better than chance on a task. This new paper builds on that method by proposing a stronger algorithm than Zhou et al.'s for finding these high-performing subnetworks.

The intuition, as the authors put it: "If a neural network with random weights (center) is sufficiently overparameterized, it will contain a subnetwork (right) that performs as well as a trained neural network (left) with the same number of parameters."

While Zhou et al. learned a probability for each weight, this paper learns a score for each weight and keeps only the top k% of weights by score on each forward pass. The scores are learned through their primary contribution, which they call the edge-popup algorithm:

"In the edge-popup Algorithm, we associate a score with each edge. On the forward pass we choose the top edges by score. On the backward pass we update the scores of all the edges with the straight-through estimator, allowing helpful edges that are “dead” to re-enter the subnetwork. We never update the value of any weight in the network, only the score associated with each weight."

With edge-popup, they are able to find higher-performing random subnetworks than Zhou et al. did.
