Tiled Hacker news on React Router

Puzzling Success of Overparameterization: Lottery Tickets or Escape Dimensions?

42 points - yesterday at 9:50 AM

cherryteastain
today at 2:52 PM
A related viewpoint is that overparametrization is good because the model only gets stranded at a critical point where the Hessian has all positive/zero eigenvalues. If we treat the sign of each Hessian eigenvalue as an independent Bernoulli trial, the chance of all eigenvalues being positive/zero decreases exponentially as the parameter count increases. [1]
[1] https://arxiv.org/abs/1406.2572
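To put rough numbers on the Bernoulli argument above, here is a minimal sketch. It assumes the eigenvalue signs are independent and that each one is positive/zero with the same probability p. Both are simplifications made for illustration, and p = 0.5 is an arbitrary value, not something taken from the linked paper. The function name is hypothetical.

```python
def prob_all_nonnegative(p: float, n: int) -> float:
    """Chance that all n eigenvalues are positive/zero.

    Assumes (for illustration only) that each eigenvalue sign is an
    independent Bernoulli trial with success probability p.
    """
    return p ** n


if __name__ == "__main__":
    # p = 0.5 is an arbitrary illustrative choice.
    for n in (10, 100, 1000):
        print(f"n={n:5d}  P(all positive/zero) = {prob_all_nonnegative(0.5, n):.3e}")
```

Under these assumptions the probability is p^n, so adding parameters shrinks it geometrically. Real Hessian eigenvalues are correlated, so treat this as intuition rather than a quantitative estimate.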
vatsachak
today at 5:25 PM
Isn't this trivial?
What's more interesting is why double descent happens.
Scene_Cast2
today at 1:21 PM
IIRC the original author of the Lottery Ticket Hypothesis now disavows that idea.
One intuitive way of looking at it is like so - let's say that you have a gaussian-looking plot. You want to fit a gaussian. You have a stupid simple model where you can slide your gaussian left and right.
If your initial starting point happens to be roughly within range, great, your optimizer will take care of it for you and slide it into the correct place. If you're too far, too bad, no meaningful gradient.
Instead, neural nets give you the option to spawn a gaussian anywhere you please. In this case, no sliding is necessary, but it comes at a heavy parametrization cost.
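The sliding-Gaussian picture above can be sketched in a few lines. This is a toy under stated assumptions, not anyone's actual setup: a unit-width Gaussian target centered at 0, a mean-squared-error loss on a fixed grid, and plain gradient descent on the single parameter mu. All names, the grid, the learning rate, and the step count are hypothetical choices.

```python
import math


def gaussian(x: float, mu: float) -> float:
    """Unit-width, unit-height Gaussian bump centered at mu."""
    return math.exp(-0.5 * (x - mu) ** 2)


# Fixed grid on [-30, 30] and a target bump centered at 0 (illustrative choices).
XS = [i * 0.1 for i in range(-300, 301)]
TARGET = [gaussian(x, 0.0) for x in XS]


def loss_grad(mu: float) -> float:
    """Gradient of the mean squared error with respect to mu."""
    total = 0.0
    for x, y in zip(XS, TARGET):
        f = gaussian(x, mu)
        # d/dmu (f - y)^2 = 2 * (f - y) * f * (x - mu)
        total += 2.0 * (f - y) * f * (x - mu)
    return total / len(XS)


def fit(mu: float, lr: float = 10.0, steps: int = 500) -> float:
    """Slide the bump with plain gradient descent, starting from mu."""
    for _ in range(steps):
        mu -= lr * loss_grad(mu)
    return mu


if __name__ == "__main__":
    for start in (1.0, 15.0):
        print(f"start={start:5.1f}  grad={loss_grad(start):.3e}  fitted mu={fit(start):.6f}")
```

Starting at mu = 1 the bump overlaps the target and the optimizer slides it toward 0. Starting at mu = 15 the two bumps barely overlap, the gradient is vanishingly small, and mu stays where it started. A model that can place many Gaussians across the axis would have at least one starting in range, which is the parametrization trade-off described above.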