
Jascha Sohl-Dickstein
@jaschasd • 29,896 subscribers
Member of the technical staff @ Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamics.
Videos

Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Bluish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.
Jascha Sohl-Dickstein • 1,767,493 views • 2 years ago

The boundary between trainable and untrainable neural network hyperparameter configurations is *fractal*! And beautiful! Here is a grid search over a different pair of hyperparameters -- this time learning rate and the mean of the parameter initialization distribution.
Jascha Sohl-Dickstein • 250,458 views • 2 years ago
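
For readers who want to try a small version of the grid searches described above, here is a minimal sketch in Python with NumPy. It is not the setup used in the videos: the tiny tanh network, the random data, the step count, the divergence threshold, and the names `final_loss` and `grid_search` are all assumptions chosen for illustration. Each grid cell trains the network once with a given learning rate and initialization mean, then records whether the loss stayed finite.

```python
import numpy as np

def final_loss(lr, init_mean, steps=200, seed=0):
    """Full-batch gradient descent on a tiny one-hidden-layer tanh network.

    Returns the last computed loss, or np.inf if training blew up.
    All sizes and thresholds here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((16, 4))   # toy inputs
    y = rng.standard_normal((16, 1))   # toy targets
    # Gaussian initialization shifted by init_mean (the second hyperparameter).
    w1 = init_mean + rng.standard_normal((4, 8)) / np.sqrt(4)
    w2 = init_mean + rng.standard_normal((8, 1)) / np.sqrt(8)
    loss = np.inf
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(steps):
            h = np.tanh(x @ w1)
            err = h @ w2 - y
            loss = float(np.mean(err ** 2))
            # Treat a non-finite or very large loss as divergence.
            if not np.isfinite(loss) or loss > 1e6:
                return np.inf
            # Manual backpropagation for the mean squared error.
            g_out = 2.0 * err / len(x)
            g_w2 = h.T @ g_out
            g_w1 = x.T @ ((g_out @ w2.T) * (1.0 - h ** 2))
            w1 -= lr * g_w1
            w2 -= lr * g_w2
    return loss

def grid_search(lrs, init_means):
    """Boolean grid: True where training stayed finite, False where it diverged."""
    return np.array([[np.isfinite(final_loss(lr, m)) for lr in lrs]
                     for m in init_means])

grid = grid_search(np.logspace(-3, 2, 6), np.linspace(-1.0, 1.0, 5))
```

Rendering `grid` as an image with two colors, and then increasing the grid resolution around the boundary between the two regions, is one way to explore the kind of picture the videos show.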