
Jascha Sohl-Dickstein
@jaschasd • 29,896 subscribers
Member of the technical staff @ Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamics.
Videos

Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Bluish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.
Jascha Sohl-Dickstein • 1,767,493 views • 2 years ago
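
For readers who want to try the idea, here is a minimal sketch of a grid search that labels each hyperparameter pair as converging or diverging. It is not the setup used in the video: the toy model `y = w2 * tanh(w1 * x)`, the choice of two per-parameter learning rates as the hyperparameter pair, the divergence threshold, and the names `train_diverges` and `grid_search` are all assumptions made for illustration.

```python
import math


def train_diverges(lr1, lr2, steps=200):
    """Return True if gradient descent blows up for this pair of learning rates.

    Toy model (an assumption, not the network from the video):
    y = w2 * tanh(w1 * x), fit to a single data point with squared error.
    """
    w1, w2 = 0.5, 0.5
    x, target = 1.0, 1.0
    for _ in range(steps):
        h = math.tanh(w1 * x)
        err = w2 * h - target
        loss = 0.5 * err * err
        # Treat a non-finite or very large loss as divergence (arbitrary threshold).
        if not math.isfinite(loss) or loss > 1e6:
            return True
        # Gradients of the loss with respect to w1 and w2.
        g1 = err * w2 * (1.0 - h * h) * x
        g2 = err * h
        w1 -= lr1 * g1
        w2 -= lr2 * g2
    return False


def grid_search(lrs1, lrs2):
    """Evaluate every (lr1, lr2) pair; rows follow lrs2, columns follow lrs1."""
    return [[train_diverges(a, b) for a in lrs1] for b in lrs2]


# Log-spaced learning rates from 0.01 to 100, 17 values per axis.
lrs = [10 ** (e / 4) for e in range(-8, 9)]
grid = grid_search(lrs, lrs)

# Text rendering: '.' where training stayed bounded, '#' where it diverged.
for row in grid:
    print("".join("#" if diverged else "." for diverged in row))
```

A denser grid and a real network would be needed to see anything like the structure in the video; this sketch only shows the mechanics of sweeping two hyperparameters and recording a converge/diverge label per cell.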

The boundary between trainable and untrainable neural network hyperparameter configurations is *fractal*! And beautiful! Here is a grid search over a different pair of hyperparameters -- this time learning rate and the mean of the parameter initialization distribution.
Jascha Sohl-Dickstein • 250,458 views • 2 years ago