The boundary between trainable and untrainable neural network hyperparameter configurations is fractal! And beautiful! Here is a grid search over a different pair of hyperparameters -- this time learning rate and the mean of the parameter initialization distribution.

Jascha Sohl-Dickstein

30,571 subscribers

250,667 views • 2 years ago • via X (Twitter)

10 Comments

Jascha Sohl-Dickstein • 2 years ago

Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Bluish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.
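A minimal sketch of what such a dense grid search could look like, using the hyperparameter pair named in the video description (learning rate and initialization mean). Everything here is an illustrative assumption rather than the author's setup: the two-parameter model `f(x) = w2 * tanh(w1 * x)`, the single training example, the fixed offsets standing in for initialization noise, and the names `stays_bounded`, `grid_search`, and `render` are all hypothetical.

```python
import math


def stays_bounded(lr, init_mean, steps=200, blowup=1e6):
    """Train f(x) = w2 * tanh(w1 * x) with full-batch gradient descent.

    Returns True if the loss stays finite and below `blowup` at every step,
    False as soon as it blows up. Toy model, not the author's network.
    """
    x, y = 1.0, 1.0  # single hypothetical training example
    # Fixed offsets stand in for initialization noise, so only the mean varies.
    w1, w2 = init_mean + 0.5, init_mean - 0.5
    for _ in range(steps):
        t = math.tanh(w1 * x)
        r = w2 * t - y               # residual
        loss = 0.5 * r * r
        if not math.isfinite(loss) or loss > blowup:
            return False
        g1 = r * w2 * (1.0 - t * t) * x  # d(loss)/d(w1)
        g2 = r * t                       # d(loss)/d(w2)
        w1 -= lr * g1
        w2 -= lr * g2
    return True


def grid_search(lrs, means):
    """One row per initialization mean, one column per learning rate."""
    return [[stays_bounded(lr, m) for lr in lrs] for m in means]


def render(grid):
    """'.' marks runs that stayed bounded, '#' marks runs that diverged."""
    return "\n".join("".join("." if ok else "#" for ok in row) for row in grid)


if __name__ == "__main__":
    n = 60
    lrs = [10 ** (-2 + 4 * i / (n - 1)) for i in range(n)]  # 1e-2 to 1e2, log spaced
    means = [-3 + 6 * j / (n - 1) for j in range(n)]        # -3 to 3, linear
    print(render(grid_search(lrs, means)))
```

Raising `n` makes the grid denser. Whether a model this small shows any fine structure at the boundary is not something the sketch guarantees; it only shows the mechanics of labeling each grid cell as converged or diverged.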

Jascha Sohl-Dickstein • 2 years ago

There are similarities between the way in which many fractals are generated, and the way in which we train neural networks. Both involve repeatedly applying a function to its own output. In both cases, that function has hyperparameters that control its behavior.

Jascha Sohl-Dickstein • 2 years ago

In both cases the function iteration can produce outputs that either diverge to infinity or remain happily bounded depending on those hyperparameters. Fractals are often defined by the boundary between hyperparameters where function iteration diverges or remains bounded.
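The parallel described above can be put side by side in a few lines. This is a sketch under simple assumptions: the Mandelbrot map `z -> z*z + c` stands in for "many fractals", and gradient descent on the one-dimensional loss `theta**2 / 2` stands in for training. The function names and the thresholds `escape` and `blowup` are hypothetical choices.

```python
def mandelbrot_bounded(c, steps=100, escape=2.0):
    """Iterate z -> z*z + c from z = 0; the 'hyperparameter' is c."""
    z = 0j
    for _ in range(steps):
        z = z * z + c
        if abs(z) > escape:
            return False  # iteration diverges
    return True           # iteration stayed bounded for `steps` steps


def gradient_descent_bounded(lr, steps=100, theta=1.0, blowup=1e6):
    """Iterate theta -> theta - lr * grad; the hyperparameter is lr.

    The loss is theta**2 / 2, so the gradient is simply theta.
    """
    for _ in range(steps):
        theta = theta - lr * theta
        if abs(theta) > blowup:
            return False  # training diverges
    return True           # parameters stayed bounded


if __name__ == "__main__":
    print(mandelbrot_bounded(-1.0), mandelbrot_bounded(1.0))
    print(gradient_descent_bounded(0.5), gradient_descent_bounded(2.5))
```

In this one-parameter quadratic case each update multiplies `theta` by `1 - lr`, so the boundary between bounded and divergent learning rates is a single point at `lr = 2` and nothing fractal appears. The sketch shows only the shared structure (a function applied repeatedly to its own output, with a parameter deciding whether the iterates blow up), not the fractal boundaries seen in the video.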

Jascha Sohl-Dickstein • 2 years ago

So it shouldn't (post-hoc) be a surprise that hyperparameter landscapes are fractal. This is a general phenomenon: in these panes we see fractal hyperparameter landscapes for every neural network configuration I tried, including deep linear networks.
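One way to see why a deep linear network is not exempt: the function it computes is linear in the input, but the gradient descent update is a nonlinear (here cubic) polynomial map in the weights. The sketch below assumes the smallest possible case, a two-layer scalar network `f(x) = w2 * w1 * x` with squared error on one example. The names `step` and `train` and all constants are hypothetical, not taken from the author's experiments.

```python
def step(w1, w2, lr, x=1.0, y=1.0):
    """One gradient descent step on 0.5 * (w2 * w1 * x - y) ** 2.

    Each new weight is a cubic polynomial in (w1, w2): a polynomial map
    applied repeatedly, much like the maps used to generate classic fractals.
    """
    r = w2 * w1 * x - y  # residual
    return w1 - lr * r * w2 * x, w2 - lr * r * w1 * x


def train(w1, w2, lr, steps=500, blowup=1e6):
    """Iterate `step`; the flag is False if either weight blows up."""
    for _ in range(steps):
        w1, w2 = step(w1, w2, lr)
        # Written so that NaN also counts as a blow-up.
        if not (abs(w1) < blowup and abs(w2) < blowup):
            return w1, w2, False
    return w1, w2, True


if __name__ == "__main__":
    print(train(0.5, 0.5, lr=0.1))  # small learning rate
    print(train(2.0, 2.0, lr=1.0))  # large learning rate
```

With the small learning rate the product `w1 * w2` settles at the target value 1; with the large one the weights grow without bound within a few steps.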

Jascha Sohl-Dickstein • 2 years ago

The best performing hyperparameters are typically at the edge of stability -- so when you optimize neural network hyperparameters, you are contending with hyperparameter landscapes that look like this.
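A textbook case hints at why good hyperparameters tend to sit near the stability boundary, though it is not necessarily the mechanism the author has in mind. For gradient descent on a quadratic loss with curvatures 1 and 100, training diverges once the learning rate exceeds 2 / 100, yet the low-curvature direction is learned fastest with the largest rate that is still stable. The sketch below scans learning rates and reports the best one relative to that limit. The curvatures, step budget, and function names are illustrative assumptions.

```python
def final_loss(lr, curvatures=(1.0, 100.0), steps=100):
    """Loss after `steps` of gradient descent on sum(0.5 * a * theta_a ** 2)."""
    loss = 0.0
    for a in curvatures:
        theta = 1.0
        for _ in range(steps):
            theta -= lr * a * theta  # gradient of 0.5 * a * theta**2 is a * theta
        loss += 0.5 * a * theta * theta
    return loss


def best_learning_rate(curvatures=(1.0, 100.0), n=200):
    """Scan learning rates from just above 0 to almost twice the stability limit."""
    limit = 2.0 / max(curvatures)  # gradient descent diverges above this rate
    candidates = [limit * k / n for k in range(1, 2 * n)]
    best = min(candidates, key=lambda lr: final_loss(lr, curvatures))
    return best, limit


if __name__ == "__main__":
    best, limit = best_learning_rate()
    print(f"best lr = {best:.5f}, stability limit = {limit:.5f}")
```

In this setup the best learning rate lands within a few percent of the limit, so a small change in the wrong direction moves training from its best configuration to divergence.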

Jascha Sohl-Dickstein • 2 years ago

Want to learn more? Blog post: 3-page paper:

Jascha Sohl-Dickstein • 2 years ago

I don't have a SoundCloud, but I did join Anthropic last week, and so far it has exceeded my (high) expectations. I would strongly recommend working there (and using Claude). *this project not done at Anthropic -- this was recreational machine learning on my own time.

Kosta Derpanis • 2 years ago

Just in time to make the cut for my lecture today. At 45 sec mark. Thanks for sharing!

Mihoda • 2 years ago

I'm not sure what I'm looking at, but my guess at interpretation would be instability.

Kenneth Shinozuka • 2 years ago

beautiful result
