When doing machine learning and AI research (or writing books), making the code reproducible is usually desirable. Often, that's easier said than done! So, I recorded a video that illustrates and addresses 6 sources of randomness that occur when training deep neural networks and LLMs: 1. Model weight initialization...
81,670 views • 2 years ago • via X (Twitter)
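
As a small illustration of the first source named in the post, model weight initialization, here is a minimal sketch using only the Python standard library. It is not the code from the video: the helper `init_weights`, the uniform bound, and the layer sizes are assumptions made for this example. In PyTorch, the analogous step would be calling `torch.manual_seed` before constructing the model.

```python
import random


def init_weights(n_in, n_out, seed):
    """Return an n_out x n_in weight matrix drawn from a seeded generator."""
    # A local generator keeps this function independent of global random state.
    rng = random.Random(seed)
    # Illustrative uniform bound (an assumption, not a specific library default).
    bound = (1.0 / n_in) ** 0.5
    return [[rng.uniform(-bound, bound) for _ in range(n_in)]
            for _ in range(n_out)]


w_a = init_weights(4, 3, seed=123)
w_b = init_weights(4, 3, seed=123)
w_c = init_weights(4, 3, seed=124)

print(w_a == w_b)  # same seed, so the two matrices are identical
print(w_a == w_c)  # different seed, so the matrices differ
```

Fixing the seed removes this one source of run-to-run variation; the other sources covered in the video need their own handling.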
6 Comments

If you prefer, here is a YouTube version that includes chapter marks:

Maybe a real solution is to consider the probability of a model state (weights) instead of the actual numerical values of the state. The optimization process is random by definition, e.g. with methods based on Langevin dynamics. But when the learning rate decays, the final state is the same with very high probability.
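
The point about learning-rate decay can be sketched on a toy problem. This is only an illustration under stated assumptions: it uses plain gradient descent with Gaussian gradient noise on a one-dimensional quadratic loss, not a faithful Langevin sampler, and the function `noisy_gd`, the loss, the noise scale, and the decay schedule are all made up for this example.

```python
import random


def noisy_gd(seed, steps=5000, lr0=0.5, decay=True):
    """Noisy gradient descent on the toy loss 0.5 * (theta - 3)**2."""
    rng = random.Random(seed)
    theta = rng.uniform(-5.0, 5.0)  # seed-dependent starting point
    for t in range(steps):
        # Hypothetical schedule: shrink the step size roughly as 1/t.
        lr = lr0 / (1.0 + lr0 * t) if decay else lr0
        # Exact gradient plus unit-variance noise (an assumed noise model).
        grad = (theta - 3.0) + rng.gauss(0.0, 1.0)
        theta -= lr * grad
    return theta


def spread(values):
    """Gap between the largest and smallest final state across seeds."""
    return max(values) - min(values)


finals_decay = [noisy_gd(seed, decay=True) for seed in range(5)]
finals_const = [noisy_gd(seed, decay=False) for seed in range(5)]

print("decaying lr:", [round(x, 3) for x in finals_decay])
print("constant lr:", [round(x, 3) for x in finals_const])
print("spread with decay:", spread(finals_decay))
print("spread without decay:", spread(finals_const))
```

With the decaying schedule, runs started from different seeds end up close to the same minimum at 3, whereas a constant learning rate leaves the final state jittering from seed to seed. The individual numerical values still differ between seeds, so this is reproducibility in a statistical sense only.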

I like this idea, and it's kind of what Bayesian neural networks do? But for those who prefer certain huge models because of their good predictive performance, e.g. vision and language transformers, this would probably not be feasible, I'd say.

Great Video. Nice explanations. Thanks for that!

Thank you for sharing this video! I'm equally excited to read your new book.

Fun fact: it's basically based on chapter 10 :)
