The universal approximation theorem states that a neural network with one hidden layer can approximate continuous functions on compact sets with any desired precision.
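A standard way to write the statement down, as a sketch rather than a quotation from the video: assume the activation $\sigma$ is continuous and not a polynomial, $K \subset \mathbb{R}^n$ is compact, and $f : K \to \mathbb{R}$ is continuous. The symbols $N$, $a_i$, $w_i$, $b_i$ are notation introduced here for the number of hidden units, the output weights, the input weights, and the biases. Then for every $\varepsilon > 0$ there is a choice of these parameters such that:

```latex
\[
  \sup_{x \in K} \left| \, f(x) - \sum_{i=1}^{N} a_i \, \sigma\!\left( w_i^{\top} x + b_i \right) \right| < \varepsilon .
\]
```

The theorem only asserts that such an $N$ and such weights exist. The required $N$ depends on $f$ and $\varepsilon$ and can be very large, and nothing in the statement says that training will find those weights.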

218,753 views • 2 years ago • via X (Twitter)

10 Comments

Roy van Rijn • 2 years ago

Or…. it could be a single spline 🙄

ΣTΞCH • 2 years ago

If you want 25 minutes of how to make a universal function approximation, there's a great video by this one guy who gave it a shot with a handful of methods. Really cool to observe the thinking behind it

BrokieRichard • 2 years ago

You spelled Stone-Weierstrass wrong

ऋषिक तिवारी (Rishik Tiwari) • 2 years ago

You are talking about general curve fitting (which all NNs do) but are wrong about any desired precision. The precision is directly proportional to the width of the single layer. Which also means that if you are dealing with periodic functions, then the Nyquist sampling theorem shall apply.

mel 🐰 • 2 years ago

Omg this is the thing they make me draw in math class…

HL®H • 2 years ago

“The Universal Approximation Theorem means that a simple neural network can accurately model any continuous function within a certain set, as long as it has enough neurons. It's like a versatile tool for many tasks.”

A$AP • 2 years ago

Is this a consequence of the fact that continuous functions can be approximated by piecewise functions?
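For the one-dimensional ReLU case, the link to piecewise functions can be made concrete. The sketch below builds a one-hidden-layer ReLU network that reproduces the piecewise-linear interpolant of a function on an interval, with one hidden unit per knot, and shows that the worst-case error shrinks as the layer gets wider. All names (`build_relu_interpolant`, `evaluate`, `max_error`) and the choice of `math.sin` on `[0, pi]` are assumptions made for illustration. The weights are written down by hand rather than trained, and this is an illustration of the idea, not a proof of the general theorem.

```python
import math


def relu(z):
    """Rectified linear unit."""
    return z if z > 0.0 else 0.0


def build_relu_interpolant(f, a, b, width):
    """Hand-pick weights for g(x) = bias + sum_i coeffs[i] * relu(x - knots[i]).

    g is the piecewise-linear interpolant of f on `width` equal sub-intervals
    of [a, b]; each hidden unit switches on at one knot.
    """
    h = (b - a) / width
    knots = [a + i * h for i in range(width)]           # one hidden unit per knot
    ys = [f(a + i * h) for i in range(width + 1)]       # samples of f at the knots
    slopes = [(ys[i + 1] - ys[i]) / h for i in range(width)]
    # Each output weight is the change in slope at its knot.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, width)]
    return ys[0], knots, coeffs


def evaluate(net, x):
    """Forward pass of the one-hidden-layer network."""
    bias, knots, coeffs = net
    return bias + sum(c * relu(x - k) for c, k in zip(coeffs, knots))


def max_error(f, net, a, b, grid_size=2001):
    """Largest absolute error on a uniform evaluation grid."""
    grid = [a + (b - a) * k / (grid_size - 1) for k in range(grid_size)]
    return max(abs(evaluate(net, x) - f(x)) for x in grid)


if __name__ == "__main__":
    for width in (8, 16, 32, 64):
        net = build_relu_interpolant(math.sin, 0.0, math.pi, width)
        print(width, max_error(math.sin, net, 0.0, math.pi))
```

For a twice-differentiable target, the error of a piecewise-linear interpolant falls roughly with the square of the knot spacing, so doubling the width should cut the printed error by about a factor of four.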

Jesse Palmer • 2 years ago

Does that mean multiple hidden layers handle discontinuity?

Matthew Zeits • 2 years ago

How about functions mapping vectors to vectors? Or complex vectors?

; • 2 years ago

How does the architecture of a neural network change when incorporating multiple hidden layers, and how does this relate to the universal approximation theorem?

Related Videos

This video, created by my dear coauthor Mahdi E Kahou for our teaching and papers, shows how overparameterized neural networks produce smooth function approximations even in the context of the Runge phenomenon.

Some background. Imagine you want to approximate the Runge function using polynomial interpolation at equally spaced points. It is well known that, despite targeting an infinitely differentiable function, such a polynomial approximation produces oscillatory behavior that worsens with the degree of the polynomial. In other words, higher-degree polynomial approximations might not improve accuracy.

Instead, approximate the Runge function with a neural network (here, two layers are just to make the example concrete; nothing fundamental depends on it). As you increase the number of parameters well above the 11 training points (in our example, a two-layer neural network with 128 nodes each), you nicely converge to the target, without wild oscillations. Yes, this has much to do with double descent and benign overparameterization, but the main punchline of this post is that neural networks are really very different types of animals than polynomial approximations.

And yes, Chebyshev nodes and splines exist, and in this case, they will prevent the oscillations. But that's not the point. Chebyshev nodes and splines still confront Faber’s theorem, which states that for any system of polynomial interpolation nodes, there exists a continuous function whose sequence of interpolating polynomials diverges as the number of nodes grows to infinity. Faber’s theorem does not apply to neural networks because they are not polynomials.

The notebook, if you want to check the details, is here:

Stay tuned for more on this 👀

Jesús Fernández-Villaverde

46,690 views • 1 month ago
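The polynomial half of the post above is easy to check numerically. The sketch below interpolates the Runge function 1 / (1 + 25 x^2) on [-1, 1] at equally spaced nodes and at Chebyshev nodes, then reports the largest error on a fine grid. It is not the authors' notebook and does not include the neural-network fit. The function names and the grid size are assumptions made for illustration, and the interpolant is evaluated with the plain Lagrange formula for clarity rather than speed.

```python
import math


def runge(x):
    """The Runge function on [-1, 1]."""
    return 1.0 / (1.0 + 25.0 * x * x)


def equispaced_nodes(n):
    """n equally spaced nodes on [-1, 1], endpoints included."""
    return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]


def chebyshev_nodes(n):
    """n Chebyshev nodes (roots of the degree-n Chebyshev polynomial)."""
    return [math.cos((2 * i + 1) * math.pi / (2 * n)) for i in range(n)]


def lagrange_eval(nodes, values, x):
    """Evaluate the interpolating polynomial at x using the Lagrange form."""
    total = 0.0
    for i, xi in enumerate(nodes):
        term = values[i]
        for j, xj in enumerate(nodes):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def max_interp_error(nodes, grid_size=2001):
    """Largest absolute interpolation error on a uniform grid over [-1, 1]."""
    values = [runge(x) for x in nodes]
    grid = [-1.0 + 2.0 * k / (grid_size - 1) for k in range(grid_size)]
    return max(abs(lagrange_eval(nodes, values, x) - runge(x)) for x in grid)


if __name__ == "__main__":
    for n in (11, 21):
        print(n, "equispaced:", max_interp_error(equispaced_nodes(n)))
        print(n, "chebyshev: ", max_interp_error(chebyshev_nodes(n)))
```

With equally spaced nodes the error should grow when going from 11 to 21 nodes, while with Chebyshev nodes it should shrink, matching the post's remark that Chebyshev nodes prevent the oscillations in this particular case.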