We often hear that Machine Learning models learn patterns in data. But what does that look like in geometry? Picture dropping an elastic mesh into a cloud of points and letting it adapt. How would it bend, stretch, and settle so it matches the shape hiding in the data? In this scene we watch a self-organizing map (SOM)...a classic unsupervised neural model...learn a 2D dataset arranged like a spiral arm. Over the points we place a square grid of neurons whose weights live in the same plane. At the start it’s just a flat net drifting across the cloud. No clue what the structure is. For the SOM, learning is a repeated game: pick a random data point, find the neuron whose weight is closest, then nudge that neuron and its neighbours toward the point. Repeat, repeat, repeat...while gradually shrinking how wide the neighbourhood influence spreads. The result is satisfying...the grid stops being a grid and turns into a coordinate sheet wrapped onto the spiral. #MachineLearning #ManifoldLearning #UnsupervisedLearning #NeuralMaps #GeometricML #SelfOrganizingMap #Topology
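
The repeated game described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code behind the animation: the spiral generator, the 10x10 grid, the exponential decay schedules, the Gaussian neighbourhood, and every parameter value are assumptions chosen for the example.

```python
import numpy as np

def make_spiral(n_points=600, noise=0.03, seed=0):
    """Sample noisy 2D points along one spiral arm (made-up demo data)."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.5, 3.0 * np.pi, n_points)
    r = t / (3.0 * np.pi)
    pts = np.column_stack((r * np.cos(t), r * np.sin(t)))
    return pts + rng.normal(scale=noise, size=pts.shape)

def init_grid(data, rows, cols):
    """Return lattice coordinates and a flat net spread over the data's bounding box."""
    ii, jj = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    lattice = np.stack((ii, jj), axis=-1).astype(float)
    lo, hi = data.min(axis=0), data.max(axis=0)
    weights = lo + (hi - lo) * lattice / np.array([rows - 1, cols - 1])
    return lattice, weights

def train_som(data, rows=10, cols=10, n_steps=5000,
              lr0=0.5, lr1=0.02, sigma0=4.0, sigma1=0.5, seed=0):
    """Train a square-grid SOM on 2D data; returns weights of shape (rows, cols, 2)."""
    rng = np.random.default_rng(seed)
    lattice, weights = init_grid(data, rows, cols)
    for step in range(n_steps):
        frac = step / max(n_steps - 1, 1)
        lr = lr0 * (lr1 / lr0) ** frac              # step size decays
        sigma = sigma0 * (sigma1 / sigma0) ** frac  # neighbourhood shrinks
        x = data[rng.integers(len(data))]           # pick a random data point
        d2 = np.sum((weights - x) ** 2, axis=-1)
        win = np.unravel_index(np.argmin(d2), d2.shape)  # closest neuron
        # Neighbourhood is measured on the lattice, not in data space.
        g2 = np.sum((lattice - lattice[win]) ** 2, axis=-1)
        h = np.exp(-g2 / (2.0 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)  # nudge winner and neighbours
    return weights

def quantization_error(data, weights):
    """Mean distance from each data point to its closest neuron."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return d.min(axis=1).mean()
```

Calling `train_som(make_spiral())` returns the trained sheet. Comparing `quantization_error` before and after training is one simple way to check that the net has moved onto the data.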

Mathelirium

27,621 subscribers

23,021 views • 6 months ago • via X (Twitter)

Related Videos

We often hear that Machine Learning models learn patterns in data. But what does that actually look like in Geometry? If you dropped a little elastic mesh into a cloud of points and let it learn, how would it fold itself to match the shape of the data? In this scene we watch a Self-Organizing Map (SOM), a simple unsupervised neural model, learn the shape of 3D datasets, one static and the other dynamic. On top of this, we lay down a square grid of neurons whose weights live in the same plane. At the start, this grid is just a flat net floating across the cloud. It knows nothing about the structure underneath. Learning is a repeated game: pick a random data point, find the neuron whose weight is closest, and then nudge that neuron and its neighbours toward the point. Do this again and again, while slowly shrinking how far the neighbourhood influence spreads. Python code is available for Subscribers. #MachineLearning #ManifoldLearning #UnsupervisedLearning #NeuralMaps #GeometricML

Mathelirium

76,194 views • 4 months ago

A neural network can begin as a flat sheet and learn the shape of hidden data
A self-organizing map turns learning into geometry. Each data point pulls one winning neuron toward it, but nearby neurons move too, and so the whole lattice bends without losing its neighborhood structure. The strange part is that the network is not given the roll shape. It discovers the shape through competition and local cooperation.
Paper: Self-Organized Formation of Topologically Correct Feature Maps
Authors: Teuvo Kohonen
Year: 1982

Mathelirium

129,378 views • 2 months ago

The Machine That Learns The Law Behind The Data
A very interesting US patent, US10963540B2 (Physics Informed Learning Machine), describes a learning system that does not begin with data alone. It begins with a physical model, usually written as a differential equation (or PDE):
dx/dt = f(x,t)
A normal Machine Learning model sees scattered data and tries to fit it. A physics-informed learning machine starts with a law. Then it treats the data as evidence that updates what the model believes about the physical system. For this application, I use the patent idea on NASA C-MAPSS Turbofan engine data. The machine watches multivariate telemetry from a degrading engine and infers a hidden health state that is not measured directly. From that posterior belief, it estimates the engine’s remaining useful life. In the main 3D scene, the engine lifetime is turned into a tunnel. The spiral ribbons are real sensor channels evolving over cycle-time. The glowing core is the inferred health state. The surrounding cloud is uncertainty. The orange wall ahead is the predicted failure horizon. So the big picture is: sensor evidence comes in, posterior belief tightens, and the machine moves from uncertainty toward a concrete failure prediction. The inset posteriors make that explicit. The health posterior shows where the model believes the hidden engine condition sits at the current moment, and how sharply it believes it. The RUL posterior shows the same idea for remaining life... early on it is broad, later it shifts left and narrows as the machine becomes more certain about how close failure is. This idea is not limited to engines. The same idea can apply to data centers, CPUs, GPUs, cooling systems, power grids, robotics, batteries, and any machine that produces telemetry while obeying physical constraints.
In an age where machine learning runs on massive hardware infrastructure, this kind of model matters: it can turn noisy sensor streams into early warnings before expensive systems fail.
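
To make "start with a law, then let data update the belief" concrete, here is a toy sketch. It is not the patent's method and it does not use C-MAPSS data. It assumes a hypothetical decay law dx/dt = -k·x for the hidden health state, a single noisy sensor that reads the state directly, and a textbook scalar Kalman filter as the belief update. All names and numbers are invented for the example.

```python
import math
import random

def kalman_health(observations, k=0.02, dt=1.0, q=1e-5, r=0.01,
                  mean0=1.0, var0=0.25):
    """Scalar Kalman filter for dx/dt = -k*x observed as y = x + noise.

    Returns a list of (posterior mean, posterior variance) per observation.
    """
    a = math.exp(-k * dt)          # exact one-step solution of the assumed law
    mean, var = mean0, var0
    history = []
    for y in observations:
        # Predict: push the belief forward with the physical model.
        mean, var = a * mean, a * a * var + q
        # Update: treat the sensor reading as evidence.
        gain = var / (var + r)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        history.append((mean, var))
    return history

def remaining_life(health, k=0.02, fail_at=0.3):
    """Cycles until the assumed decay law carries `health` down to `fail_at`."""
    if health <= fail_at:
        return 0.0
    return math.log(health / fail_at) / k

def simulate(n=60, k=0.02, x0=0.9, noise_sd=0.1, seed=1):
    """Synthetic truth and noisy readings for the toy engine."""
    rng = random.Random(seed)
    truth = [x0 * math.exp(-k * (i + 1)) for i in range(n)]
    readings = [x + rng.gauss(0.0, noise_sd) for x in truth]
    return truth, readings
```

Running `kalman_health` on the readings from `simulate()` gives a posterior variance that shrinks as evidence arrives, which is the "belief tightens" behaviour described above. Feeding the posterior mean into `remaining_life` turns it into a point estimate of remaining useful life under the assumed law.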

Mathelirium

17,758 views • 2 months ago

A Neural Network Can Grow New Neurons Where It Is Confused? In 1994, Bernd Fritzke published A Growing Neural Gas Network Learns Topologies. He introduced a network that starts small, follows incoming data, and inserts new neurons where its error is highest. In the animation, the fog is the drifting data. The glowing nodes are neurons. The fibers are learned connections. The network grows into a living skeleton of the manifold.
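
A compact sketch of the growing loop follows, written from the standard description of the algorithm: move the winner and its neighbours, refresh or age edges, and insert a unit halfway between the highest-error unit and its worst neighbour. The parameter values, the dictionary-based bookkeeping, and the function name are assumptions for illustration, not the code behind the animation.

```python
import numpy as np

def growing_neural_gas(data, n_steps=4000, max_nodes=40, eps_b=0.2, eps_n=0.006,
                       max_age=50, insert_every=100, alpha=0.5, decay=0.995, seed=0):
    """Return (positions by node id, edge ages keyed by frozenset of two ids)."""
    rng = np.random.default_rng(seed)
    # Start small: two units placed on random data points.
    pos = {0: data[rng.integers(len(data))].copy(),
           1: data[rng.integers(len(data))].copy()}
    err = {0: 0.0, 1: 0.0}
    edges = {}
    next_id = 2
    for step in range(1, n_steps + 1):
        x = data[rng.integers(len(data))]
        ids = list(pos)
        d2 = np.array([np.sum((pos[i] - x) ** 2) for i in ids])
        order = np.argsort(d2)
        s1, s2 = ids[order[0]], ids[order[1]]   # winner and runner-up
        err[s1] += d2[order[0]]                 # accumulate the winner's error
        pos[s1] += eps_b * (x - pos[s1])
        for e in [e for e in edges if s1 in e]:
            edges[e] += 1                       # age the winner's edges
            (n,) = e - {s1}
            pos[n] += eps_n * (x - pos[n])      # neighbours follow a little
        edges[frozenset((s1, s2))] = 0          # create or refresh this edge
        # Drop stale edges, then any unit left without connections.
        edges = {e: a for e, a in edges.items() if a <= max_age}
        linked = set().union(*edges)
        for i in [i for i in pos if i not in linked]:
            del pos[i], err[i]
        # Periodically grow a unit where the accumulated error is highest.
        if step % insert_every == 0 and len(pos) < max_nodes:
            q = max(err, key=err.get)
            nbrs = [next(iter(e - {q})) for e in edges if q in e]
            f = max(nbrs, key=err.get)
            pos[next_id] = 0.5 * (pos[q] + pos[f])
            del edges[frozenset((q, f))]
            edges[frozenset((q, next_id))] = 0
            edges[frozenset((f, next_id))] = 0
            err[q] *= alpha
            err[f] *= alpha
            err[next_id] = err[q]
            next_id += 1
        for i in err:
            err[i] *= decay                     # forget old error slowly
    return pos, edges
```

The returned `edges` dictionary is the learned skeleton: drawing a line for each key between the two node positions gives the fibre picture described in the post.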

Mathelirium

38,306 views • 2 months ago

“But I saved the history on CD, DVD, Magnetics” That data will all decay in a MTBF of +/-50 years. The cloud is deleting as much data as is being produced daily. At some point in the future we will have lost a majority of the 2000s-2040s. We are the amnesia generation.

Brian Roemmele

410,355 views • 2 years ago

Geometry of Machine Learning Models - Gaussian Process Kernel
In 1948 Norbert Wiener framed prediction as a correlation problem, and in the 1970s Grace Wahba clarified that picking a smoothness preference is the same as picking a kernel. The motivation is that whenever data is sparse and noisy, you don't just want a fit, you want a fit with calibrated uncertainty.
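
A small sketch of what "a fit with uncertainty" means in practice: Gaussian process regression with a squared-exponential (RBF) kernel, using the standard Cholesky-based posterior formulas. The kernel choice, the hyperparameter values, and the function names are assumptions for illustration. The post does not say which kernel the video uses.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.01, **kernel_args):
    """Posterior mean and standard deviation of the latent function at x_test."""
    K = rbf_kernel(x_train, x_train, **kernel_args) + noise_var * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test, **kernel_args)
    Kss = rbf_kernel(x_test, x_test, **kernel_args)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(Kss) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))  # clip tiny negative round-off
```

With a handful of training points, the returned standard deviation is small near the data and grows back toward the prior level far away from it. That is the behaviour the post refers to as calibrated uncertainty. A smaller `length_scale` expresses a preference for wigglier functions, a larger one for smoother functions.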

Mathelirium

15,661 views • 4 months ago

The biggest marketplace on Earth won't be the assets humans own. It'll be the data AI needs. Everything that can be digitised is getting pulled in to train the next models. Every scientific dataset, every piece of research sitting in a university, every sensor reading off every John Deere tractor in a field. All of it feeds the machine, and almost none of it is visible to us. What comes out is an agentic economy running at machine speed. You plug your data into a vault, and agents monetise it on your behalf, trading with other agents in real time, in a marketplace we never see. The company sitting on proprietary data becomes part of it without even knowing. This is what the Economic Singularity actually looks like from the inside. The real economy stops being the one we can see and price, and moves somewhere we can't follow. You own a piece of it, or you get priced out of it. Jordi Visser.

Raoul Pal

65,302 views • 1 day ago

AI runs on electricity, and Sweden has a head start. This morning at Ruinen we put the question on the table: is the next wave of AI data centres a Swedish opportunity, and what does the grid need to make it happen?

Hitachi Energy

6,940,092 views • 29 days ago

The Torus Becomes the Canvas
A Jacobi Theta Function lives naturally on a complex torus. Instead of drawing it on a flat plane, we wrapped the mathematics onto the torus. The surface is driven by θ₁(z|τ), with its moving zeros, phase winding, and logarithmic derivative shaping the colour, seams, and raised divisor points across the geometry. What you see is a periodic quantum-like field painted onto the space where it actually belongs. #JacobiTheta #ComplexTorus #MathematicalArt #ComplexAnalysis #RiemannSurfaces #MathAnimation
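
For readers who want to evaluate θ₁(z|τ) themselves, here is a short sketch using the classical Fourier series θ₁(z|τ) = 2 Σₙ (−1)ⁿ q^((n+½)²) sin((2n+1)z) with nome q = e^(iπτ). The post does not state which normalisation the render uses, so this convention is an assumption, as are the function name and the `terms` cutoff.

```python
import cmath

def theta1(z, tau, terms=20):
    """Jacobi theta_1(z | tau) from its Fourier series, nome q = exp(i*pi*tau).

    The series converges only for Im(tau) > 0. A small Im(tau) needs more terms.
    """
    tau = complex(tau)
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    total = 0j
    for n in range(terms):
        # q**((n + 1/2)**2), written as a single exponential to avoid branch issues
        q_power = cmath.exp(1j * cmath.pi * tau * (n + 0.5) ** 2)
        total += (-1) ** n * q_power * cmath.sin((2 * n + 1) * z)
    return 2 * total
```

In this convention the function vanishes at z = 0, flips sign under z → z + π, and picks up the factor −e^(−iπτ)·e^(−2iz) under z → z + πτ. That twisted periodicity across the two directions of the torus is what the phase winding in the description refers to.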

Mathelirium

22,197 views • 1 month ago

Does GPT understand the world? Here is what Ilya Sutskever, co-founder of OpenAI, says during a discussion with Jensen Huang, CEO of Nvidia: "(1) When we train a large neural network to accurately predict the next word in lots of different texts from the internet, the AI is learning a world model. (2) On the surface, it may look like learning correlations in text, but it turns out that to 'just learn' statistical correlations in text, to compress information really well, what the neural network learns is some representation of the process that produced the text. (3) This text is a projection of the world...what the neural network is learning is aspects of the world, of people, of the human conditions, their hopes, dreams, motivations, their interactions...the situations we are in. The neural network learns a compressed, abstract, usable representation." Do you think learning representations = understanding? Are large language models simply stochastic parrots, or are they much more?

Alex Ker 🔭

1,367,077 views • 2 years ago

Peter Ndegwa: In a crisis like we faced last year, when there were the Gen Z protests, there was an internet issue and the issue of data. It doesn’t matter what story we say, at the end of the day, we pride ourselves on delivering and always being safe and secure. In a way, we disappointed customers because the internet was not working at that point. On the Gen Z side, it’s a learning for everyone, it’s a learning from a political and from a corporate perspective. People still trust our brand, our reputation is still very strong #CitizenExplainer Yvonne Okwara

Citizen TV Kenya

72,261 views • 1 year ago

IN 1986 MIT FILMED THE LECTURE WHERE CODE STOPPED BEING CODE AND BECAME DATA 43 minutes from Gerald Sussman, in the most legendary intro programming course ever recorded. -> The idea that lands: to a program, your code is just data it can read and rewrite. He builds a program that does real calculus -- not by crunching numbers, but by reading the math as a list and transforming it piece by piece. Then he shows the trick hiding under all of it: in this language, code and data are the same material. Forty years later that is exactly what an AI coding agent is -- a program that reads your code as data, rewrites it, and hands it back. Sussman drew the whole idea on a blackboard in 1986. Writing code was never the deepest skill -- understanding that code itself is data a program can manipulate is. This is where you learn it. Most people think AI writing code is brand new. The ones who watch this saw the blueprint 40 years ago. Bookmark & Watch it. This one's a legend ↓

slash1s

22,107 views • 29 days ago

There's a fruit fly walking around right now that was never born. eon just released a video where they took a real fly's connectome — the wiring diagram of its brain — and simulated it. Dropped it into a virtual body. It started walking. Grooming. Feeding. Doing what flies do. Nobody taught it to walk. No training data, no gradient descent toward fly-like behavior. This is the opposite of how AI works. They rebuilt the mind from the inside, neuron by neuron, and behavior just... emerged. It's the first time a biological organism has been recreated not by modeling what it does, but by modeling what it is. A human brain is 6 OOM more neurons. That's a scaling problem, something we've gotten very good at solving. So what happens when we have a working copy of the human mind?

hattie

9,353,157 views • 4 months ago

Erwin Schrödinger’s 1926 equation changed the game by turning Quantum Mechanics into wave dynamics. That move gave Physics a new way to think. Instead of forcing a particle onto one sharp path, it lets a complex wavefunction evolve in time, with its shape and phase holding the structure of the phenomenon. What makes this so striking is that Schrödinger’s equation does not start from vague mystery. It starts from a precise and daring idea: the state of a particle is a complex field ψ(x,t), and the dynamics must push ψ forward in a way that preserves total probability.
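
The "preserves total probability" requirement can be checked numerically. The sketch below evolves a wave packet under i ∂ψ/∂t = −½ ∂²ψ/∂x² + V(x)ψ (units with ħ = m = 1) using a split-step Fourier scheme. Every factor applied to ψ has unit modulus, so the norm stays fixed up to round-off. The scheme, the grid, the packet parameters, and the function names are assumptions chosen for illustration, not the method used in the video.

```python
import numpy as np

def gaussian_packet(x, x0=-5.0, k0=2.0, sigma=1.0):
    """Normalised Gaussian wave packet centred at x0 with mean momentum k0."""
    envelope = (2.0 * np.pi * sigma ** 2) ** -0.25 * np.exp(-(x - x0) ** 2 / (4.0 * sigma ** 2))
    return envelope * np.exp(1j * k0 * x)

def evolve(psi, x, potential, dt, n_steps):
    """Split-step Fourier evolution on a uniform periodic grid."""
    dx = x[1] - x[0]
    k = 2.0 * np.pi * np.fft.fftfreq(len(x), d=dx)
    half_v = np.exp(-0.5j * potential * dt)   # half step in the potential
    kinetic = np.exp(-0.5j * k ** 2 * dt)     # full kinetic step in Fourier space
    for _ in range(n_steps):
        psi = half_v * psi
        psi = np.fft.ifft(kinetic * np.fft.fft(psi))
        psi = half_v * psi
    return psi

def total_probability(psi, x):
    """Grid approximation of the integral of |psi|^2."""
    return float(np.sum(np.abs(psi) ** 2) * (x[1] - x[0]))

def mean_position(psi, x):
    """Grid approximation of the expectation value of x."""
    return float(np.sum(x * np.abs(psi) ** 2) * (x[1] - x[0]))
```

For a free packet (`potential` all zeros) the mean position drifts at the mean momentum while the packet spreads, and `total_probability` stays at 1 throughout.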

Mathelirium

54,730 views • 4 months ago

New PNAS paper. Historical GDP per capita data is scarce, but data on the places of birth, death, and occupations of famous individuals is abundant. In this paper we estimate the historical GDP per capita of hundreds of regions in Europe and North America using a machine learning model that leveraged data on about 500k famous biographies. Our estimates more-or-less quadruple the availability of historical GDP per capita estimates for the last 700 years. So why use biographies to augment historical GDP per capita data? Biographical data contains information about people who might have contributed directly to economic growth, like James Watt, or who were attracted to wealthy places looking for patrons, like Michelangelo. So we--mainly Philipp (Philipp Koch)--used this data to construct hundreds of features describing each European region. Then, we trained a machine learning model to find the features that explained most of the variance in a cross-validation test, where we split regions multiple times into a training set and a test set. On average, the model explained about 90% of the variance in GDP per capita of the regions it had not seen during training. But we wanted to go further, and Philipp really went to town by looking at different ways to validate our estimates. We found our estimates correlate positively with historical measures of wellbeing, church building activity, urbanization, and body height. We also used these measures to reproduce the basic Atlantic trade result of Acemoglu, Johnson, and Robinson and to explore the economic consequences of the famous Lisbon earthquake of 1755. But what I personally loved most about this project, other than working with Philipp Koch and V, is that it shows that we can use machine learning methods not only to explore the future, but the past. There is a bright and growing future in the use of machine learning for economic history. Hope you enjoy the paper and the data.
You can find links to the paper and a data exploration tool in the first comment.

César A. Hidalgo

54,332 views • 1 year ago

I don’t know if we live in a Matrix, but I know for sure that robots will spend most of their lives in simulation. Let machines train machines. I’m excited to introduce DexMimicGen, a massive-scale synthetic data generator that enables a humanoid robot to learn complex skills from only a handful of human demonstrations. Yes, as few as 5! DexMimicGen addresses the biggest pain point in robotics: where do we get data? Unlike with LLMs, where vast amounts of text are readily available, you cannot simply download motor control signals from the internet. So researchers teleoperate the robots to collect motion data via XR headsets. They have to repeat the same skill over and over and over again, because neural nets are data hungry. This is a very slow and uncomfortable process. At NVIDIA, we believe the majority of high-quality tokens for robot foundation models will come from simulation. What DexMimicGen does is to trade GPU compute time for human time. It takes one motion trajectory from a human and multiplies it into 1000s of new trajectories. A robot brain trained on this augmented dataset will generalize far better in the real world. Think of DexMimicGen as a learning signal amplifier. It maps a small dataset to a large (de facto infinite) dataset, using physics simulation in the loop. In this way, we free humans from babysitting the bots all day. The future of robot data is generative. The future of the entire robot learning pipeline will also be generative. 🧵

Jim Fan

165,246 views • 1 year ago

The solar system is a study in chronological lag. When you look at the Sun, you are peering 8 minutes and 13 seconds into the past. By the time that same light reaches Neptune, over four hours have vanished. We tend to visualize space as a static map, but it is actually a staggered broadcast of ancient photons. This visualization captures the profound isolation of the outer giants. While Mercury is practically bathing in real time, the rest of the family is drifting in a massive delay. We are all orbiting the same flame, just at different points in history. Credit: cosmicverse
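
The delays quoted above are just distance divided by the speed of light. A quick sketch of the arithmetic follows. The planetary distances are approximate mean distances from the Sun, filled in here as assumptions. At exactly 1 AU the delay works out to about 8 minutes 19 seconds, and because Earth's distance from the Sun varies between roughly 0.983 and 1.017 AU over the year, a figure a few seconds either side of that is also reasonable.

```python
AU_M = 149_597_870_700       # metres in one astronomical unit (exact by definition)
C_M_PER_S = 299_792_458      # speed of light in metres per second (exact by definition)

# Approximate mean distances from the Sun in AU (illustrative values).
MEAN_DISTANCE_AU = {"Mercury": 0.387, "Earth": 1.000, "Jupiter": 5.203, "Neptune": 30.07}

def light_delay_seconds(distance_au):
    """Time for light to cover the given distance, in seconds."""
    return distance_au * AU_M / C_M_PER_S

def format_delay(seconds):
    """Render a delay as hours, minutes and whole seconds."""
    total = round(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"

if __name__ == "__main__":
    for name, au in MEAN_DISTANCE_AU.items():
        print(f"{name:8s} {format_delay(light_delay_seconds(au))}")
```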

Cosmos Archive

39,317 views • 3 months ago

What if Your Neural Network Was Forced to Obey Physics?
Physics-Informed Neural Networks (PINNs) are neural networks trained to satisfy a differential equation by building the PDE residual directly into the loss. They emerged from a very practical problem...classical PDE pipelines can be brilliant, but they often demand heavy discretization work (meshes, stencils, stability tuning), and the method you build is usually tied to one geometry and one solver setup. A PINN flips the workflow by representing the solution itself as a smooth function uᵩ(x,t) and enforcing the physics everywhere you choose to sample the domain.
People often meet PINNs in the least helpful way...via a flashy solution plot, and almost no explanation of what was enforced to get it. In this series we keep the enforcement visible. We pick a differential equation, represent the unknown solution as a flexible function, measure how well that function satisfies the equation across the domain, and train it to reduce that mismatch everywhere we sample.
A normal neural net learns from labels...you give it inputs and target outputs. A PINN learns from a differential equation...you give it inputs (x,t) and it gets punished whenever its output fails the PDE. By punish we mean that the loss increases when the mismatch is large; we reward it when the loss decreases as the mismatch gets smaller. The network isn’t replacing physics, it’s becoming a flexible function that is forced to satisfy the same calculus you’d impose on any candidate solution.
The math breakdown:
We start with a PDE we want to solve on a domain Ω. Write it as
uₜ(x,t) + N(u(x,t), uₓ(x,t), uₓₓ(x,t), …) = 0 for (x,t) in Ω
A PINN replaces the unknown function u with a neural network output uᵩ(x,t). Now define the physics residual by plugging uᵩ into the PDE:
rᵩ(x,t) = ∂uᵩ/∂t + N(uᵩ, ∂uᵩ/∂x, ∂²uᵩ/∂x², …)
If uᵩ were an exact solution, we would have rᵩ(x,t) = 0 everywhere.
We may also have data points (xᵢ,tᵢ,uᵢ) from measurements or a known initial condition. The training objective is just a weighted sum of squared errors:
L(ᵩ) = L_data(ᵩ) + λ L_phys(ᵩ) + L_bc/ic(ᵩ)
with
L_data(ᵩ) = meanᵢ |uᵩ(xᵢ,tᵢ) − uᵢ|²
L_phys(ᵩ) = meanⱼ |rᵩ(xⱼ,tⱼ)|², where (xⱼ,tⱼ) are the collocation points in Ω
L_bc/ic(ᵩ) = penalties enforcing boundary conditions and initial conditions
The key technical step is that the derivatives inside rᵩ (∂uᵩ/∂t, ∂uᵩ/∂x, ∂²uᵩ/∂x², …) are computed by automatic differentiation. So we can differentiate the total loss L(ᵩ) with respect to ᵩ and train with gradient descent. This is the whole idea behind PINNs. Learn a function, but make the PDE part of the loss, so the network is trained to be a solution, not just a curve-fitter.
In the render, the main 3D surface is the network’s current guess uᵩ(x,t), drawn as a living sheet over the (x,t) plane. Hovering above is the neural scaffold...a visible graph of feature nodes and connections. The bright tension threads are the physics residual rᵩ(x,t): each thread tethers a collocation bead on the sheet up to the scaffold, and it thickens and brightens exactly where |rᵩ| is large (color encodes the sign). As training runs, those threads go slack across the domain not because we hid the error, but because the network has actually been pushed toward rᵩ(x,t) ≈ 0.
#PINNs #PhysicsInformedNeuralNetworks #ScientificMachineLearning #PDE #DifferentialEquations #Optimization #MachineLearning #AppliedMath #ComputationalPhysics
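
The loss structure above can be tried out without any deep-learning library. The sketch below deliberately swaps in simpler pieces and should be read as a stand-in, not as a PINN. The "network" is a polynomial in t (linear in its parameters), its derivative is written analytically instead of using automatic differentiation, and the quadratic loss is minimised by least squares instead of gradient descent. The toy problem u′(t) + u(t) = 0 with u(0) = 1 on [0, 2] and all names and settings are assumptions for illustration. What carries over is the objective: λ·mean of squared residuals at collocation points plus an initial-condition penalty.

```python
import numpy as np

def features(t, degree):
    """Polynomial features t^0..t^degree and their derivatives with respect to t."""
    powers = np.arange(degree + 1)
    phi = t[:, None] ** powers
    dphi = np.zeros_like(phi)
    dphi[:, 1:] = powers[1:] * t[:, None] ** (powers[1:] - 1)
    return phi, dphi

def fit(degree=8, n_colloc=50, lam=1.0, t_max=2.0):
    """Minimise lam * mean_j r(t_j)^2 + (u(0) - 1)^2 for the residual r = u' + u."""
    t = np.linspace(0.0, t_max, n_colloc)          # collocation points
    phi, dphi = features(t, degree)
    # Each row is one residual, scaled so that squared norms add up to the loss.
    residual_rows = np.sqrt(lam / n_colloc) * (dphi + phi)
    phi0, _ = features(np.array([0.0]), degree)    # initial-condition row: u(0)
    A = np.vstack((residual_rows, phi0))
    b = np.concatenate((np.zeros(n_colloc), [1.0]))
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return theta

def predict(theta, t):
    """Evaluate the fitted function at the given times."""
    phi, _ = features(np.asarray(t, dtype=float), len(theta) - 1)
    return phi @ theta
```

No solution values are supplied anywhere, only the equation and the initial condition, yet `predict(fit(), t)` tracks the exact solution e^(−t) closely. A real PINN follows the same recipe with a neural network for uᵩ, autodiff for the derivatives, and gradient descent for the minimisation.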

Mathelirium

17,285 views • 2 months ago

This is the best way to understand how ML models actually work! Use Drawdata to draw a 2D dataset in Jupyter. Use it to actively pick data from the widget and update the model as the data is being drawn! Fully interactive, real-time, and open-source!

Daily Dose of Data Science

52,070 views • 8 months ago