Yann LeCun says language isn’t intelligence. Predicting text doesn’t mean understanding reality. The real world is messy, physical, and causal, and today’s LLMs barely touch that. The next leap is Physical AI: world models, cause and effect, real planning. Do you think LLMs can evolve into this, or do...

76,154 views • 4 months ago • via X (Twitter)

Related videos

Yann LeCun beautifully explains why the architecture and principles used to train LLMs cannot be extended to teach AI real-world intelligence.

In one line: LLMs excel where intelligence equals sequence prediction over symbols. Real-world intelligence requires learned world models, abstraction, causality, and action planning under uncertainty, which current next-token training does not provide.

He says current LLMs learn by predicting the next token. That objective works very well when the task itself can be reduced to manipulating discrete symbols and sequences. Math, physics problem solving on paper, and coding fit this pattern because success largely comes from searching for and composing the right sequences of symbols, equations, or program tokens. With enough data and scale, these models get very good at that kind of structured sequence prediction.

Real-world intelligence is different. The physical world is continuous, noisy, uncertain, and high dimensional. To act in it, a system needs internal models that capture objects, dynamics, causality, constraints from the body, and the outcomes of actions over time. Humans and animals build abstract representations from rich sensory streams, then make predictions in that abstract space, not at the raw pixel level. That is why a child can learn intuitive physics, plan multi-step actions, and adapt quickly to new situations with little data.

His claim about saturation follows from this gap. Scaling token prediction keeps improving symbol manipulation tasks like math and code, but it hits limits on embodied reasoning and common sense because text alone does not provide the right learning signals for world models. Predicting the next word cannot efficiently teach contact forces, affordances, occlusion, friction, or how actions change the state of the environment.

For that, he argues, we need architectures that learn abstractions from sensory data and predict futures in abstract latent spaces, then use those predictions to plan actions toward goals with built-in guardrails.

--- From 'Pioneer Works' YT Channel (link in comment)

Rohan Paul

104,460 views • 5 months ago
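
As a rough illustration of the next-token objective described in the post above, here is a minimal sketch in Python. It is a toy bigram counter, not how an LLM is actually built: real models use neural networks trained on far larger corpora. The function names and the tiny corpus are hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token (toy stand-in for training)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the follow-up counts for `prev` into a probability distribution."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # `prev` was never seen with a following token
    return {tok: c / total for tok, c in followers.items()}


# Hypothetical toy corpus: a sequence of discrete symbols, nothing more.
corpus = "the cat sat on the mat and the cat ran".split()
model = train_bigram(corpus)
probs = next_token_probs(model, "the")
print(probs)
```

Everything this model knows is which symbol tends to follow which. Nothing in the counts encodes what a cat or a mat physically is, which is the gap the post is pointing at.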

Yann LeCun just exposed AI’s fundamental flaw. We’re celebrating systems that can’t do what insects do effortlessly.

LeCun: “The biggest difficulty is not to get fooled into thinking that a computer system is intelligent simply because it can manipulate language.”

Language feels like intelligence because we experience it as the highest form of human thought. So when a machine produces fluent, articulate, convincing text, the instinct is to conclude it understands. It doesn’t.

LeCun: “It turns out the real world is much, much more complicated.”

Language is actually the easy part. A sequence of discrete symbols with a finite number of possibilities. Predicting the next word is a tractable mathematical problem. Impressive at scale. Not understanding. Pattern matching in symbol space.

The real world is something else entirely. A high-dimensional, continuous, noisy signal that changes every millisecond in ways no text corpus can capture. Physical reality doesn’t come in tokens.

LeCun: “Which your house cat is perfectly able to deal with. But not computers yet.”

This is Moravec’s paradox. The things that feel hard to humans: writing essays, solving equations, passing bar exams. Computationally straightforward. The things that feel trivially easy: walking across a room, catching a falling object, folding a shirt. Extraordinarily difficult for machines.

Your house cat navigates a complex three-dimensional physical environment in real time. Predicts trajectories. Adjusts to surprises. Understands cause and effect through direct interaction with the world. The most powerful AI systems ever built cannot do what your cat does before breakfast. That’s not a minor gap. That’s the entire frontier.

Language is the easy problem that looks hard to humans. The physical world is the hard problem that looks easy because evolution solved it billions of years ago. We’re pouring hundreds of billions into making language models marginally better at the simple problem. The actual intelligence problem remains unsolved.

LeCun has spent fifteen years on this. Not making chatbots more fluent. Giving machines the ability to understand, predict, and interact with physical reality the way animals do instinctively.

The benchmark that matters isn’t passing a bar exam. It’s folding a shirt. Loading a dishwasher. Navigating an unfamiliar room without a map. We built systems that can write your dissertation before we built systems that can tie your shoes. That’s where AI actually is. Everything else is autocomplete at scale.

Dustin

284,028 views • 3 months ago