What's the difference between retrieving a fact and truly reasoning? Prof. Subbarao Kambhampati (కంభంపాటి సుబ్బారావు) begins by noting that human reasoning is tricky to define. Yet, since the Greeks, we’ve relied on formal logic (like syllogisms) to guide sound reasoning. A thread 🧵👇

Machine Learning Street Talk

34,711 subscribers

19,132 views • 1 year ago •via X (Twitter)

Education Science & Technology

9 Comments

Machine Learning Street Talk • 1 year ago

2/7 He points out that we lack a neat definition of “human reasoning,” but that hasn’t stopped our civilization from moving forward. We built entire disciplines—from Aristotle’s logic to modern computer science—on the foundation of formal, structured reasoning.

Machine Learning Street Talk • 1 year ago

3/7 Kambhampati contrasts retrieval (pulling stored info) with reasoning (connecting ideas with logical rigor). Just because we string ideas together doesn’t guarantee we’re actually reasoning—there must be standards for correctness and validity.

Machine Learning Street Talk • 1 year ago

4/7 He humorously references Monty Python’s witch trial scene: the argument is “If she floats like wood, she must be a witch.” It looks like reasoning—there’s a chain of statements—but it’s clearly not sound. It’s a playful reminder that not all stepwise arguments are valid.

Machine Learning Street Talk • 1 year ago

5/7 Between raw retrieval (“She’s a witch!”) and truly logical inferences, there’s a vast middle ground of fallacies and “Monty Python logic” that mimics reasoning but fails basic tests of correctness.

Machine Learning Street Talk • 1 year ago

6/7 Hence, in AI (and broader AGI pursuits), we can’t just replicate the surface features of human reasoning. We need rigorous definitions and methods—logic, probability, evidence—to separate mere associations from true inference.

Machine Learning Street Talk • 1 year ago

7/7 By preserving “sound reasoning” standards, Kambhampati argues we uphold the legacy of centuries of philosophical and mathematical thought, aiming for AI that doesn’t just retrieve but reasons in a formally robust way.
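
The gap the thread describes, between an argument that looks stepwise and one that is formally valid, can be checked mechanically for simple propositional forms. The sketch below is an illustration added here, not something from the talk: it brute-forces a truth table to compare modus ponens with affirming the consequent, a fallacy loosely resembling the "she floats, so she is a witch" step. The propositions P and Q and the helper names `is_valid` and `implies` are hypothetical.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: 'a -> b' is false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion, n_vars: int) -> bool:
    """Return True if no truth assignment makes every premise true
    while the conclusion is false (the definition of a valid form)."""
    for values in product([False, True], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counterexample found
    return True


# Hypothetical reading: P = "she is a witch", Q = "she floats".
# Modus ponens: P -> Q, P, therefore Q.
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q,
    n_vars=2,
)

# Affirming the consequent: P -> Q, Q, therefore P.
affirming_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
    n_vars=2,
)

print(modus_ponens)          # True
print(affirming_consequent)  # False
```

The second form fails because the assignment P = False, Q = True satisfies both premises while the conclusion is false. Both arguments are a chain of statements, but only the first passes the validity test.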

Bahaeddin ERAVCI • 1 year ago

@rao2z I touched on the same concept in my last Substack post. LLMs are statement generators driven by a populist vote over their training data, and hallucinations are not a bug but a feature. We need formal systems like logic to verify which of these statements are true.

Eray Özkural, PhD - OG AI ⏭️🌟 • 1 year ago

@rao2z Come on who needs logic? 🤪

Related Videos

The legendary Subbarao Kambhampati (కంభంపాటి సుబ్బారావు) explains the difference between reasoning and memorisation using the example of the famous "why are manhole covers round?" interview question. Which do you think LLMs do?

Machine Learning Street Talk

326,859 views • 1 year ago

We tend to use logic to reach conclusions that agree with our biases, AKA, what psychologists call "motivated reasoning." Nate Silver and Maria Konnikova talk about motivated reasoning in the Biden campaign and beyond on #RiskyBusiness:

Pushkin Industries 🎙️

64,237 views • 1 year ago

Elon Musk said Tesla will add “reasoning” to FSD, likely by the end of the year. “With reasoning, it will literally think about which parking spot to pick. It’ll drop you off at the store entrance, then go find a spot. It will spot empty spaces much better than a human and use reasoning to solve problems.”

Nic Cruz Patane

242,927 views • 6 months ago

Most AI tools generate answers. MiroMind generates verifiable reasoning. Here’s why that matters A THREAD 🧵

TISHA WEB3

49,745 views • 26 days ago

Proteins can now talk. Introducing BioReason-Pro, the first reasoning model for protein function. A thread🧵

Adib

202,696 views • 3 months ago

GPT-5 on Sudoku-Bench 🧩 Since releasing Sudoku-Bench in May 2025, when no LLM could solve a classic 9x9 puzzle, we've been evaluating the latest generation of models. GPT-5 now leads our leaderboard with 33% puzzles solved--approximately 2x the previous leader--and is the first LLM we've tested to solve a 9x9 Sudoku variant. However, with 67% of the much harder puzzles remaining unsolved, Sudoku-Bench continues to present significant challenges for AI reasoning. Modern Sudoku variants require models to first understand novel rulesets through meta-reasoning, then maintain global consistency across long reasoning chains. Our experiments with GRPO fine-tuning on Qwen2.5-7b and "Thought Cloning" (training on expert human reasoning from Cracking the Cryptic) show that current approaches still struggle with the spatial reasoning and creative "break-in" points that human solvers use naturally. We believe new approaches are required to solve our benchmark. These results highlight persistent gaps between computational problem-solving and human-like reasoning, particularly in tasks requiring integrated mathematical logic, spatial awareness, and creative insight. Read more about our update here: 🔗 Blogpost →

Sakana AI

154,512 views • 7 months ago

Carl Sagan sheds light on the kind of logic and reasoning some use that leads to false assumptions. (🎥Source ; CarlSaganDotCom )

Prof. Carl Sagan

19,973 views • 1 year ago

The most amazing aspect of DeepSeek Open Source is that just the Reasoning Engine can be isolated and used with any other LLM. In fact, you can have a mixture of Reasoning Engines cascade around a problem and then use an Operator-type agent AI to act on the results.

Brian Roemmele

398,746 views • 1 year ago

While others fight, we ship 🚀 Agno’s ReasoningTools are now live on the Playground — giving non-reasoning models the ability to "think", "analyze", and outperform Reasoning Models on Agentic workloads. If you haven't tried `ReasoningTools` yet - you definitely should 👇

Ashpreet Bedi

27,864 views • 1 year ago

🔥 Battle for the top reasoning LLM intensifies! The QwQ-32B-Preview is a very good reasoning LLM. Full video of my tests here: Summary of my findings and thoughts: It was able to solve a couple of hard math problems, so it looks very promising for maths. It didn’t do so well on my coding task (generating a bash script). Judging by the results reported on LiveCodeBench, it has room for improvement. One thing that’s become very clear to me is that the reasoning capabilities of these LLMs are significantly closing the gap between open and closed-source models. The competition is now going to be on a different level, focused on which model produces the most efficient, optimized, accurate, and fastest reasoning steps, beyond just accurate responses. That's what developers will care about. Traditional benchmarks are not going to be good enough for this. On that note, it's getting harder to assess these models, especially the consistency, efficiency, and quality of reasoning steps. After experimenting with this model, I realized that the reasoning paths are not fully optimized and there is a lot more optimization that needs to happen before these models are used in production settings. There might be a need to build some type of native and efficient self-assessment or self-reflection capability that prevents these reasoning LLMs from going in loops or producing unnecessarily lengthy sequences. I also noticed that this model, at least from the HF demo, doesn’t separate the reasoning from the response. I think that actually hurts the performance of the model. On the other hand, o1 and R1 do that really well. In addition to that, I believe the training on reasoning is hurting the performance of the LLM in other areas such as helpfulness (check the code example in the video). Something that’s necessary at the moment is validating or evaluating the quality of the reasoning chains and figuring out a better strategy to optimize them. Current methods are probably not sufficient to solve this problem, but that's where innovation will come next. I recognize that this is a first effort, so kudos to the Qwen team on this release. These issues highlight the importance of transparency with reasoning LLMs. We need to know how it was trained and with what exact data or optimization strategy. Understanding that will enable researchers and developers to build better intuition and improve the reasoning capabilities and components at a faster rate. There is an opportunity for someone or a company to build a truly open-reasoning LLM. The race is on! I will continue to track the state-of-the-art in reasoning LLMs and report my takes and observations here. Stay tuned for more.

elvis

14,740 views • 1 year ago

Open sourcing Dynamic Graph Memory by mem0. Memory is fundamental to human reasoning, shaping how we approach tasks and make decisions. At Mem0, we believe that AI agents & apps should reflect this principle. Our Dynamic Graph Memory emulates human memory, advancing AI agents toward more intelligent, human-like reasoning. This is a significant step forward in building AI that truly understands and interacts with the world like we do. All credit to Dev Khant Deshraj Yadav Prateek Chhikara for their countless nights spent on bringing this to life. Link:

Taranjeet

51,129 views • 1 year ago

This is a classic articulation of the "logos/rhema" distinction. Logos = written word of God / logic / reasoning Rhema = spoken word of God / "now word" / "revelation" (the "word of the Lord came" to a prophet) The only problem is that this distinction is false. 👇⬇️🧵 1/

David Fish

38,620 views • 2 months ago

We introduce HumanOmniV2, an omni-modal model designed to address two core problems in multimodal reasoning: insufficient global context understanding and the shortcut problem. By analyzing visual, auditory, and textual signals, the model performs deep reasoning on complex human intentions, emotions, and social interactions.

Tongyi Lab

1,221,672 views • 11 months ago

You simply can't argue with sound logic 🧠 Here, Amy explains her scientific reasoning for deciding to become "hot" by any means necessary.

Maggie Mae Fish 🌈

11,797 views • 10 months ago

🆕Scaling Test Time Compute to Multi-Agent Civilizations, with Noam Brown We're excited to publish our full conversation with Noam Brown on the frontiers of the new reasoning paradigm at OpenAI! - first principles for starting the "Multi-Agents" team - what's not captured by the "System 1/System 2" analogy for inference time compute - how Ilya Sutskever convinced him that reasoning was closer than he thought - Deep Research is existence proof that RL generalizes beyond verifiable rewards - the relationship between AI for imperfect information games (like Poker, Stratego, Diplomacy) and reasoning Enjoy! on youtube, or wherever fine podcasts are sold.

Latent.Space

105,905 views • 11 months ago

🕹️We are excited to introduce "ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation" ChronoEdit reframes image editing as a video generation task to encourage temporal consistency. It leverages a temporal reasoning stage that denoises with “video reasoning tokens” to "reason" on physically plausible edits. See the attached video for results. Project Page: Arxiv: Code and model are coming.

Huan Ling

36,841 views • 8 months ago

Turn any AI Model into Reasoning Model with Deepseek r1 <thinking> Architecture. Models like GPT4o and Sonnet 3.5 are Implementation Models But a new breakthrough with Deepseek can make them a Reasoning model. Here's a step-by-step Explanation: 🧵

CJ Zafir

192,262 views • 1 year ago