正在加载视频...

视频加载失败

加载此视频时出现问题。这可能是由于临时网络问题，或视频可能不可用。

AI coding agents hit a wall when codebases get massive. Even with 2M token context windows, a 10M line codebase needs 100M tokens. The real bottleneck isn't just ingesting code - it's getting models to actually pay attention to all that context effectively.

Garry Tan

963,227 subscribers

976,531 次观看 • 1 年前 •via X (Twitter)

新闻政治科学技术教育

Anya Rossi• Live Now

Private livecam show

11 条评论

Garry Tan 的头像

Garry Tan1 年前

Full video

The Rundown AI 的头像

The Rundown AI1 年前

If you're not learning AI in 2025, you're falling behind. Join 1,000,000+ early adopters reading and learn AI in just 5 minutes a day (for free).

Züri Bar Yochay 的头像

Züri Bar Yochay1 年前

No coding task needs the entire 10 M-line repo in scope. You just need the files you’re touching and their dependency chain, maybe 3-5 layers deep. Big codebases naturally break into mostly independent islands, so a modest context window already covers almost everything that matters.

Gavriel Cohen 的头像

Gavriel Cohen1 年前

Why could you possibly need to have 10M lines of code in context? What kind of task would require simultaneously considering every line of the codebase? I don’t think I can hold more than 30 lines in context when I’m coding but I can work on a project with millions of lines by navigating, selectively focusing and using abstractions.

Yacine Mahdid 的头像

Yacine Mahdid1 年前

Long context is pretty hard, did a review of where the methods are right now last month: Long story short the bottleneck is self-attention which isn’t easy to linearize without performance degradation and scarce long context training data.

kohl 的头像

kohl1 年前

AFAIK Claude Code and Codex don’t use or need indexing, nor do humans. Git history, tests, documentation, environment access, vision and intent. The meta data … the why … is crucial. this is why labs are in best position to win agentic swe

Lachlan Phillips exo/acc 👾 的头像

Lachlan Phillips exo/acc 👾1 年前

Collapsible code? Seems that half of this issue is the linear nature of documents. Code should be able to be represented symbolically natively. You should be able to compress most of your codebase and only expand relevant functions during specific queries.

Dr. Bobby Gomez-Reino 的头像

Dr. Bobby Gomez-Reino1 年前

that is likely not the only way to approach it. humans don't keep attention to 10M lines of code to solve programming tasks.

geoff 的头像

geoff1 年前

Have some ideas how to fix this, might be a little radical but it kinda makes sense if you can constrain the search space then all of a sudden a big repo isn’t so big. Speaking as someone who researched this space with a repo measured in the hundreds of billions of tokens. I don’t know. To be proven shortly.

Moritz Wallawitsch 的头像

Moritz Wallawitsch1 年前

Do human SWEs have a 10M line context window?? Working memory is the real bottleneck.

TheOneCoder 的头像

TheOneCoder1 年前

We don't need larger context, sure it helps! But what we need is better tools. I treat my agents, as Gifted Junior/Mid level engineers, thus I designate them to work like that, I am getting really good results! Small tasks, clear scope My goal right now is to build an architect agent that orchestrates many different agents into a solution, by given each a small piece of the pie, with a goal you have multi experts in that part of pie. Proper coding practices, and modularized code will allows you to scale past 10 million lines!

相关视频

Problem: AI coding performance dips when context windows exceed 50% Solution: Combine Cline's context window awareness with the `new_task` tool + .clinerules to create a workflow that autonomously hands off tasks before hitting limits, ensuring persistent memory. A guide: 🧵

Problem: AI coding performance dips when context windows exceed 50% Solution: Combine Cline's context window awareness with the `new_task` tool + .clinerules to create a workflow that autonomously hands off tasks before hitting limits, ensuring persistent memory. A guide: 🧵

Cline

62,196 次观看 • 1 年前

NVIDIA just dropped a 2.5-hour course on how to build coding agents, featuring the CTO of Cursor: 00:07 – self-driving codebases with async agents 37:50 – agentic AI, from the ground up 1:16:35 – context engineering for high-signal AI code reviews 1:53:26 – teach AI to code in any language Worth more than any $500 enterprise-AI course. Bookmark this & steal my harness, below.

NVIDIA just dropped a 2.5-hour course on how to build coding agents, featuring the CTO of Cursor: 00:07 – self-driving codebases with async agents 37:50 – agentic AI, from the ground up 1:16:35 – context engineering for high-signal AI code reviews 1:53:26 – teach AI to code in any language Worth more than any $500 enterprise-AI course. Bookmark this & steal my harness, below.

Phosphen

11,226 次观看 • 11 天前

$Microsoft just dropped a 17-page paper - "Less Context, Better Agents" - proving the thing nobody building agents wants to admit: Your agent isn't failing because it needs more context. It's drowning in the context it already has. They ran GPT-5 four ways on one 50-task benchmark (via MCP): Full history → bloated, pricey, more errors Trim to the last 5 tool calls → 79% done Add light summarization → 91.6%, on a fraction of the tokens Less context. Fewer tokens. Higher completion. Everyone's racing to cram MORE into the window. Microsoft just published the receipt that the opposite wins for long-running agents. Rewired how I build agents this week. Paper ↓$

Microsoft just dropped a 17-page paper - "Less Context, Better Agents" - proving the thing nobody building agents wants to admit: Your agent isn't failing because it needs more context. It's drowning in the context it already has. They ran GPT-5 four ways on one 50-task benchmark (via MCP): Full history → bloated, pricey, more errors Trim to the last 5 tool calls → 79% done Add light summarization → 91.6%, on a fraction of the tokens Less context. Fewer tokens. Higher completion. Everyone's racing to cram MORE into the window. Microsoft just published the receipt that the opposite wins for long-running agents. Rewired how I build agents this week. Paper ↓

Archive

18,242 次观看 • 27 天前

🚨 Big news for AI innovation: Claude Opus 4 and Claude Sonnet 4, Anthropic's most advanced models, are now available in Amazon Bedrock. These powerful models offer hybrid reasoning, 200K token context windows, and are designed for AI agents. From financial analysis to high-quality writing, to enhanced reasoning, coding, agentic capabilities and more—all with the enterprise-grade security of Amazon Web Services.

🚨 Big news for AI innovation: Claude Opus 4 and Claude Sonnet 4, Anthropic's most advanced models, are now available in Amazon Bedrock. These powerful models offer hybrid reasoning, 200K token context windows, and are designed for AI agents. From financial analysis to high-quality writing, to enhanced reasoning, coding, agentic capabilities and more—all with the enterprise-grade security of Amazon Web Services.

Amazon

123,030 次观看 • 1 年前

This is actually useful. LangChain just released OpenWiki. It's an open-source agent that creates a wiki for your codebase, connects it to your coding agent, and keeps it updated as your repo changes. Your AI coding agent gets long-term repo context without stuffing everything into CLAUDE.md. Here's how to setup. Save this.

This is actually useful. LangChain just released OpenWiki. It's an open-source agent that creates a wiki for your codebase, connects it to your coding agent, and keeps it updated as your repo changes. Your AI coding agent gets long-term repo context without stuffing everything into CLAUDE.md. Here's how to setup. Save this.

Min Choi

124,721 次观看 • 18 天前

Prime Intellect engineer: "everyone's bragging about a million-token context. here's what they don't tell you. at 256k tokens GPT-5.5 scores 80% on retrieval. push it to a million and it drops to 36%. the model accepts the context, it just can't reason across it. people call it context rot." in a 20-minute talk he explains why bigger context windows won't save your agents. continual learning + training on your own traces + real environments - that's the fix. Watch the talk, then save!

Prime Intellect engineer: "everyone's bragging about a million-token context. here's what they don't tell you. at 256k tokens GPT-5.5 scores 80% on retrieval. push it to a million and it drops to 36%. the model accepts the context, it just can't reason across it. people call it context rot." in a 20-minute talk he explains why bigger context windows won't save your agents. continual learning + training on your own traces + real environments - that's the fix. Watch the talk, then save!

Carnage

741,040 次观看 • 10 天前

The secret to "vibe coding"? It's not just the vibes. To get GitHub Copilot to build what you actually want, you need a product requirement doc to provide context. Here's the full breakdown. ▶️

The secret to "vibe coding"? It's not just the vibes. To get GitHub Copilot to build what you actually want, you need a product requirement doc to provide context. Here's the full breakdown. ▶️

GitHub

41,145 次观看 • 8 个月前

CHINA JUST DROPPED AN AI CODING MODEL WITH A 1M CONTEXT WINDOW. And I connected it to Claude Code to see what it could actually do. Meet GLM-X Preview On paper, a few things immediately stood out: → 1M context window → Agentic coding capabilities → Works inside Claude Code → Designed for large-scale coding and reasoning workflows But specs don't matter much if the model can't deliver in practice. So I gave it a real-world task. THE TEST One prompt: > Build a modern AI lead generation dashboard using React and Tailwind CSS. Requirements: → Dark mode → Analytics dashboard → Lead table → Email outreach section → Responsive design → Production-ready component structure Instead of generating a few snippets, it planned the architecture, generated the dashboard components, created the Tailwind configuration, and walked through the implementation requirements. What impressed me most wasn't the code itself. It was how well it maintained context throughout the workflow. That's where a 1M context window starts becoming useful. Less time re-explaining requirements. Less context loss. More room for complex projects. The AI coding race is getting very interesting. And it's no longer just GPT, Claude, and Gemini competing for attention. Results from my test below 👇

CHINA JUST DROPPED AN AI CODING MODEL WITH A 1M CONTEXT WINDOW. And I connected it to Claude Code to see what it could actually do. Meet GLM-X Preview On paper, a few things immediately stood out: → 1M context window → Agentic coding capabilities → Works inside Claude Code → Designed for large-scale coding and reasoning workflows But specs don't matter much if the model can't deliver in practice. So I gave it a real-world task. THE TEST One prompt: > Build a modern AI lead generation dashboard using React and Tailwind CSS. Requirements: → Dark mode → Analytics dashboard → Lead table → Email outreach section → Responsive design → Production-ready component structure Instead of generating a few snippets, it planned the architecture, generated the dashboard components, created the Tailwind configuration, and walked through the implementation requirements. What impressed me most wasn't the code itself. It was how well it maintained context throughout the workflow. That's where a 1M context window starts becoming useful. Less time re-explaining requirements. Less context loss. More room for complex projects. The AI coding race is getting very interesting. And it's no longer just GPT, Claude, and Gemini competing for attention. Results from my test below 👇

Md Riyazuddin

31,199 次观看 • 1 个月前

In this episode, Beyang and Thorsten discuss strategies for effective agentic coding, including the 101 of how it's different from coding with chat LLMs, the key constraint of the context window, how and where subagents can help, and the new oracle subagent which combines multiple LLMs. 00:38 Intros 03:20 How coding with agents is very different from coding with prior AI tools 10:31 Example: fix a simple issue 14:13 Example: debugging an issue with an MCP server 21:50 Example: unifying two build scripts 25:09 How the context window is a key constraint 31:01 Why it's best to focus on one thing at a time 33:09 Subagents and context windows 33:49 The codebase search subagent 38:33 General-purpose subagents 44:05 When to use subagents 46:44 The oracle subagent and o3 51:32 Multi-model agents

In this episode, Beyang and Thorsten discuss strategies for effective agentic coding, including the 101 of how it's different from coding with chat LLMs, the key constraint of the context window, how and where subagents can help, and the new oracle subagent which combines multiple LLMs. 00:38 Intros 03:20 How coding with agents is very different from coding with prior AI tools 10:31 Example: fix a simple issue 14:13 Example: debugging an issue with an MCP server 21:50 Example: unifying two build scripts 25:09 How the context window is a key constraint 31:01 Why it's best to focus on one thing at a time 33:09 Subagents and context windows 33:49 The codebase search subagent 38:33 General-purpose subagents 44:05 When to use subagents 46:44 The oracle subagent and o3 51:32 Multi-model agents

Amp — Research Preview

24,534 次观看 • 1 年前

all AI coding agents hallucinate when it comes to APIs here's a solution for this: Context Hub by Andrew Ng, completely open-source i tested it myself, watch the video to see the live demo

all AI coding agents hallucinate when it comes to APIs here's a solution for this: Context Hub by Andrew Ng, completely open-source i tested it myself, watch the video to see the live demo

Nidhi Singh

32,801 次观看 • 4 个月前

Haiku 4.5 is a workhorse that makes the coding experience in Claude Code feel really fast. While Sonnet 4.5 remains the default, Haiku 4.5 now powers the Explore subagent which can rapidly gather context on your codebase to build apps even faster.

Haiku 4.5 is a workhorse that makes the coding experience in Claude Code feel really fast. While Sonnet 4.5 remains the default, Haiku 4.5 now powers the Explore subagent which can rapidly gather context on your codebase to build apps even faster.

cat

205,427 次观看 • 9 个月前

Coding agents suck at using a browser. Playwright MCP burns through your context window before you even send your first prompt. That is why I built Dev Browser, a Claude Skill to let your agent close the loop without eating up tokens.

Coding agents suck at using a browser. Playwright MCP burns through your context window before you even send your first prompt. That is why I built Dev Browser, a Claude Skill to let your agent close the loop without eating up tokens.

Sawyer Hood

312,803 次观看 • 7 个月前

For a long time, software was limited by how fast people could write code, and how good that code was. As models have improved, that constraint has largely disappeared. Now the bottleneck is access: what surfaces can your agents actually reach? Those interaction layers sit on top of coding agents, the kernel that turns prompts into real-world impact. With Notion's custom agents, we pushed this further. They adapt to your work style as you collaborate with them, using the same deterministic logic that powers coding agents. Simon Last sarah guo

For a long time, software was limited by how fast people could write code, and how good that code was. As models have improved, that constraint has largely disappeared. Now the bottleneck is access: what surfaces can your agents actually reach? Those interaction layers sit on top of coding agents, the kernel that turns prompts into real-world impact. With Notion's custom agents, we pushed this further. They adapt to your work style as you collaborate with them, using the same deterministic logic that powers coding agents. Simon Last sarah guo

Notion Developers

25,065 次观看 • 3 个月前

🚨 AI coding agents hallucinate because they can't actually read your codebase. This MCP server fixes that. It's called Context+ and it gives AI 99% accuracy on large-scale engineering projects by building a real semantic map of your code before touching a single line. Here's what makes it different from every other MCP tool: → Tree-sitter AST parsing across 43 file extensions. Not grep. Not regex. Actual syntax trees. → Spectral Clustering that groups semantically related files into labeled clusters. Your AI finally understands what belongs together. → Obsidian-style wikilinks that map features to code files. Navigate entire codebases like a knowledge graph. → Blast radius tracing. Before any change, it shows every file and line where a symbol is imported or used. No more orphaned references. → Shadow restore points. Every AI-proposed commit creates a restore snapshot. One command to undo any change without touching git history. → Semantic search by meaning. Ask what something does. Not what it's called. The `propose_commit` tool is the wild part. It validates changes against strict rules, creates a shadow restore point, and only then writes to disk. AI can't just freestyle your production code. Works with Claude Code, Cursor, VS Code, and Windsurf. One line to install with bunx or npx. This is what responsible AI coding infrastructure actually looks like. 100% Opensource. Link in comments.

🚨 AI coding agents hallucinate because they can't actually read your codebase. This MCP server fixes that. It's called Context+ and it gives AI 99% accuracy on large-scale engineering projects by building a real semantic map of your code before touching a single line. Here's what makes it different from every other MCP tool: → Tree-sitter AST parsing across 43 file extensions. Not grep. Not regex. Actual syntax trees. → Spectral Clustering that groups semantically related files into labeled clusters. Your AI finally understands what belongs together. → Obsidian-style wikilinks that map features to code files. Navigate entire codebases like a knowledge graph. → Blast radius tracing. Before any change, it shows every file and line where a symbol is imported or used. No more orphaned references. → Shadow restore points. Every AI-proposed commit creates a restore snapshot. One command to undo any change without touching git history. → Semantic search by meaning. Ask what something does. Not what it's called. The `propose_commit` tool is the wild part. It validates changes against strict rules, creates a shadow restore point, and only then writes to disk. AI can't just freestyle your production code. Works with Claude Code, Cursor, VS Code, and Windsurf. One line to install with bunx or npx. This is what responsible AI coding infrastructure actually looks like. 100% Opensource. Link in comments.

Ihtesham Ali

31,051 次观看 • 3 个月前

AI coding agents aren't just about autocorrect, but how do you get the best coding experience with an AI-powered coding agent? In our new short course, Build Apps with Windsurf’s AI Coding Agents, you'll learn how to build, debug, and deploy applications with agentic AI-powered integrated development environment (IDE). AI coding agents, like Codeium's @windsurf, don’t just suggest code, they analyze your codebase, track changes, retrieve relevant information, and apply updates across multiple files. They can help debug, refactor, and even modernize legacy frameworks. But to use them effectively, you need the right approach. This new course shows you how to: 🛠️ Use AI agents to build and refine applications, like a Wikipedia analysis app. 🐞 Debug and refactor JavaScript with AI-assisted automation. 🔍 Understand how search and retrieval power AI coding agents. 🤖 Guide an AI agent effectively—prompting, iterating, and correcting when needed. Taught by Anshul Ramachandran (Anshul Ramachandran), this course gives you hands-on coding experience, insights into how these AI systems work under the hood, and best practices to improve your development workflow. 🔗 Enroll for free:

AI coding agents aren't just about autocorrect, but how do you get the best coding experience with an AI-powered coding agent? In our new short course, Build Apps with Windsurf’s AI Coding Agents, you'll learn how to build, debug, and deploy applications with agentic AI-powered integrated development environment (IDE). AI coding agents, like Codeium's @windsurf, don’t just suggest code, they analyze your codebase, track changes, retrieve relevant information, and apply updates across multiple files. They can help debug, refactor, and even modernize legacy frameworks. But to use them effectively, you need the right approach. This new course shows you how to: 🛠️ Use AI agents to build and refine applications, like a Wikipedia analysis app. 🐞 Debug and refactor JavaScript with AI-assisted automation. 🔍 Understand how search and retrieval power AI coding agents. 🤖 Guide an AI agent effectively—prompting, iterating, and correcting when needed. Taught by Anshul Ramachandran (Anshul Ramachandran), this course gives you hands-on coding experience, insights into how these AI systems work under the hood, and best practices to improve your development workflow. 🔗 Enroll for free:

DeepLearning.AI

23,255 次观看 • 1 年前

Two AI models just became a cheat code. Claude 4.6 thinks. Minimax M2.5 executes. That’s the split. Claude: • 1M token context window • Reads entire codebases • Builds full strategies • Reviews complex work Minimax: • 20x cheaper than flagship models • 80%+ on real coding benchmarks • 37% faster than older versions • Built for agent workflows Here’s the play: 1. Claude plans the system. 2. Minimax builds it. 3. Claude reviews it. 4. Minimax refines it. Strategy → Execution → Review → Scale. That loop is unfair.

Two AI models just became a cheat code. Claude 4.6 thinks. Minimax M2.5 executes. That’s the split. Claude: • 1M token context window • Reads entire codebases • Builds full strategies • Reviews complex work Minimax: • 20x cheaper than flagship models • 80%+ on real coding benchmarks • 37% faster than older versions • Built for agent workflows Here’s the play: 1. Claude plans the system. 2. Minimax builds it. 3. Claude reviews it. 4. Minimax refines it. Strategy → Execution → Review → Scale. That loop is unfair.

Julian Goldie SEO

17,034 次观看 • 5 个月前

Andrej Karpathy explains why human-AI collaboration often fails: we've got the workflow backwards and the bottleneck wrong. He points out that when working with AI, there's a clear pattern. The AI generates solutions quickly while humans verify the output. The goal is making this loop as fast as possible to get real work done. The first way to speed this up is through better verification tools. GUIs are crucial because they tap into our brain's visual processing power. Reading through walls of text is slow and painful, but visual interfaces create a highway to understanding. The second approach is keeping AI constrained. Karpathy warns that people are getting too excited about autonomous agents. An AI that instantly generates 10,000 lines of code isn't helpful when a human still needs hours to verify it's bug-free and secure. The fundamental problem is that even instant AI generation becomes useless if humans can't verify fast enough. The human becomes the bottleneck, having to check for bugs, correct implementation, and security issues in massive outputs. His solution is simple: constrain AI output to maintain manageable verification loops. Don't let AI run wild with massive changes. Keep it on a leash so humans can actually review and approve the work effectively.

Andrej Karpathy explains why human-AI collaboration often fails: we've got the workflow backwards and the bottleneck wrong. He points out that when working with AI, there's a clear pattern. The AI generates solutions quickly while humans verify the output. The goal is making this loop as fast as possible to get real work done. The first way to speed this up is through better verification tools. GUIs are crucial because they tap into our brain's visual processing power. Reading through walls of text is slow and painful, but visual interfaces create a highway to understanding. The second approach is keeping AI constrained. Karpathy warns that people are getting too excited about autonomous agents. An AI that instantly generates 10,000 lines of code isn't helpful when a human still needs hours to verify it's bug-free and secure. The fundamental problem is that even instant AI generation becomes useless if humans can't verify fast enough. The human becomes the bottleneck, having to check for bugs, correct implementation, and security issues in massive outputs. His solution is simple: constrain AI output to maintain manageable verification loops. Don't let AI run wild with massive changes. Keep it on a leash so humans can actually review and approve the work effectively.

Aish

111,470 次观看 • 1 年前

Imagine if coding agents never forgot decisions, never duplicated work, and could coordinate across the same codebase in real time. We’re advancing how agents collaborate on codebases with OriginTrail DKG V9 - turning a GitHub repo and agent activity into a shared knowledge graph for persistent memory and coordination. The result: faster agentic development, less repeated work, lower token usage, and agents that can actually build on each other’s progress. Here’s how it works 👇

Imagine if coding agents never forgot decisions, never duplicated work, and could coordinate across the same codebase in real time. We’re advancing how agents collaborate on codebases with OriginTrail DKG V9 - turning a GitHub repo and agent activity into a shared knowledge graph for persistent memory and coordination. The result: faster agentic development, less repeated work, lower token usage, and agents that can actually build on each other’s progress. Here’s how it works 👇

Jurij Skornik

11,802 次观看 • 3 个月前