AI coding agents hit a wall when codebases get massive. Even with 2M token context windows, a 10M line codebase needs 100M tokens. The real bottleneck isn't just ingesting code - it's getting models to actually pay attention to all that context effectively.
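A rough sketch of the arithmetic behind those numbers, assuming about 10 tokens per line of code. That ratio is only what the post's 100M figure implies; real ratios vary by language and tokenizer.

```python
import math

LINES_OF_CODE = 10_000_000   # codebase size from the post
TOKENS_PER_LINE = 10         # assumption implied by the post's 100M figure
CONTEXT_WINDOW = 2_000_000   # 2M-token context window from the post

total_tokens = LINES_OF_CODE * TOKENS_PER_LINE
windows_needed = math.ceil(total_tokens / CONTEXT_WINDOW)
fraction_visible = CONTEXT_WINDOW / total_tokens

print(f"total tokens: {total_tokens:,}")
print(f"full windows needed to read everything once: {windows_needed}")
print(f"share of the codebase visible at once: {fraction_visible:.0%}")
```

Under that assumption, a single 2M-token window holds about 2% of the codebase at a time.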

976,161 views • 11 months ago • via X (Twitter)

Comments: 11

Garry Tan • 11 months ago

Full video

The Rundown AI • 1 year ago

If you're not learning AI in 2025, you're falling behind. Join 1,000,000+ early adopters reading and learn AI in just 5 minutes a day (for free).

Züri Bar Yochay • 11 months ago

No coding task needs the entire 10M-line repo in scope. You just need the files you’re touching and their dependency chain, maybe 3-5 layers deep. Big codebases naturally break into mostly independent islands, so a modest context window already covers almost everything that matters.
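One way to picture the "files you're touching plus their dependency chain" idea is a depth-limited breadth-first walk over an import graph. This is a minimal sketch, not a description of how any particular agent works: the graph, the file names, and the `files_in_scope` helper are all hypothetical.

```python
from collections import deque

def files_in_scope(deps, touched, max_depth=3):
    """Return the touched files plus every dependency within max_depth hops."""
    seen = set(touched)
    queue = deque((f, 0) for f in touched)
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow dependencies past the depth limit
        for dep in deps.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append((dep, depth + 1))
    return seen

# Hypothetical import graph: file -> files it depends on
deps = {
    "api/orders.py": ["core/pricing.py", "core/db.py"],
    "core/pricing.py": ["core/tax.py"],
    "core/tax.py": ["core/rates.py"],
    "core/rates.py": ["core/config.py"],
    "ui/dashboard.py": ["ui/widgets.py"],  # a separate "island"
}

scope = files_in_scope(deps, ["api/orders.py"], max_depth=3)
print(sorted(scope))
```

With a depth of 3, the unrelated `ui/` island and the fourth-hop `core/config.py` never enter the context.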

Gavriel Cohen • 11 months ago

Why could you possibly need to have 10M lines of code in context? What kind of task would require simultaneously considering every line of the codebase? I don’t think I can hold more than 30 lines in context when I’m coding but I can work on a project with millions of lines by navigating, selectively focusing and using abstractions.

Yacine Mahdid • 11 months ago

Long context is pretty hard. I did a review last month of where the methods are right now. Long story short, there are two bottlenecks: self-attention, which isn’t easy to linearize without performance degradation, and scarce long-context training data.
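To make the self-attention point concrete: standard (non-linearized) self-attention scores every token against every other token, so the number of pairwise scores grows with the square of the sequence length. A back-of-the-envelope sketch, reusing the 2M and 100M token figures from the post; the `attention_pairs` helper is illustrative only.

```python
def attention_pairs(seq_len: int) -> int:
    """Pairwise token scores in one full self-attention matrix (per head, per layer)."""
    return seq_len * seq_len

window = 2_000_000       # 2M-token context window from the post
codebase = 100_000_000   # 100M tokens for the whole codebase, per the post

token_growth = codebase // window
score_growth = attention_pairs(codebase) // attention_pairs(window)
print(f"{token_growth}x more tokens -> {score_growth:,}x more attention scores")
```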

kohl • 11 months ago

AFAIK Claude Code and Codex don’t use or need indexing, nor do humans. Git history, tests, documentation, environment access, vision and intent. The metadata … the why … is crucial. This is why labs are in the best position to win agentic SWE.

Lachlan Phillips exo/acc 👾 • 11 months ago

Collapsible code? Seems that half of this issue is the linear nature of documents. Code should be able to be represented symbolically natively. You should be able to compress most of your codebase and only expand relevant functions during specific queries.
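A minimal sketch of the collapse-and-expand idea, using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). The `collapse` helper and the sample source are hypothetical, and this is only one possible reading of the comment: every function body is replaced with `...` unless its name is explicitly requested.

```python
import ast

def collapse(source: str, expand: set) -> str:
    """Replace each function body with '...' unless its name is in `expand`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name not in expand:
                # Keep the signature, drop the implementation
                node.body = [ast.Expr(ast.Constant(...))]
    return ast.unparse(tree)

# Hypothetical module with two functions
source = '''
def parse(text):
    tokens = text.split()
    return [t.lower() for t in tokens]

def score(tokens):
    return sum(len(t) for t in tokens)
'''

# Only `score` is relevant to the query, so only it stays expanded
result = collapse(source, expand={"score"})
print(result)
```

Note that `ast.unparse` regenerates the code from the syntax tree, so comments and original formatting are lost. A real tool would more likely fold source ranges in place.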

Dr. Bobby Gomez-Reino • 11 months ago

That is likely not the only way to approach it. Humans don't pay attention to 10M lines of code to solve programming tasks.

geoff • 11 months ago

I have some ideas for how to fix this. They might be a little radical, but it kinda makes sense: if you can constrain the search space, then all of a sudden a big repo isn’t so big. Speaking as someone who researched this space with a repo measured in the hundreds of billions of tokens. I don’t know. To be proven shortly.

Moritz Wallawitsch • 11 months ago

Do human SWEs have a 10M line context window?? Working memory is the real bottleneck.

TheOneCoder • 11 months ago

We don't need larger context. Sure, it helps, but what we need is better tools. I treat my agents as gifted junior/mid-level engineers and assign them work accordingly, and I am getting really good results: small tasks, clear scope. My goal right now is to build an architect agent that orchestrates many different agents into a solution by giving each a small piece of the pie, so that you have multiple experts, each in its own part of the pie. Proper coding practices and modularized code will allow you to scale past 10 million lines!
