Why HTML turned out to be the foundation for agentic video making from Bin Liu: “We’ve been trying to build a video agent. However, we learned the hard way that agents have no visual intelligence. So that’s when we turned to code. HTML is the LLM’s native language. LLMs...

16,653 views • 11 days ago • via X (Twitter)

Similar Videos

Introducing /visual-plan - a skill to generate rich, visual plans for Claude Code and Codex.

Plan mode in Claude Code is incredible. But I always find my eyes glazing over when it gives me this huge markdown essay in my terminal. I found I can make much better visual plans with reusable components.

So I made a skill called `/visual-plan`. It generates plans as MDX with visual, interactive components. Diagrams, interactive API specs, schema design changes, annotated code, and even pan and zoomable wireframes. So for any UI work, you can look at a wireframe first, comment on it, iterate, and then have the agent work.

I’ve found this to be a much more intuitive interface for reasoning about what the agent is doing. It’s somewhat inspired by that popular post about how HTML is better than Markdown. But HTML can be slow and verbose to write. And it doesn’t look good checked into a repo.

This has really made me feel like humans and engineering are entering a new abstraction phase, where we reason about things at the plan level. As long as the plan is good, agents are getting more and more reliable at executing on it. Almost to the degree that we trust the C compiler to compile to assembly reliably. Plans are the new intermediate representation.

I also made a skill for the reverse of this, called `/visual-recap`. After the agent works, it gives you a recap of everything it did. Same idea: wireframes, interactive API specs and diffs, schemas, annotated code, etc. So now when you’re reviewing what the agent did for you, or looking at a pull request of somebody else’s code, you can see a visual recap instead of just reading a wall of text.

It’s all free and open source. You can find it on my GitHub. Will link to it in the reply because we all know how dumb these algorithms are with links.

Steve (Builder.io)

121,983 views • 16 days ago