I recorded Claude fixing two MiniJinja issues by itself. It's Rust (which it sucks at), in a codebase not really set up for vibe coding, and it still succeeds. Yes, it takes 30 minutes, but it's hands-off. Usually I let this work alone and review later.

39,262 views • 1 year ago • via X (Twitter)

Comments: 10

Armin Ronacher ⇌ • 1 year ago

I also have the video on YouTube:

Carlos DP • 1 year ago

Honestly, I’m kinda ready to say they don’t actually suck at Rust anymore haha. I’ve given them way too many problems I was sure they’d fail at and they solved it without issue. Now that they iterate with diagnostics feedback, Rust is actually ideal imo

Armin Ronacher ⇌ • 1 year ago

Well you can see that it struggles with tool usage and the borrow checker. Clearly skill issue ;)

ʈim anton willem • 1 year ago

non-stop code review fuuuck me😄🔫

Ray Quant • 1 year ago

claude's rust win shows automated fixes work, but for markets i still need proof before trusting moves.

Andrei Maxim • 1 year ago

This is very much my workflow! I think it works very well when you are familiar with the project and can zoom through the diffs, but I'm still having a hard time relying on Claude to write new features, because it's harder for me to fully wrap my mind around the code change. I wonder if your experience with open-source projects with multiple contributors helps. I also think that discussing the bug in detail is a very useful pattern for AI-based work, because it helps the human understand what the AI should generate. I'd argue that support for different languages and tooling will only get better over time, and the quality of the code will only improve. However, it also seems that having an easy way to run a single test and the entire test suite is becoming more and more important. The same goes for using the right abstractions. I do wonder if the human programmer's role will shift towards creating the correct context for the AI agent.

Armin Ronacher ⇌ • 1 year ago

New features on new projects work well if you structure them for agentic workflows, but it requires quite a bit of attention. I'm sure this will get better soon, but today there are some things you can do to improve the success of such things. I went super deep on Claude recently and found a lot of success on new projects. After talking to @zeeg I also want to see how well Cursor's background agents work. That might actually work better for existing projects right now because it is quite good at following rules.

Ofek Lev • 1 year ago

What was the cost of 30 minutes?

Armin Ronacher ⇌ • 1 year ago

I pay a flat rate.

Michal Srb • 1 year ago

All AIs suck at Rust. I don’t think it’s the AIs’ fault.
