Anthropic CPO Mike Krieger: Dario Amodei predicted the coding benchmark (SWE-bench) would reach 90% by the end of the year. I’ve started taking AI timelines more seriously after seeing the progress. "mid-2025 now feels much closer than 2027"
46,067 views • 1 year ago • via X (Twitter)
9 comments

It will only reach that for people that don’t write threatening software. Currently my Claude Code might as well be my grandmother, it’s soooo bad. @anthropic writes its caches into the var directory on Mac and Linux - horrible practice

🤖 What to Expect in Cybersecurity in 2025: From AI-driven threats to Zero Trust adoption, the landscape is evolving fast. Are you ready? Stay prepared with CYBERSECURITY DICTIONARY For Everyone, on Amazon: 🛒

bigger the watermark ser

interesting

My hunch is, even after the frontier models totally saturate those tests (solid 100% across the board), they will still suck in many important fields. The tests should be research-level problems; if they can crack decades-old unsolved science problems, then they would be on another level

50t tokens of premium data, 50t tokens of deduplicated premium synthetic data will push beyond this, then interactive fine tuning & optimizations. it's already possible to hit that mark & go further

We have a *tendency* to *notice* when we have actively been involved. Knowledge,plans slipping out of our control seems *exciting* *Smiling fun* (AI roller coaster wasn’t designed funny… designed to fix the rides… and in doing so… change, riders management owners all of it

AI progress is faster than lightning! By achieving the 90% benchmark sooner, it’s igniting thrilling possibilities for fintech’s future. Can’t wait to see its impact!

Why do American corporates always look like AI avatars?

