
Rob Haisfield
@RobertHaisfield • 9,608 subscribers
cofounder of @websim_ai, imagining new internets with our users. GenAI, TfT, BeSci, HCI, UX. Ex-Tana, Edge & Node, Spark Wave

Are AI agents shape rotators? In this new benchmark, we let the models play campaign puzzles in Opus Magnum, a puzzle game by Zachtronics. Ironically, Claude Opus 4.8 performed poorly, being beaten by GPT-5.5, Gemini 3.5 Flash, and GLM 5.2. Claude Fable 5 crushed them all.
Rob Haisfield • 409,448 views • 2 days ago

I cut my Rune Mysteries Quest script's run time by 75%, to 1530 ticks (15 min of normal game time, down from 58 min). The loop: write a script checkpoint, run it, note learnings and ideas in a shared log file, repeat. Each loop took five minutes, and I let it rip overnight with a Claude team.
Rob Haisfield • 85,854 views • 4 months ago
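
A minimal sketch of the checkpoint loop described above, in Python. The post does not show the real tooling, so everything here is an assumption for illustration: `run_script` is a hypothetical callable that runs one script checkpoint and returns its tick count plus a free-text note, and the log line format is invented.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def optimize_loop(
    run_script: Callable[[int], Tuple[int, str]],
    log_path: Path,
    iterations: int,
) -> Tuple[Optional[int], Optional[int]]:
    """Run checkpoints in sequence, log each result, and keep the fastest.

    run_script(checkpoint_id) is a hypothetical stand-in for "write a
    checkpoint and run it"; it returns (ticks, note).
    """
    best_checkpoint: Optional[int] = None
    best_ticks: Optional[int] = None
    for checkpoint_id in range(iterations):
        ticks, note = run_script(checkpoint_id)
        # Append to the shared log so later runs (or other agents) can read it.
        with log_path.open("a", encoding="utf-8") as log:
            log.write(f"checkpoint={checkpoint_id} ticks={ticks} note={note}\n")
        # Lower tick count means a faster quest completion.
        if best_ticks is None or ticks < best_ticks:
            best_checkpoint, best_ticks = checkpoint_id, ticks
    return best_checkpoint, best_ticks
```

Appending to a single log file is one simple way to share findings between runs; with several agents writing at once, some form of locking or per-agent files would likely be needed.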

o1 became obsolete in websim the moment o3-mini-high came out. It's faster than o1, and often a more powerful coder than Claude 3.5 Sonnet. Consistently high-quality outputs, at less than a seventh of o1's cost. With o3-mini-high, I was able to make a 3D falling sand sim on the surface of a globe. Sandglobe is reasonably performant and works on desktop and mobile. It features multiple elements that interact with each other, like lava and sand becoming glass, and water flowing to fill space.
Rob Haisfield • 24,444 views • 1 year ago
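
To illustrate the kind of element interactions mentioned above, here is a heavily simplified sketch in Python. It is not Sandglobe's code: it assumes a flat 2D grid rather than the surface of a globe, and the cell symbols, the `step` function, and the rule that water tries left before right are all invented for the example.

```python
EMPTY, SAND, WATER, LAVA, GLASS = ".", "s", "w", "l", "g"


def step(grid):
    """Advance a 2D grid by one tick and return the new grid."""
    h, w = len(grid), len(grid[0])
    new = [row[:] for row in grid]

    # Interaction pass: sand touching lava (up, down, left, right) becomes glass.
    for y in range(h):
        for x in range(w):
            if grid[y][x] != SAND:
                continue
            neighbors = ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1))
            if any(
                0 <= ny < h and 0 <= nx < w and grid[ny][nx] == LAVA
                for ny, nx in neighbors
            ):
                new[y][x] = GLASS

    # Movement pass: scan bottom-up and track moved cells so that each
    # particle moves at most once per tick.
    moved = set()
    for y in range(h - 1, -1, -1):
        for x in range(w):
            cell = new[y][x]
            if (y, x) in moved or cell not in (SAND, WATER):
                continue
            if y + 1 < h and new[y + 1][x] == EMPTY:
                # Sand and water both fall straight down into empty space.
                new[y + 1][x], new[y][x] = cell, EMPTY
                moved.add((y + 1, x))
            elif cell == WATER:
                # Blocked water spreads sideways (left first, an arbitrary choice).
                for nx in (x - 1, x + 1):
                    if 0 <= nx < w and new[y][nx] == EMPTY:
                        new[y][nx], new[y][x] = WATER, EMPTY
                        moved.add((y, nx))
                        break
    return new
```

Calling `step` repeatedly on a list-of-lists grid advances the simulation one tick at a time; a globe version would also need a mapping from grid cells to the sphere's surface, which this sketch leaves out.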