Matt Maher tested frontier models in Cursor vs. other harnesses. Cursor boosted model performance by 11% on average:

Gemini: 52% → 57%
GPT-5.4: 82% → 88%
Opus: 77% → 93%

His benchmark measures how well models implement a 100-feature PRD. Cursor consistently outperformed the other harnesses.
911,490 views • 3 months ago • via X (Twitter)


