Matt Maher tested frontier models in Cursor v. other harnesses. Cursor boosted model performance by 11% on average:

Gemini: 52% → 57%
GPT-5.4: 82% → 88%
Opus: 77% → 93%

His benchmark measures how well models implement a 100-feature PRD. Cursor consistently outperformed.

911,490 views • 3 months ago • via X (Twitter)
