Matt Maher tested frontier models in Cursor vs. other harnesses. Cursor boosted model performance by 11% on average:

Gemini: 52% → 57%
GPT-5.4: 82% → 88%
Opus: 77% → 93%

His benchmark measures how well models implement a 100-feature PRD. Cursor consistently outperformed the other harnesses.
911,490 views • 3 months ago • via X (Twitter)


