Matt Maher tested frontier models in Cursor vs. other harnesses. Cursor boosted model performance by 11% on average:

- Gemini: 52% → 57%
- GPT-5.4: 82% → 88%
- Opus: 77% → 93%

His benchmark measures how well models implement a 100-feature PRD. Cursor consistently outperformed.
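For anyone who wants to check the arithmetic, here is a minimal sketch that works only from the three score pairs quoted in the post. It shows two common ways to average an improvement: percentage points and relative gain. The variable and function names are illustrative and do not come from the benchmark. Neither average comes out at exactly the quoted 11%, which may reflect models or runs not listed in the post.

```python
# Scores as quoted in the post: (other harnesses, Cursor), in percent.
# Only these three pairs are given; the full benchmark may include more.
scores = {
    "Gemini": (52, 57),
    "GPT-5.4": (82, 88),
    "Opus": (77, 93),
}


def point_gain(before: float, after: float) -> float:
    """Absolute improvement in percentage points."""
    return after - before


def relative_gain(before: float, after: float) -> float:
    """Improvement relative to the starting score, in percent."""
    return (after - before) / before * 100


# Unweighted means over the three listed models.
mean_point_gain = sum(point_gain(b, a) for b, a in scores.values()) / len(scores)
mean_relative_gain = sum(relative_gain(b, a) for b, a in scores.values()) / len(scores)

for name, (before, after) in scores.items():
    print(f"{name}: +{point_gain(before, after)} points, "
          f"+{relative_gain(before, after):.1f}% relative")

print(f"Mean gain: {mean_point_gain:.1f} points, {mean_relative_gain:.1f}% relative")
```

On the three listed pairs, the mean gain is 9.0 percentage points, or about 12.6% relative.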

edwin

30,628 subscribers

911,490 views • 3 months ago • via X (Twitter)

Science & Technology

Related Videos

GPT 4.5 in Cursor! We've found it surprisingly effective in cases where all other models fail.

Cursor

584,823 views • 1 year ago

You can now run three frontier models at once and select your orchestrator model directly inside Perplexity Computer. Model Council automatically runs GPT-5.4, Claude Opus 4.6 and Gemini 3.1 Pro simultaneously. Three frontier models. One workflow. Best answer wins.

Computer

84,295 views • 3 months ago

I’ve been using GLM 5.2 in Cursor with MagicPath and really like it so far.

Cursor Settings → Models:
- Add your Fireworks key under “OpenAI API Key” and enable it
- Base URL:
- Model: accounts/fireworks/models/glm-5p2

Restart Cursor. Done.

Pietro Schirano

36,538 views • 11 days ago

You can now delegate tasks to Cursor directly from Notion. It's built on the Cursor SDK, so every cloud agent runs on the same models, harness, and runtime that power Cursor. @Cursor on any spec or assign it a task to open a PR your whole team can review.

Cursor

305,212 views • 8 days ago

Try out GPT-V in Cursor! It's pretty good for building/modifying components!

Aman Sanger

195,264 views • 2 years ago

Set up MCP on Cursor with Google Docs in less than 2 mins!!

I used Cursor to create PRDs in Google Docs. Here's how you can do it too:
- Go to the MCP directory
- Search for Google Docs and grab your SSE URL
- Paste the URL and set up MCP in Cursor
- Use Cursor Agent to authenticate and create the PRD

Check out the 100+ tools available at

Soham

151,139 views • 1 year ago

AI has its PhD and now it’s on the job market.

Introducing the AI Productivity Index (APEX), a benchmark that measures how well we’ve automated the most valuable industries in the world. Most benchmarks study abstract capabilities. APEX evaluates model performance on real deliverables across law, finance, consulting, and medicine.

The models most capable of doing work today, according to APEX:
🥇 GPT 5
🥈 Grok 4
🥉 Gemini 2.5 Flash

Other findings:
- GPT 5 demonstrates the strongest performance across all 4 domains
- Some cheaper models outperform more expensive models from the same provider (e.g. Gemini 2.5 Flash vs. Gemini 2.5 Pro)
- The best open source model, Qwen (7th), performs only 2% behind Grok 4 overall

Brendan (can/do)

451,298 views • 9 months ago

Conductor now supports Cursor with Composer 2.5.

With native support for 3 harnesses, we now have the holy trinity (Claude, Codex, Cursor) and have fulfilled the prophecy of the 4C's: Conductor, Claude, Codex, Cursor.

matt palmer

27,803 views • 24 days ago

The strongest models are gated and access is granted only to a select few. Hermes Agent now exposes MoA presets as virtual models, giving you capabilities beyond the publicly available frontier: 8% higher than Opus 4.8 and 11% higher than GPT 5.5 on our upcoming benchmark.

Nous Research

1,805,611 views • 6 days ago

How fast is serv-swift? It runs roughly 9× faster than frontier models like GPT 5.4. Here's the proof 👇

OpenServ

26,102 views • 2 months ago

Opus 4.5 in-a-loop builds Spiking Neural Net simulations in Cursor. That is all :)

echo.hive

42,700 views • 5 months ago

baby cursor already useful: came up with new ideas for stop + queuing

made with Cursor + Gemini 2.5 Pro MAX

Ryo Lu

45,868 views • 1 year ago

GPT-5.4 is special

LisanBench: GPT-5.4 vs Opus 4.6 vs Gemini 3.1 Pro

Lisan al Gaib

265,203 views • 3 months ago

Cursor AI MCP Integration:

🖱️ Cursor ID Reads Console Logs/Errors Auto
📸 Can see your Website
🔍 Analyse Selected Browser Elements
📝 Debug 3x Fast Cursor

Here is a step-by-step guide on how to do it 👇

Mervin Praison

41,573 views • 1 year ago

I tested Gemini Pro 2.5 as my main coding model for 40+ hours. Here are 2 documents that are working brilliantly well with Gemini: "App flow document + App flowchart." This made my Cursor workflow 10x better. Here's why it is working: ↓

CJ Zafir

85,260 views • 1 year ago

I tested GitHub Copilot's latest "Cursor killer" features, and the results were... not as I expected.

Here's my in-depth review of Copilot vs Cursor:

Steve (Builder.io)

41,291 views • 1 year ago

my upcoming cursor clone tutorial just got the feature to clone from github, a breeze to implement with Inngest and Convex 😎

Code With Antonio

36,874 views • 6 months ago

Create datasets, run evals, and even train models directly in Cursor with the Hugging Face plugin. Here's Ben Burtenshaw to show you how:

edwin

17,119 views • 3 months ago

I was using Opus 4.6 in Cursor and am already out of credits

Karan

25,283 views • 4 months ago

✦ | Cursor Keychain Commissions | ✦ Live2D Cursor models are now available! Accepting: Two Slots Please read all details in the description! If you would like to join the waitlist, DM me or comment! #Live2dCommissions | #Live2DShowcase | #Live2D

Venus VT🪐🌟 | VGen Live2D

53,871 views • 1 year ago