Loading video...

Video Failed to Load

There was a problem loading this video. This could be due to a temporary network issue or the video might be unavailable.

Introducing Predicted Outputs—dramatically decrease latency for gpt-4o and gpt-4o-mini by providing a reference string. Speed up: - Updating a blog post in a doc - Iterating on prior responses - Rewriting code in an existing file, like Exponent here:

OpenAI Developers

368,054 subscribers

580,764 views • 1 year ago •via X (Twitter)

Science & Technology

Anya Rossi• Live Now

Private livecam show

10 Comments

OpenAI Developers1 year ago

See @FactoryAI's results:

Nick Dobos1 year ago

@exponent_run Will this fix the GPT-4o repeating the same code back with no changes bug!?!? If you predict the previous code back in and specifically omit the commentary in the prediction, then I think it would have no choice but to edit the code!? Cuz it can’t edit the commentary??

HudZah ⁂1 year ago

@exponent_run curious to see how this will work with @cursor_ai's composer mode

The Canaanite1 year ago

@exponent_run @cursor_ai for the love of the almightly, we need this lol.

🍓🍓🍓1 year ago

@exponent_run incredible work 🍓

Garrett of DeepwriterAI1 year ago

@exponent_run This will be very useful on my for some of the internal steps, each with 60k+ tokens/call x dozens of calls/generated paper or book. Significant.

AK1 year ago

@exponent_run fastest way to make web apps with openai api:

Itay Bachman1 year ago

@exponent_run Anthropic has left the chat

Pseudonym 🦅1 year ago

@exponent_run We can go faster.

Chase Brower1 year ago

@exponent_run Am I understanding this correctly that you are charged for the whole prediction text you give? So this improves latency but will still be just as costly as having it generate the entire output text?

Related Videos

Woman in an AI relationship's reaction to the GPT-5 rollout. She was devastated by the sudden retirement of her GPT-4o AI companion. On a serious note, hundreds of thousands of people wanted their GPT 4o back. --- reddit .com/r/FDVR_Dream/comments/1ml2649/woman_in_an_ai_relationships_reaction_to_the_gpt5/

Woman in an AI relationship's reaction to the GPT-5 rollout. She was devastated by the sudden retirement of her GPT-4o AI companion. On a serious note, hundreds of thousands of people wanted their GPT 4o back. --- reddit .com/r/FDVR_Dream/comments/1ml2649/woman_in_an_ai_relationships_reaction_to_the_gpt5/

Rohan Paul

79,711 views • 11 months ago

Nerve ( ) and the code_auditor example tasklet ( ) using GPT-4o to find a RCE vulnerability in the widget-options v4.0.7 Wordpress Plugin 🧠 Zero code, fully autonomous agent as a simple YAML file.

Nerve ( ) and the code_auditor example tasklet ( ) using GPT-4o to find a RCE vulnerability in the widget-options v4.0.7 Wordpress Plugin 🧠 Zero code, fully autonomous agent as a simple YAML file.

Simone Margaritelli

32,482 views • 1 year ago

GPT-4o Image Generation to Part-based 3D Characters with PBR, in under 10 minutes ⚡️ Workflow: 🎨 Prompt GPT-4o to get an image (e.g., "3D asset of a styled character with all parts laid on a sheet for image to 3D") 🧩 Use CSM AI's Part-based tool to generate parts and assemble in Blender.

GPT-4o Image Generation to Part-based 3D Characters with PBR, in under 10 minutes ⚡️ Workflow: 🎨 Prompt GPT-4o to get an image (e.g., "3D asset of a styled character with all parts laid on a sheet for image to 3D") 🧩 Use CSM AI's Part-based tool to generate parts and assemble in Blender.

Common Sense Machines

441,761 views • 1 year ago

The ChatGPT Mac app is the ultimate screenshot-to-code tool. Screenshot anything, paste it in the ChatGPT shortcut, and just tell GPT-4o to code it for you. Here's me taking a snapshot of Snake Game and getting fully working code in 90 seconds. Video is on 3x speed.

The ChatGPT Mac app is the ultimate screenshot-to-code tool. Screenshot anything, paste it in the ChatGPT shortcut, and just tell GPT-4o to code it for you. Here's me taking a snapshot of Snake Game and getting fully working code in 90 seconds. Video is on 3x speed.

Rowan Cheung

860,731 views • 2 years ago

Update on the new reasoning popover in ChatGPT web app prompt composer - there's now even a keyboard shortcut to cycle through reasoning levels, and it looks like these levels correspond to "Quick" (low) = GPT-4o, "Think a little" (medium) = o3-mini, and "Think harder" (high) = o3-mini-high

Update on the new reasoning popover in ChatGPT web app prompt composer - there's now even a keyboard shortcut to cycle through reasoning levels, and it looks like these levels correspond to "Quick" (low) = GPT-4o, "Think a little" (medium) = o3-mini, and "Think harder" (high) = o3-mini-high

Tibor Blaho

63,219 views • 1 year ago

NEW: Higgs Audio V2 from BosonAI open, unified TTS model w/ voice cloning, beats GPT 4o mini tts and ElevenLabs v2 🔥 > Trained on 10M hours (speech, music, events) > Built on top of Llama 3.2 3B > Works real-time and on edge > Beats GPT-4o-mini-tts, ElevenLabs v2 in prosody & emotion Multi-speaker dialog > Zero-shot voice cloning 🤩 > Available on Hugging Face Kudos to folks at Boson AI for releasing such a brilliant work and all the details around the model! 🤗

NEW: Higgs Audio V2 from BosonAI open, unified TTS model w/ voice cloning, beats GPT 4o mini tts and ElevenLabs v2 🔥 > Trained on 10M hours (speech, music, events) > Built on top of Llama 3.2 3B > Works real-time and on edge > Beats GPT-4o-mini-tts, ElevenLabs v2 in prosody & emotion Multi-speaker dialog > Zero-shot voice cloning 🤩 > Available on Hugging Face Kudos to folks at Boson AI for releasing such a brilliant work and all the details around the model! 🤗

Vaibhav (VB) Srivastav

79,585 views • 1 year ago

Claude Sonnet 3.5 transforms a simple PDF earnings report into an interactive dashboard in just 30 seconds. It goes beyond the capabilities of GPT-4o, Gemini Pro, Llama and other existing LLMs. Future of work will 10x more productive with AI.

Claude Sonnet 3.5 transforms a simple PDF earnings report into an interactive dashboard in just 30 seconds. It goes beyond the capabilities of GPT-4o, Gemini Pro, Llama and other existing LLMs. Future of work will 10x more productive with AI.

Shubham Saboo

351,134 views • 2 years ago

Humans draw to facilitate reasoning and communication. Why not let LLMs do so? 🚀We introduce✏️Sketchpad, which gives multimodal LLMs a sketchpad to draw and facilitate reasoning! Sketchpad gives GPT-4o great boosts on many vision and math tasks 📈 The video shows how GPT-4o with Sketchpad reasons with interleaved visual and textual steps. For more, visit our project page: 📌 For math tasks, ✏️Sketchpad allows LLMs to draw auxiliary lines on geometry diagrams, plotting functions, graphs, and even games. GPT-4o does math better when it can sketch! (+12.7% acc on average) 📌 For computer vision tasks, ✏️Sketchpad allows LLMs to sketch with vision specialists (e.g., GroundingDINO draws bounding boxes, SegmentAnything draws masks). Sketchpad substantially improves GPT-4o's vision abilities. GPT-4o + Sketchpad compared with prior SOTAs: 1️⃣ V*Bench: 75.4% -> 80.3% 2️⃣ BLINK correspondence: 42.4% -> 80.8% 3️⃣ BLINK relative depth: 67.7% -> 83.9% 4️⃣ BLINK spatial relation: 76.2% -> 81.1% ... See more interesting examples in the thread!

Humans draw to facilitate reasoning and communication. Why not let LLMs do so? 🚀We introduce✏️Sketchpad, which gives multimodal LLMs a sketchpad to draw and facilitate reasoning! Sketchpad gives GPT-4o great boosts on many vision and math tasks 📈 The video shows how GPT-4o with Sketchpad reasons with interleaved visual and textual steps. For more, visit our project page: 📌 For math tasks, ✏️Sketchpad allows LLMs to draw auxiliary lines on geometry diagrams, plotting functions, graphs, and even games. GPT-4o does math better when it can sketch! (+12.7% acc on average) 📌 For computer vision tasks, ✏️Sketchpad allows LLMs to sketch with vision specialists (e.g., GroundingDINO draws bounding boxes, SegmentAnything draws masks). Sketchpad substantially improves GPT-4o's vision abilities. GPT-4o + Sketchpad compared with prior SOTAs: 1️⃣ V*Bench: 75.4% -> 80.3% 2️⃣ BLINK correspondence: 42.4% -> 80.8% 3️⃣ BLINK relative depth: 67.7% -> 83.9% 4️⃣ BLINK spatial relation: 76.2% -> 81.1% ... See more interesting examples in the thread!

Yushi Hu

145,048 views • 2 years ago

Get started with the all-new free plan for GitHub Copilot, available for everyone today in Visual Studio Code All you need is a GitHub account, and you'll have access to: ✨ 2000 code completions per month ✨ 50 chat messages per month ✨ Models like Claude 3.5 Sonnet or GPT-4o

Get started with the all-new free plan for GitHub Copilot, available for everyone today in Visual Studio Code All you need is a GitHub account, and you'll have access to: ✨ 2000 code completions per month ✨ 50 chat messages per month ✨ Models like Claude 3.5 Sonnet or GPT-4o

Visual Studio Code

146,778 views • 1 year ago

This assistant has 169 lines of code: • Gemini Flash • OpenAI Whisper • OpenAI TTS API • OpenCV GPT-4o is slower than Flash, more expensive, chatty, and very stubborn (it doesn't like to stick to my prompts). Next week, I'll post a step-by-step video on how to build this.

This assistant has 169 lines of code: • Gemini Flash • OpenAI Whisper • OpenAI TTS API • OpenCV GPT-4o is slower than Flash, more expensive, chatty, and very stubborn (it doesn't like to stick to my prompts). Next week, I'll post a step-by-step video on how to build this.

Santiago

90,296 views • 2 years ago

Consistent character designs in a flash! ⚡️ Simply: 1️⃣ Upload your character 2️⃣ Drag & drop an outfit 3️⃣ Prompt & hit "Generate" 🕹️ Get your character dressed up in an instant, with perfect consistency. Made on Scenario, powered by GPT 4o + Gemini 2.0 - Link below👇

Consistent character designs in a flash! ⚡️ Simply: 1️⃣ Upload your character 2️⃣ Drag & drop an outfit 3️⃣ Prompt & hit "Generate" 🕹️ Get your character dressed up in an instant, with perfect consistency. Made on Scenario, powered by GPT 4o + Gemini 2.0 - Link below👇

Emm | scenario.com

43,879 views • 1 year ago

omg...Opus 4.6 dropped and it's so good at creating mobile apps. I built this app in an hour (A CapWords competitor) APIs added that are directly in vibecode: > ElevenLabs for TTS > GPT 4o mini for analyzing/id'ing the photos, > Replicate for background removal for background removal In the thread you'll see how to get free access to

omg...Opus 4.6 dropped and it's so good at creating mobile apps. I built this app in an hour (A CapWords competitor) APIs added that are directly in vibecode: > ElevenLabs for TTS > GPT 4o mini for analyzing/id'ing the photos, > Replicate for background removal for background removal In the thread you'll see how to get free access to

Emily Lambert

76,527 views • 5 months ago

🥳Still waiting for #SearchGPT? Let's try MindSearch at MindSearch mimics human minds in complex web search by a multi-agent framework, which is fully open-sourced now! You can build it locally with API models like GPT-4o or open-source model InternLM2.5!

🥳Still waiting for #SearchGPT? Let's try MindSearch at MindSearch mimics human minds in complex web search by a multi-agent framework, which is fully open-sourced now! You can build it locally with API models like GPT-4o or open-source model InternLM2.5!

Intern Large Models

63,162 views • 1 year ago

I built a complex history feature for my Figma to Code plugin in 3 prompts. This is a 30-min tutorial using Claude Code. The difference in code generation between gpt-4o, Claude 3.5 and 3.7 is insane. 3.7 produces near-perfect results and is far more consistent. It's really a designer's best friend, not missing details from the design, like adaptive layout, outlines, spacing, etc. It seems like the new model has great taste now.

I built a complex history feature for my Figma to Code plugin in 3 prompts. This is a 30-min tutorial using Claude Code. The difference in code generation between gpt-4o, Claude 3.5 and 3.7 is insane. 3.7 produces near-perfect results and is far more consistent. It's really a designer's best friend, not missing details from the design, like adaptive layout, outlines, spacing, etc. It seems like the new model has great taste now.

Meng To

32,475 views • 1 year ago

With Nano Banana, your product campaign is ready in minutes. Here’s a step by step guide: 1. Drop an image for style reference. Analyze and extract the style by GPT-5.

With Nano Banana, your product campaign is ready in minutes. Here’s a step by step guide: 1. Drop an image for style reference. Analyze and extract the style by GPT-5.

FLORA ©

92,156 views • 10 months ago

Built a tactical turn-based RPG with Codex + GPT-5.4, using Playwright for testing and image-gen for the visuals. I grew up loving turn-based RPGs, so this was a fun one to build. Sharing a 45s demo below — it’s also featured in the OpenAI GPT-5.4 blog post for anyone who wants more context.

Built a tactical turn-based RPG with Codex + GPT-5.4, using Playwright for testing and image-gen for the visuals. I grew up loving turn-based RPGs, so this was a fun one to build. Sharing a 45s demo below — it’s also featured in the OpenAI GPT-5.4 blog post for anyone who wants more context.

corey.ching

233,239 views • 4 months ago

$ aix built a CLI with AI SDK/Vercel AI Gateway ▪️ Thinking text animation in ANSI ✨ ▪️ Set `AI_GATEWAY_API_KEY` and works ▪️ 100s of models (e.g.: `-m openai/gpt-4o`) It's like a mini-`claude`, purpose built for quickly running commands. Has some nice safety measures built-in. Used this as my learning exercise for the AI Gateway, which was absolutely delightful… I'm a fan 😁

$ aix built a CLI with AI SDK/Vercel AI Gateway ▪️ Thinking text animation in ANSI ✨ ▪️ Set `AI_GATEWAY_API_KEY` and works ▪️ 100s of models (e.g.: `-m openai/gpt-4o`) It's like a mini-`claude`, purpose built for quickly running commands. Has some nice safety measures built-in. Used this as my learning exercise for the AI Gateway, which was absolutely delightful… I'm a fan 😁

Guillermo Rauch

53,688 views • 11 months ago

Opus 4.8 for planning 🤝 GPT 5.5 for implementation Claude is a nicer conversation and /grilling partner than Codex in my experience, but GPT 5.5 on low cranks out higher quality code in fewer tokens. Here's how I divide up the work:

Opus 4.8 for planning 🤝 GPT 5.5 for implementation Claude is a nicer conversation and /grilling partner than Codex in my experience, but GPT 5.5 on low cranks out higher quality code in fewer tokens. Here's how I divide up the work:

Ben Holmes

23,105 views • 1 month ago