Announcing Kontext Realtime, an open-source web app for editing images with voice commands.
46,022 views • 11 months ago • via X (Twitter)
12 Comments

It's powered by OpenAI's Realtime API over WebRTC for voice commands. Image generation and editing use Flux Schnell and Flux Kontext running on Replicate. You can run it locally or deploy it to Cloudflare. Here's the repo:
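A rough sketch of how a voice command might be routed to the two models: generate with Flux Schnell when there is no image yet, edit with Flux Kontext when there is one. This is not code from the repo. The `ImageToolCall` type and `buildReplicateRequest` helper are hypothetical names, and the model slugs and the `input_image` field are assumptions about how the models are exposed on Replicate. It only builds the HTTP request, so it runs without network access or an API token.

```typescript
// Hypothetical sketch, not taken from the Kontext Realtime repo.
// Assumed shape of a tool call produced from a voice command.
type ImageToolCall = {
  prompt: string;
  currentImageUrl?: string; // set when the user is editing an existing image
};

type ReplicateRequest = {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
};

const REPLICATE_API = "https://api.replicate.com/v1";

// Assumed model slugs on Replicate.
const GENERATE_MODEL = "black-forest-labs/flux-schnell";
const EDIT_MODEL = "black-forest-labs/flux-kontext-pro";

function buildReplicateRequest(call: ImageToolCall, apiToken: string): ReplicateRequest {
  // No image yet: generate from scratch. Image present: edit it.
  const editing = call.currentImageUrl !== undefined;
  const model = editing ? EDIT_MODEL : GENERATE_MODEL;

  const input: Record<string, string> = { prompt: call.prompt };
  if (call.currentImageUrl !== undefined) {
    // Assumed input field name for the image to edit.
    input.input_image = call.currentImageUrl;
  }

  return {
    url: `${REPLICATE_API}/models/${model}/predictions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ input }),
  };
}
```

To actually send it, something like `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })` from a serverless function would do, keeping the API token on the server side rather than in the browser.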

Built last weekend at @replicate's Kontext hackathon with @bfl_ml 🖤.

I put it on YouTube too:

Scan any documents, convert images into text, PDF files, etc. 👍

Very cool app, Zeke! Thanks for the demo and code walkthrough

Thanks! Let me know if you build anything cool on top of it! Also, pull requests welcome :) Planning to add more tools to it very soon.

Been cooking up a voice-intuitive canvas for a few months, and will launch soon. Love this.

super cool, you should add it to

Thanks! I'm a spaces newbie. What would make the most sense for that... a blank Docker template? It's a Cloudflare Workers app with a static frontend and a few serverless backend functions.

you're the king of demos, my old friend🤗

Looks great! How are the Kontext edits near realtime?!

Camera tricks to keep the video snappy. Actual generation times are around 4 to 5 seconds for Kontext Pro. See examples here:
