It's finally possible: real-time in-browser speech recognition with OpenAI Whisper! 🤯 The model runs fully on-device using Transformers.js and ONNX Runtime Web, and supports multilingual transcription across 100 different languages! 🔥 Check out the demo (+ source code)! 👇

261,247 views • 2 years ago • via X (Twitter)

10 Comments

Xenova • 2 years ago

Source code: Demo:

Xenova • 2 years ago

UPDATE: I added WebGPU support to Whisper Web! 😍

Mike Young • 2 years ago

ok browsers have been able to do this forever tho

Jason Mayes • 2 years ago

Very nice!!! Is this a diff incarnation to the whisper web turbo that came out before? I thought that was also faster than real-time no? Or is it the streaming ability that is new here Vs rec and sending to model?

Xenova • 2 years ago

This uses Transformers.js (+ ONNX Runtime Web) vs. @fleetwood___'s Ratchet library. His version would certainly be able to run in real-time too though... and is still on his TODO list I'm sure! 😉

Mike Nolivos • 2 years ago

Does this work on mobile?

NOBODY • 2 years ago

@cocktailpeanut

Nodus Labs • 2 years ago

Nice! How much extra memory does it take?

Tom Bielecki • 2 years ago

👏👏👏 Is it possible to add an initial prompt/prefix for custom terms/hints?

Xenova • 2 years ago

Definitely possible - it would just require updating the initial tokens passed to the decoder. Do you have an example (in python?) for this I can take a look at? Feel free to open a feature request on GitHub so I can track this easier.
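For anyone following along, here is a rough Python sketch of what "updating the initial tokens passed to the decoder" could look like. It is not code from Whisper Web or Transformers.js. The special-token names follow OpenAI Whisper's conventions, where hint text goes after `<|startofprev|>` and before `<|startoftranscript|>`. The `build_decoder_prompt` helper, its whitespace "tokenizer", and the 223-token cap are illustrative assumptions.

```python
# Sketch only: assembles the decoder's initial token sequence as strings.
# A real implementation would use the model's tokenizer and integer token IDs.

SOT_PREV = "<|startofprev|>"        # marks the start of prompt/hint text
SOT = "<|startoftranscript|>"       # marks the start of the actual transcript


def build_decoder_prompt(hint, language="en", task="transcribe",
                         timestamps=False, max_hint_tokens=223):
    """Return the initial decoder tokens, optionally prefixed with a hint.

    `hint` is free text containing custom terms (hypothetical helper input).
    `max_hint_tokens` is an assumed cap so the hint cannot crowd out the
    transcript; only the most recent tokens of the hint are kept.
    """
    # Toy whitespace tokenizer (assumption); stands in for the real tokenizer.
    hint_tokens = hint.split()[-max_hint_tokens:]

    # The hint section is only emitted when there is hint text.
    prefix = [SOT_PREV, *hint_tokens] if hint_tokens else []

    tokens = prefix + [SOT, f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        tokens.append("<|notimestamps|>")
    return tokens


tokens = build_decoder_prompt("Transformers.js ONNX WebGPU")
print(tokens)
```

With an empty hint the sequence starts directly at `<|startoftranscript|>`, so the default behaviour is unchanged.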
