It's finally possible: real-time in-browser speech recognition with OpenAI Whisper! 🤯 The model runs fully on-device using Transformers.js and ONNX Runtime Web, and supports multilingual transcription across 100 different languages! 🔥 Check out the demo (+ source code)! 👇

261,247 views • 2 years ago • via X (Twitter)

10 comments

Xenova • 2 years ago

Source code: Demo:

Xenova • 2 years ago

UPDATE: I added WebGPU support to Whisper Web! 😍

Mike Young • 2 years ago

ok browsers have been able to do this forever tho

Jason Mayes • 2 years ago

Very nice!!! Is this a diff incarnation to the whisper web turbo that came out before? I thought that was also faster than real-time no? Or is it the streaming ability that is new here Vs rec and sending to model?

Xenova • 2 years ago

This uses Transformers.js (+ ONNX Runtime Web) vs. @fleetwood___'s Ratchet library. His version would certainly be able to run in real-time too though... and is still on his TODO list I'm sure! 😉

Mike Nolivos • 2 years ago

Does this work on mobile?

NOBODY • 2 years ago

@cocktailpeanut

Nodus Labs • 2 years ago

Nice! How much extra memory does it take?

Tom Bielecki • 2 years ago

👏👏👏 Is it possible to add an initial prompt/prefix for custom terms/hints?

Xenova • 2 years ago

Definitely possible - it would just require updating the initial tokens passed to the decoder. Do you have an example (in python?) for this I can take a look at? Feel free to open a feature request on GitHub so I can track this easier.
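
For readers wondering what "updating the initial tokens passed to the decoder" might look like, here is a rough Python sketch. It is not code from Whisper Web or Transformers.js. It assumes the convention, as I understand it from the reference Python Whisper implementation, that prompt tokens are placed after a "start of previous" marker and before the usual start-of-transcript sequence. All token ids below are hypothetical placeholders (real ids come from the Whisper tokenizer), `build_initial_tokens` is an illustrative name, and the default prompt limit of 223 tokens is an assumption based on half of a 448-token decoder context minus one.

```python
from typing import List

# Hypothetical placeholder ids, for illustration only.
# Real values must be looked up in the Whisper tokenizer.
SOT_PREV = 1       # stands in for <|startofprev|>
SOT = 2            # stands in for <|startoftranscript|>
LANG_EN = 3        # stands in for <|en|>
TRANSCRIBE = 4     # stands in for <|transcribe|>
NO_TIMESTAMPS = 5  # stands in for <|notimestamps|>


def build_initial_tokens(prompt_ids: List[int], max_prompt_len: int = 223) -> List[int]:
    """Return the decoder's initial tokens, optionally prefixed by a prompt.

    prompt_ids: already-tokenized hint text (custom terms, names, spellings).
    max_prompt_len: assumed cap on how many prompt tokens are kept.
    """
    sot_sequence = [SOT, LANG_EN, TRANSCRIBE, NO_TIMESTAMPS]
    if not prompt_ids or max_prompt_len <= 0:
        # No usable prompt: the decoder starts from the plain sequence.
        return sot_sequence
    # Keep only the most recent prompt tokens so the prefix stays short.
    trimmed = prompt_ids[-max_prompt_len:]
    return [SOT_PREV] + trimmed + sot_sequence


if __name__ == "__main__":
    # 10 and 11 stand in for the token ids of a custom term.
    print(build_initial_tokens([10, 11]))
```

The decoder would then continue generating from this sequence, so the prompt acts as context that biases the output toward the hinted terms without itself appearing in the transcript.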
