It's finally possible: real-time in-browser speech recognition with OpenAI Whisper! 🤯 The model runs fully on-device using Transformers.js and ONNX Runtime Web, and supports multilingual transcription across 100 different languages! 🔥 Check out the demo (+ source code)! 👇

262,755 views • 2 years ago • via X (Twitter)

Comments: 10

Xenova • 2 years ago

Source code: Demo:

Xenova • 2 years ago

UPDATE: I added WebGPU support to Whisper Web! 😍

Mike Young • 2 years ago

ok browsers have been able to do this forever tho

Jason Mayes • 2 years ago

Very nice!!! Is this a different incarnation of the Whisper Web Turbo that came out before? I thought that was also faster than real-time, no? Or is it the streaming ability that is new here vs. recording and sending to the model?

Xenova • 2 years ago

This uses Transformers.js (+ ONNX Runtime Web) vs. @fleetwood___'s Ratchet library. His version would certainly be able to run in real-time too though... and is still on his TODO list I'm sure! 😉

Mike Nolivos • 2 years ago

Does this work on mobile?

NOBODY • 2 years ago

@cocktailpeanut

Nodus Labs • 2 years ago

Nice! How much extra memory does it take?

Tom Bielecki • 2 years ago

👏👏👏 Is it possible to add an initial prompt/prefix for custom terms/hints?

Xenova • 2 years ago

Definitely possible - it would just require updating the initial tokens passed to the decoder. Do you have an example (in Python?) for this I can take a look at? Feel free to open a feature request on GitHub so I can track this more easily.
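
For readers wondering what "updating the initial tokens passed to the decoder" could look like, here is a rough sketch. It follows the prompt layout used by OpenAI's reference Python implementation of Whisper (a `<|startofprev|>` marker, then the prompt tokens, then the usual start sequence). It is not the Transformers.js API: `buildInitialTokens` is a hypothetical helper, tokens are shown as plain strings instead of tokenizer ids, and the 223-token prompt budget is an assumption taken from the reference implementation (half of the 448-token decoder context, minus one).

```typescript
// Sketch only: mirrors the prompt layout of OpenAI's reference Python
// implementation of Whisper. This is NOT the Transformers.js API.

type Task = "transcribe" | "translate";

// Hypothetical helper: builds the token sequence the decoder starts from.
// Tokens are plain strings here; a real implementation would use the
// tokenizer's integer ids instead.
function buildInitialTokens(
  promptTokens: string[],
  language: string = "en",
  task: Task = "transcribe",
  maxPromptTokens: number = 223, // assumed budget: 448 / 2 - 1
): string[] {
  // Usual start sequence: start-of-transcript, language, task, no timestamps.
  const startSequence = [
    "<|startoftranscript|>",
    `<|${language}|>`,
    `<|${task}|>`,
    "<|notimestamps|>",
  ];

  // Without a prompt (or without any budget), the start sequence is all we need.
  if (promptTokens.length === 0 || maxPromptTokens <= 0) {
    return startSequence;
  }

  // Keep only the most recent prompt tokens so the prompt fits the budget.
  const kept = promptTokens.slice(-maxPromptTokens);

  // The prompt goes BEFORE the start sequence, behind a <|startofprev|> marker.
  return ["<|startofprev|>", ...kept, ...startSequence];
}

// Usage: hint the decoder with a few custom terms.
const initialTokens = buildInitialTokens([" Transformers", ".js", ",", " ONNX"]);
console.log(initialTokens.join(""));
```

The prompt sits before `<|startoftranscript|>`, so the decoder treats it as preceding context (useful for spelling hints and custom vocabulary) and not as part of the transcript it is about to produce.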
