Introducing Scribe, the most accurate Speech to Text model. It has the highest accuracy on benchmarks, outperforming previous state-of-the-art models such as Gemini 2.0 and OpenAI Whisper v3. It’s now the leading model for English, Spanish, Italian, and many more. With support for 99 languages, speaker diarization, character-level...
464,392 views • 1 year ago • via X (Twitter)
11 Comments

It achieves the highest accuracy for the most common languages. And it significantly improves the performance of previously underserved languages such as Serbian, Cantonese, and Gujarati.

Learn more about the benchmarking and features in our blog post:

We have a low-latency version of Scribe coming soon, extending Scribe to real-time use cases.

Scribe is available today in both our UI and API. It’s priced at $0.40 per hour of input audio, with an additional 50% discount available for the next 6 weeks. Sign up for an account here:
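As a rough illustration of the pricing above, here is a minimal sketch of a cost estimate. It assumes billing scales linearly with audio duration, which the announcement does not state; `estimate_cost` and its parameters are hypothetical names, not part of any official API.

```python
from decimal import Decimal

RATE_PER_HOUR = Decimal("0.40")    # list price quoted in the announcement
LAUNCH_DISCOUNT = Decimal("0.50")  # 50% launch discount quoted in the announcement


def estimate_cost(audio_seconds, discounted=False):
    """Estimate the transcription cost in USD for a given audio duration.

    Assumption: cost is pro-rated linearly by duration (not confirmed by the source).
    """
    hours = Decimal(audio_seconds) / Decimal(3600)
    rate = RATE_PER_HOUR * (1 - LAUNCH_DISCOUNT) if discounted else RATE_PER_HOUR
    return (hours * rate).quantize(Decimal("0.0001"))


# Ten hours of audio at list price and at the discounted launch price.
print(estimate_cost(10 * 3600))                   # 4.0000
print(estimate_cost(10 * 3600, discounted=True))  # 2.0000
```

Under that linear assumption, the discounted rate works out to $0.20 per hour of input audio.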

Hear from @flavioschneide, one of the lead researchers behind the launch.

Join us next week for a virtual event with the team behind the launch:

Our speech-to-text models are the most accurate on the market, with top rankings across industry benchmarks.
- The highest accuracy rates: up to 95%
- Up to 30% fewer hallucinations than other leaders
- Low latency: 63 minutes of audio converts in 35 seconds
Try via API for free today 👇
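To put the quoted speed figure in perspective, the sketch below converts it into a real-time factor (audio duration divided by processing time). The function name is hypothetical, and the only inputs are the two numbers from the post.

```python
def realtime_factor(audio_seconds, processing_seconds):
    """Return how many seconds of audio are processed per second of wall-clock time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# 63 minutes of audio converted in 35 seconds, as quoted in the post.
print(realtime_factor(63 * 60, 35))  # 108.0
```

By that figure, processing runs at roughly 108 times real-time speed.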

Awesome!

Not hating, but feedback, so hopefully you take this correctly: I want locally run models. Whisper can do this and even run in near real time on mobile devices. Also, $0.40 per hour is sky high compared to compute unless it’s wildly inefficient. Open source previous-generation models.

Yeah, but have you considered that I can run Whisper locally instead of paying you?

whoa.


