Introducing Nova-2, our next-gen model for superhuman speech-to-text. TL;DR Nova-2 delivers:
💥 Next-level accuracy: 18% more accurate than Nova-1 and over 36% more accurate than OpenAI Whisper large
💥 Up to 40x faster
💥 Same low cost: 3-7x cheaper
🧵👇

2,184,459 views • 2 years ago • via X (Twitter)

Comments: 10

Deepgram · 2 years ago

Building on Nova's groundbreaking training, which spanned 100+ domains and 47 billion tokens, Nova-2 continues to be the deepest-trained ASR model in the world.

Deepgram · 2 years ago

Nova-2 was trained in a 2-stage curriculum starting from the largest, most diverse dataset in Deepgram’s history: nearly 6M resources and an extensive library of high-quality human transcriptions. The result? 👇

Deepgram · 2 years ago

A new state-of-the-art model capable of superhuman transcription performance that consistently outperforms any other STT model on the market today across a wide range of speech application domains. On to the benchmark results…

Deepgram · 2 years ago

In our benchmarking, Nova-2 has an overall word error rate (WER) of 8.4% for the median file tested, a 16.8% relative error rate improvement over the closest provider. Nova-2 surpassed all tested competitors by an average of 30% and outperformed OpenAI Whisper large by 36%.
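For readers unfamiliar with the metric, here is a minimal sketch of how WER and a relative error-rate improvement are commonly computed. The function names are illustrative and not part of any Deepgram API, and the sample sentences and the 10%/8% figures are made up for the example. The thread does not describe Deepgram's actual benchmarking pipeline, so steps such as text normalization are left out.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the word-level edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Illustrative sentences: one reference word ("the") is dropped.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER: {wer:.3f}")
    # Illustrative WERs, not figures from the thread.
    print(f"Relative reduction: {relative_wer_reduction(0.10, 0.08):.1%}")
```

A relative improvement is measured against the other system's error rate, which is why a figure like 16.8% is not the same as a 16.8 percentage-point drop in WER.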

Deepgram · 2 years ago

Modern speech apps are increasingly used to automate real-time interactions with end users for use cases like agent assist and live captioning. But there are limited options for true real-time STT and several providers like OpenAI lack native streaming models...

Deepgram · 2 years ago

...However, in our real-time accuracy benchmarking, Nova-2 handily outperforms the field with an average relative reduction in WER of 28.6% across all domains.

Deepgram · 2 years ago

Regarding speed, our benchmarks reveal that Nova-2 surpasses all other STT models, achieving a median inference time of 29.8 seconds per hour of diarized audio. That makes it 5-40x faster than comparable vendors offering diarization.
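To put the 29.8-seconds-per-hour figure in more familiar terms, the sketch below converts it to a real-time factor and a speedup over real time. The helper names are hypothetical, and the last function assumes that "Nx faster" means the other system takes N times as long, which the thread does not spell out.

```python
SECONDS_PER_HOUR = 3600.0


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; lower means faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speedup_over_real_time(processing_seconds: float, audio_seconds: float) -> float:
    """How many seconds of audio are processed per second of compute time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def implied_processing_seconds(reference_seconds: float, times_faster: float) -> float:
    """Processing time of a system that the reference system beats by
    `times_faster`, ASSUMING "Nx faster" means N times the processing time."""
    return reference_seconds * times_faster


if __name__ == "__main__":
    nova2_seconds = 29.8  # median seconds per hour of diarized audio, from the thread
    print(f"RTF: {real_time_factor(nova2_seconds, SECONDS_PER_HOUR):.4f}")
    print(f"Speedup: {speedup_over_real_time(nova2_seconds, SECONDS_PER_HOUR):.1f}x real time")
    for factor in (5, 40):  # the 5-40x range quoted in the thread
        slower = implied_processing_seconds(nova2_seconds, factor)
        print(f"{factor}x slower vendor: about {slower:.0f} s per hour of audio")
```

Under these assumptions, 29.8 seconds per hour works out to roughly 120x faster than real time, and a vendor at the slow end of the quoted range would need on the order of 20 minutes per hour of audio.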

Deepgram · 2 years ago

In terms of cost, Nova-2 maintains the same starting price as Nova at just $0.0043 per minute of pre-recorded audio, nearly 3-5x more affordable than any other full-functionality provider (based on currently listed pricing) in the market.
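As a rough illustration of what the per-minute rate means at volume, the sketch below multiplies the quoted starting price by an audio duration. The function name is hypothetical, and it assumes a flat per-minute rate with no volume discounts, minimums, or streaming surcharges, none of which the thread addresses.

```python
from decimal import Decimal

# Starting price quoted in the thread for pre-recorded audio.
PRICE_PER_MINUTE_USD = Decimal("0.0043")


def transcription_cost_usd(
    audio_minutes: float,
    price_per_minute: Decimal = PRICE_PER_MINUTE_USD,
) -> Decimal:
    """Flat-rate cost estimate; Decimal avoids binary floating-point drift."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return Decimal(str(audio_minutes)) * price_per_minute


if __name__ == "__main__":
    for hours in (1, 100, 1000):
        cost = transcription_cost_usd(hours * 60)
        print(f"{hours} hour(s) of audio: ${cost:.2f}")
```

At the quoted rate, an hour of audio comes to about $0.26 and 1,000 hours to about $258, before any plan-specific adjustments.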

Deepgram · 2 years ago

Since launching Nova-1 this year, we have also released new features encompassing improved speaker diarization, smart formatting, filler words support, and our inaugural domain-specific language model for summarization.

Deepgram · 2 years ago

You can dive deeper into our approach to model development and the benchmarks in the full announcement. Plus, get started with Nova-2 by requesting early access. Link to announcement:
