Introducing Nova-2, our next-gen model for superhuman speech-to-text. TL;DR Nova-2 delivers: 💥 Next-level accuracy: 18% more accurate than Nova-1 & over 36% more accurate than OpenAI Whisper large 💥 Up to 40x faster 💥 Same low cost: 3-5x cheaper 🧵👇

2,184,459 views • 2 years ago • via X (Twitter)

10 Comments

Deepgram • 2 years ago

Building on Nova's groundbreaking training, which spanned 100+ domains and 47 billion tokens, Nova-2 continues to be the deepest-trained ASR model in the world.

Deepgram • 2 years ago

Nova-2 was trained in a two-stage curriculum starting from the largest, most diverse dataset in Deepgram’s history: nearly 6M resources and an extensive library of high-quality human transcriptions. The result? 👇

Deepgram • 2 years ago

A new state-of-the-art model capable of superhuman transcription performance that consistently outperforms any other STT model in the market today across a wide range of speech application domains. Onto the benchmark results…

Deepgram • 2 years ago

In our benchmarking, Nova-2 has an overall word error rate (WER) of 8.4% for the median files tested, a 16.8% relative reduction in error rate compared to the closest provider. Nova-2 surpassed all tested competitors by an average of 30% and outperformed OpenAI Whisper large by 36%.
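
For readers unfamiliar with the metrics, here is a minimal Python sketch of how word error rate and a relative WER reduction are commonly computed. It is not Deepgram's benchmarking code: the function names are hypothetical, and the sketch skips the text normalization (casing, punctuation, number formatting) that real benchmarks usually apply before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Classic Levenshtein distance over words, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer
```

Under this definition, an 8.4% WER that is 16.8% better in relative terms implies the closest provider scored roughly 8.4 / (1 - 0.168) ≈ 10.1% WER. That figure is derived here from the two numbers above; it is not stated in the thread.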

Deepgram • 2 years ago

Modern speech apps are increasingly used to automate real-time interactions with end users for use cases like agent assist and live captioning. But options for true real-time STT are limited, and several providers, such as OpenAI, lack native streaming models...

Deepgram • 2 years ago

...However, in our real-time accuracy benchmarking, Nova-2 handily outperforms the field with an average relative reduction in WER of 28.6% across all domains.

Deepgram • 2 years ago

On speed, our benchmarks show that Nova-2 surpasses all other STT models, achieving a median inference time of 29.8 seconds per hour of diarized audio. That is 5-40x faster than comparable vendors offering diarization.
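
To put the speed figure in perspective, the short Python sketch below converts "29.8 seconds per hour of audio" into a speed-up over real time and into the processing times a 5-40x slower vendor would need. The helper names are hypothetical, and the vendor times are derived from the stated multipliers rather than taken from any published benchmark.

```python
SECONDS_PER_HOUR = 3600.0
NOVA2_SECONDS_PER_AUDIO_HOUR = 29.8  # median figure quoted in the thread


def speedup_over_real_time(processing_seconds: float,
                           audio_seconds: float = SECONDS_PER_HOUR) -> float:
    """How many times faster than playback speed the audio is processed."""
    return audio_seconds / processing_seconds


def implied_processing_seconds(base_seconds: float, slowdown: float) -> float:
    """Processing time for a system that is `slowdown` times slower."""
    return base_seconds * slowdown


if __name__ == "__main__":
    print(f"{speedup_over_real_time(NOVA2_SECONDS_PER_AUDIO_HOUR):.1f}x real time")
    for slowdown in (5, 40):
        seconds = implied_processing_seconds(NOVA2_SECONDS_PER_AUDIO_HOUR, slowdown)
        print(f"{slowdown}x slower: {seconds / 60:.1f} minutes per audio hour")
```

By this arithmetic, 29.8 seconds per audio hour is about 120x faster than real time, and a vendor 5-40x slower would need roughly 2.5 to 20 minutes for the same hour.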

Deepgram • 2 years ago

On cost, Nova-2 keeps the same starting price as Nova: just $0.0043 per minute of pre-recorded audio, roughly 3-5x more affordable than any other full-functionality provider on the market (based on currently listed pricing).
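
As a quick worked example of the pricing, the Python sketch below turns the quoted per-minute starting price into a cost for a given amount of audio. The function name is hypothetical, and the calculation assumes a flat per-minute rate with no volume tiers or minimums, which the thread does not specify.

```python
from decimal import Decimal

# Starting price quoted in the thread for pre-recorded audio, in USD.
PRICE_PER_MINUTE = Decimal("0.0043")


def transcription_cost(audio_minutes: int,
                       price_per_minute: Decimal = PRICE_PER_MINUTE) -> Decimal:
    """Cost in USD for `audio_minutes` of audio at a flat per-minute rate."""
    return price_per_minute * audio_minutes


if __name__ == "__main__":
    print(f"1 hour:      ${transcription_cost(60):.3f}")
    print(f"1,000 hours: ${transcription_cost(60 * 1000):.2f}")
```

At that rate, one hour of audio costs about $0.26 and 1,000 hours about $258. A provider 3-5x more expensive would charge roughly $0.0129 to $0.0215 per minute; that range is derived from the stated multiplier, not from any listed price.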

Deepgram • 2 years ago

Since launching Nova-1 this year, we have also released new features, including improved speaker diarization, smart formatting, filler-word support, and our first domain-specific language model for summarization.

Deepgram • 2 years ago

You can dive deeper into our approach to model development and the benchmarks in the full announcement. Plus, get started with Nova-2 by requesting early access. Link to announcement:
