Introducing Real-time Transcription with Speakers!
- Step change in accuracy, surpassing top cloud APIs
- Faster than real-time on Mac and iPhone
- Still under 3 watts when all features are enabled

Available in Argmax SDK 2.0 for early access! Benchmarks and details in comments.

72,819 views • 6 months ago • via X (Twitter)

Related videos

Sarvam Beats GPT-4o: India’s New AI Model Claims Top Spot in Indic Speech

Sarvam AI, an Indian startup, recently launched Sarvam Audio, a speech recognition model that claims superior performance over GPT-4o Transcribe on Indic language benchmarks. This development highlights India's push for AI sovereignty in handling local linguistic nuances.

Sarvam Audio supports 22 Indian languages from the Eighth Schedule, plus Indian English, with strong handling of code-mixing such as Hindi-English blends. It features built-in speaker diarization for up to eight speakers and processes long-form audio such as podcasts or meetings. Trained on the IndicVoices dataset (12,000 hours from over 16,000 speakers across 208 districts), it captures real-world noise and spontaneous speech.

The model reportedly outperforms GPT-4o Transcribe and Gemini 3 Flash in transcription accuracy (lower Word Error Rate) on IndicVoices benchmarks for unnormalized, normalized, and code-mixed speech. Sarvam attributes this to specialization on Indian accents and patterns, unlike global models trained on Western data. Detailed public benchmarks are pending independent verification.

Key Applications
🔴 Call centers and logistics for multilingual transcription.
🔴 Banking, fintech, and e-commerce for customer interactions.
🔴 Podcasts, meetings, and lectures via API for real-time or batch processing.
🔴 This B2B-focused tool aligns with India's IndiaAI Mission, backed by government GPU access for sovereign LLMs.

Credit: AIM Networks.

Augadh

43,429 views • 4 months ago