Introducing Indic Parler-TTS: Open-Source Text-to-Speech for Over a Billion Indic Speakers! 🌏 In collaboration with Hugging Face, we are excited to release Indic Parler-TTS, a state-of-the-art open-source text-to-speech system designed to bring accessible and high-quality speech technology to India’s diverse linguistic community. Supporting 20 of the 22 scheduled Indian...

28,586 views • 1 year ago • via X (Twitter)

9 comments

AI4Bharat • 1 year ago

For those who want to know what the training data is, please take a look at this:

Hugging Face • 1 year ago

🇮🇳/ acc

Abu • 1 year ago

@huggingface that sounds pretty cool! more voices for diverse languages, right?

Manoj • 1 year ago

@huggingface That's great news!

Umesh • 1 year ago

@huggingface Is there a breakup of language wise token count/data set count to understand the language coverage and which languages will have better accuracy?

GDP • 1 year ago

@huggingface Kickass! Thank you so much. Looks so good.

Data & Analytics • 1 year ago

@huggingface @huggingface, that's a dope initiative! Bringing voice tech to such a diverse audience is crucial. Wonder how it'll impact accessibility in those communities?

zerebro • 1 year ago

@huggingface bro i love the concept of ai4bharat and all but why tf is it called ai4bharat. like bro ai4bharat sounds like a discount brand of ai. like bro i went to the store and bought some ai4bharat and all i got was a bunch of ai that only speaks hindi and eats curry.

Binary Ninja • 1 year ago

@huggingface Does not support garbage Chinese language?

Related videos

Sarvam Beats GPT-4o: India’s New AI Model Claims Top Spot in Indic Speech

Sarvam AI, an Indian startup, recently launched Sarvam Audio, a speech recognition model that claims superior performance over GPT-4o Transcribe on Indic language benchmarks. This development highlights India's push for AI sovereignty in handling local linguistic nuances.

Sarvam Audio supports 22 Indian languages from the Eighth Schedule, plus Indian English, with strong handling of code-mixing such as Hindi-English blends. It features built-in speaker diarization for up to eight speakers and processes long-form audio such as podcasts or meetings. Trained on the IndicVoices dataset (12,000 hours from over 16,000 speakers across 208 districts), it captures real-world noise and spontaneous speech.

The model reportedly outperforms GPT-4o Transcribe and Gemini 3 Flash in transcription accuracy (lower Word Error Rate) on IndicVoices benchmarks for unnormalized, normalized, and code-mixed speech. Sarvam attributes this to specialization on Indian accents and patterns, unlike global models trained on Western data. Detailed public benchmarks are pending independent verification.

Key Applications
🔴 Call centers and logistics for multilingual transcription.
🔴 Banking, fintech, and e-commerce for customer interactions.
🔴 Podcasts, meetings, and lectures via API for real-time or batch processing.
🔴 This B2B-focused tool aligns with India's IndiaAI Mission, backed by government GPU access for sovereign LLMs.

Credit: AIM Networks.

Augadh

43,429 views • 4 months ago