Here are the best practices for using Eleven v3 (alpha) - the most expressive Text to Speech model.
43,692 views • 1 year ago • via X (Twitter)
11 comments

1. Use longer prompts. Eleven v3 performs better with longer inputs. Prompts shorter than 250 characters are more likely to produce unstable results.

2. Pick the right voice. Some voices are higher quality and more expressive than others. Use voices made for the language you're working in. When creating new voices, include a wider emotional range than before. Explore 22 voices that perform well with v3:

3. Use audio tags to control delivery. Audio tags like [sarcastic], [whispers], [excited], or [strong French accent] shape how the model speaks. Audio tags aren’t universal, so choose a voice suited to your intended delivery: don’t expect a whispering voice to shout convincingly.

4. Keep experimenting. Eleven v3 (alpha) is a research preview. It often requires more prompt engineering than earlier models, but the results are breathtaking.
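
Tips 1 and 3 can be checked before a prompt is ever sent. The sketch below is only an illustration: the helper names (`with_tag`, `is_short`, `find_tags`) are hypothetical and not part of any ElevenLabs SDK, and it assumes the 250-character guideline applies to the whole prompt, audio tags included.

```python
import re

# Guideline from tip 1: prompts under 250 characters tend to be less stable.
MIN_STABLE_CHARS = 250

# Matches a bracketed audio tag such as [whispers] or [strong French accent].
AUDIO_TAG = re.compile(r"\[([^\[\]]+)\]")


def with_tag(tag: str, text: str) -> str:
    """Prefix text with an audio tag written in square brackets."""
    return f"[{tag.strip()}] {text.strip()}"


def is_short(prompt: str) -> bool:
    """Return True if the prompt is below the 250-character guideline.

    Assumption: tags count toward the length; the post does not say.
    """
    return len(prompt) < MIN_STABLE_CHARS


def find_tags(prompt: str) -> list[str]:
    """List the audio tags used in a prompt, in order of appearance."""
    return AUDIO_TAG.findall(prompt)


if __name__ == "__main__":
    prompt = with_tag("whispers", "I have a secret to tell you.")
    if is_short(prompt):
        print(f"Only {len(prompt)} characters; consider a longer prompt.")
    print("Tags used:", find_tags(prompt))
```

A check like this only flags short prompts and lists the tags in use; whether a given tag works with a given voice still has to be tested by ear.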

Read the full best practices guide:

Our speech-to-text models are the most accurate on the market with top rankings across industry benchmarks.
- The highest accuracy rates, up to 95%
- Up to 30% fewer hallucinations than other leaders
- Low latency: 63 minutes converts in 35 seconds
Try via API for free today 👇

Great explanation by @alecwilcock_

👏 @alecwilcock_

@alecwilcock_ dropping wisdom once again 📝🤓

nice

Thanks for the explainer. This is one of the coolest updates right now that I’m excited about. Will be playing with this as much as I can 💚 🙌

