Hume AI

@hume_ai • 23,241 subscribers

Build and measure voice models and agents the way people experience them, based on research and grounded in real human judgment.

Shorts

Meet EVI 2, our new foundational voice-to-voice model. Live demo →

31,265 views

Videos

Today Hume is introducing Real World VoiceEQ: a benchmark for measuring the human quality of voice AI across 40+ proprietary and open-source models, 15+ dimensions, and 60+ metrics—developed from more than 1 million individual human ratings.

243,653 views • 12 days ago

Introducing Octave 2: our next-generation multilingual text-to-speech model

What’s new:
- Fluent in 11+ languages
- 40% faster (<200ms latency) & 50% cheaper than Octave 1
- Multi-speaker conversation
- More reliable pronunciation
- New voice conversion & phoneme editing capabilities

For the month of October, we’re offering 50% off our Creator plan - use code OCTAVE2 at checkout!

7,077,358 views • 9 months ago

Meet EVI 3, another step toward general voice intelligence. EVI 3 is a speech-language model that can understand and generate any human voice, not just a handful of speakers. With this broader voice intelligence comes greater expressiveness and a deeper understanding of tune, rhythm, timbre, and speaking style.

832,639 views • 1 year ago

Today we're releasing our first open-source TTS model, TADA! TADA (Text Audio Dual Alignment) is a speech-language model that generates text and audio in one synchronized stream to reduce token-level hallucinations and improve latency.

This means:
→ Zero content hallucinations across 1,000+ test samples
→ 5x faster than similar-grade LLM-based TTS
→ Fits much longer audio: 2,048 tokens cover ~700 seconds with TADA vs. ~70 seconds in conventional systems
→ Free transcript alongside audio with no added latency

270,323 views • 4 months ago

Meet Hume’s Empathic Voice Interface (EVI), the first conversational AI with emotional intelligence.

875,750 views • 2 years ago

Today, we’re releasing Octave: the first LLM built for text-to-speech.
🎨 Design any voice with a prompt
🎬 Give acting instructions to control emotion and delivery (sarcasm, whispering, etc.)
🛠️ Produce long-form content on our Creator Studio

Unlike traditional TTS that just “reads” words aloud, Octave understands how meaning affects delivery to generate emotional, human-like speech.

393,940 views • 1 year ago

You can now control a computer with just your voice. Here’s how we did it: 🧵

428,417 views • 1 year ago

Introducing Voice Control by Hume

We developed an experimental voice modulation approach that enables you to create unique AI voices in seconds. Our voice sliders make it intuitive to adjust base voices along 10 interpretable dimensions, including:
👃 Nasality: resonant to nasal
🎼 Masculine/Feminine: from masculine to feminine
🎈 Buoyancy: from deflated to buoyant

Check out the sample creations in the thread below 👀

200,895 views • 1 year ago

Introducing Empathic Voice Interface 2 (EVI 2), our new voice-to-voice foundation model. EVI 2 merges language and voice into a single model trained specifically for emotional intelligence. You can try it and start building today.

165,640 views • 1 year ago

OpenAI enters the Expressive TTS Arena 🥊🤖🥊 Now hosted on Hugging Face, this arena is a new way to evaluate voice AI systems with natural language instructions + richer text. Compare Hume's TTS against ElevenLabs and OpenAI and see if you agree with the leaderboard results!

75,766 views • 1 year ago

Introducing the new Hume App, featuring brand-new assistants that combine voices and personalities generated by our speech-language model, EVI 2, with supplemental LLMs and tools like the new Claude 3.5 Haiku from Anthropic.

67,159 views • 1 year ago

Introducing Expressive TTS Arena 🥊🤖🥊

Starting with Hume AI vs ElevenLabs, it's a new way to evaluate voice AI systems with natural language instructions + richer text. As voice generation systems evolve, we wanted to show an example of an eval system better suited to cutting-edge models 👇

52,628 views • 1 year ago

What happens when EVI is customized to act like… a wistful fridge? 🤔 Or… a jealous houseplant? Or literally anything else you can imagine? Our Empathic Voice Interface playground is now live!

61,655 views • 2 years ago

The EVI API is finally here! With our demo alone, we were surprised so many people felt a connection to the world’s first emotionally intelligent voice AI:
✨ ~100K conversations
⏱️ 10 min average conversation length
💬 3M user messages

Start building applications your users will love even more! We can’t wait to see what you create.

57,935 views • 2 years ago

One performance, infinite voices. Voice Conversion is now live on Hume’s Creator Studio and API! Generate the same pacing, pronunciation, and intonation with one recording across any voice you choose. Hear it for yourself ⬇️

19,887 views • 8 months ago

Hear ye! Hear ye! By royal decree of Hume AI, let it be known throughout the land: our model Octave doth bring forth speech from text!

23,415 views • 1 year ago

EVI, the frontier voice AI with emotional intelligence, is now a lot smarter, and available as an iOS app! 📲 Featuring a bold new and improved AI voice named Kora 💁‍♀️ and integrating Claude 3.5 Sonnet into its responses, EVI is ready to listen, answer, and explore →

27,562 views • 2 years ago

Better AI voices coming soon...

12,360 views • 1 year ago

Our CEO Alan Cowen on why emotionally intelligent voice interfaces are the key to AI that understands our needs.

11,883 views • 2 years ago