
Are you using OpenAI's Whisper for speech recognition and finding the timestamps are out of sync? Just dropped: WhisperX with word-level timestamp accuracy by force aligning whisper with wav2vec2.0 🧵 [1/n]

78,290 views • 3 years ago • via X (Twitter)

11 Comments

Max Bain, 3 years ago

🧵[2/n] @openAI’s Whisper shows impressive transcription performance, but often the corresponding timestamps are out of sync by several seconds.

Rainmaker, 1 year ago

Heightened volatility got you on edge? In my latest free Substack post, discover how a Hidden Markov Model (HMM) can help you navigate market corrections and safeguard your investments.

Max Bain, 3 years ago

🧵[3/n] However, phoneme-based models such as Wav2Vec2.0 produce much more accurate timestamps. WhisperX leverages these models using forced alignment on the whisper transcription to generate word-level timestamps.
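To make the forced-alignment idea concrete, here is a minimal toy sketch in plain Python. It is not WhisperX's code and does not call Wav2Vec2.0: it assumes you already have a frame-by-token matrix of log-probabilities (hand-written below) and a known token sequence, and it finds the best monotonic assignment of frames to tokens with dynamic programming. The function names, the tiny two-symbol vocabulary, the omission of a CTC blank token, and the 20 ms frame duration are all illustrative assumptions.

```python
import math


def forced_align(log_probs, tokens):
    """Align a known token sequence to frames.

    log_probs[t][v] is the log-probability of vocabulary item v at frame t
    (assumed finite). tokens is the list of vocabulary indices to align.
    Returns one (first_frame, last_frame) pair per token.
    """
    num_frames, num_tokens = len(log_probs), len(tokens)
    if num_tokens == 0 or num_tokens > num_frames:
        raise ValueError("need 1 <= len(tokens) <= number of frames")

    neg_inf = -math.inf
    # score[t][j]: best total log-prob of frames 0..t with frame t on token j.
    score = [[neg_inf] * num_tokens for _ in range(num_frames)]
    # advanced[t][j]: True if the best path entered token j exactly at frame t.
    advanced = [[False] * num_tokens for _ in range(num_frames)]

    score[0][0] = log_probs[0][tokens[0]]
    for t in range(1, num_frames):
        for j in range(min(t + 1, num_tokens)):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else neg_inf
            advanced[t][j] = move > stay
            score[t][j] = max(stay, move) + log_probs[t][tokens[j]]

    # Backtrack from the last frame and last token to recover each span.
    spans = [[None, None] for _ in range(num_tokens)]
    j = num_tokens - 1
    for t in range(num_frames - 1, -1, -1):
        if spans[j][1] is None:
            spans[j][1] = t
        spans[j][0] = t
        if advanced[t][j]:
            j -= 1
    return [tuple(span) for span in spans]


def spans_to_seconds(spans, frame_sec=0.02):
    """Convert frame spans to (start, end) times; frame_sec is an assumption."""
    return [(first * frame_sec, (last + 1) * frame_sec) for first, last in spans]


# Toy input: 4 frames, vocabulary {0: "h", 1: "i"}, transcript "hi".
toy_log_probs = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.8), math.log(0.2)],
    [math.log(0.2), math.log(0.8)],
    [math.log(0.1), math.log(0.9)],
]
toy_spans = forced_align(toy_log_probs, [0, 1])
toy_times = spans_to_seconds(toy_spans)
```

On the toy input, "h" is assigned frames 0 to 1 and "i" frames 2 to 3. With a real acoustic model, word-level timestamps would come from merging the spans of the characters or phonemes that make up each word.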

Max Bain, 3 years ago

🧵[4/n] The result is word-level timestamp output. See more examples and try it yourself at

Max Bain, 3 years ago

🧵[5/n] Of course, it would be better if a single model did everything. One way would be teacher-student, where whisper is learning to output wav2vec's aligned timestamps. If @OpenAI open-sourced the training data and script, it would be cool to try this :)

Benjamin Warberg, 3 years ago

@philipvollet @OpenAI Awesome!

august kamp, 3 years ago

@OpenAI any way to get this working for a musician who struggles with aligning vocals to videos?

Max Bain, 3 years ago

@OpenAI do you mean aligning lyrics to the audio? You can feed the lyrics to the align function in the code, although aligning over such a long sequence could be tricky.

Hen³, 3 years ago

@OpenAI Big bro killing it 🤝

hot-pocket.usd, 3 years ago

@OpenAI @memdotai mem it

neb, 3 years ago

@OpenAI Thanks !
