
argmax

@argmax • 4,491 subscribers

Frontier Models On Device

Shorts

We are open-sourcing TTSKit! Run state-of-the-art text-to-speech models on your Mac and iPhone. The launch version supports Qwen Qwen3-TTS and generates audio faster than real-time playback with sub-200 ms time-to-first-byte. Voice cloning and advanced speed optimizations will be in the next version. Link to the GitHub repo and models on Hugging Face in comments.


61,998 views

WhisperKit-0.7.0 is out! Single file inference is several times faster! The demo below is running distil-whisper large-v3 at 300 tok/s and transcribes 101 seconds of audio in 1 second on an M2 Ultra Mac Studio. Details 🧵 Code (MIT): Demo Audio Input: Demo App: (TestFlight update under review)


50,542 views

Videos


Introducing WhisperKit

argmax

98,325 views • 2 years ago
