
Prince Canuma
@Prince_Canuma • 21,857 subscribers
Apple MLX King 🤴🏽• Creator of (mlx-audio & mlx-vlm) • Ex-@arcee_ai • @neptune_ai • https://t.co/iZnxoefJBU
Shorts
Videos

Today we're shipping our biggest MLX-VLM release yet: v0.6.0 ...and we are raising 💸 This one's about turning your Apple devices into real local agent machines. From your desk to your pocket. What's new: ⚡ Speculative decoding everywhere — Gemma 4 EAGLE3 + DFlash, Qwen MTP, DeepSeek V4 MTP. Faster tokens, less waiting. 🤖 Agent-ready server — native Anthropic /v1/messages API, stateful /v1/responses, tool calls, Codex context budgets. Plug Claude Code & Codex straight into local models. 👁️ New models galore — DeepSeek V4, ZAYA1-VL, MiniCPM-V 4.6, LFM2 MoE, Step-3.7 Flash, Laguna + more. 🎨 Image gen & editing — FLUX.2 (base + klein), PrismML Bonsai. 🔊 Audio in — Qwen3 Omni, Gemma 4 audio, base64 chat audio. 🧮 TurboQuant KV cache — RHT-correct fast paths for leaner memory. 📦 Modular server, better metrics, cleaner streaming. Run real agents on the hardware already in your hands. Github:
Prince Canuma63,592 次观看 • 3 天前

DeepSeek-V4-Flash powering 4 parallel agents on Pi (by Mario Zechner) 🚀 Running on M3 Ultra at ~30-34 tok/s and 160-187GB peak URAM using MLX-LM. Special shoutout to clandestine.eth 🦇🔊, Pedro Cuenca, Tarjei Mandt, Ivan Fioravanti ᯅ and others for helping optimize and shape this PR. PR:
Prince Canuma104,550 次观看 • 1 个月前

Day 1 of 3 days of MLX: Introducing MLX-Audio-Swift SDK 🚀 A modular Swift SDK for voice agents and tasks on Apple Silicon built by Lucas Newman and yours truly. iOS, macOS, and visionOS developers can now build native apps with real-time, on-device audio intelligence: 🗣️ Text-to-Speech (TTS) 👂 Speech-to-Text (STT) 🔄 Speech-to-Speech (STS) 🎙️ Voice Activity Detection (VAD) and more. Only import the capabilities you need, nothing extra. Get started today and leave us a star ⭐️
Prince Canuma155,243 次观看 • 3 个月前

DeepSeek-v4 now runs at ~23-26 tok/s on MLX! I made some custom kernels for the sinkhorn and it took gen speeds for 17 -> 26 tok/s. The weights are also significantly smaller thanks to Pedro Cuenca tip about keeping the experts in MXFP4! Now you can use it to power your local coding agents (PI, Open code, Hermes agent or even CC) PR:
Prince Canuma58,272 次观看 • 1 个月前

Local transcription running on iPad Pro M1 🔥 Qwen3-ASR-0.6B from Qwen — fully on-device, no cloud, no API calls. Built with our new MLX-Audio-Swift SDK, hitting 25 tok/s at just 1.9GB of RAM. Test audio? Found Marc Lou's old sales call from one of his awesome newsletters sitting on my device. Try it out & drop us a ⭐️
Prince Canuma81,427 次观看 • 3 个月前

Marvis-TTS-v0.2 is here 🚀 A local first TTS model capable of realtime performance even on older iPhones that Lucas Newman and I built. What’s new: ✨ Blazing fast — 100M (tiny) & 250M parameter models 🌍 Multilingual — English, French, German 🎭 Enhanced voice cloning — More natural & expressive ⚡ Long-form generation — Up to 90 seconds (4x improvement) Get started today: > pip install -U mlx-audio
Prince Canuma118,317 次观看 • 6 个月前

You can now vibecode your own WisprFlow or Monologue alternative that runs completely locally on Apple Silicon using MLX-Audio-Swift 🔥 Check out this live transcription of Dwarkesh Patel interview with Andrej Karpathy using Qwen3-ASR-0.6B quantized to 4bit on a M3 Max. It also runs in realtime on a iPhone 15 Pro and iPad Pro M1. No cloud. No API keys.
Prince Canuma60,915 次观看 • 3 个月前

RF-DETR by Roboflow now on MLX It can do realtime instance segmentation on-device and enable some cool use cases for visual analysis, monitoring and robotics like Reachy Mini. Also augmented VLM and VLA by preprocessing image and video with areas of interest. New release coming soon on mlx-vlm 🚀 For those who can’t wait you can install mlx-vlm from source.
Prince Canuma30,499 次观看 • 2 个月前

Day 1 of 3 MLX Releases: Introducing MLX-Audio 🚀🔥 A text-to-speech (TTS) and Speech-to-Speech (STS) library built on Apple's MLX framework, providing efficient speech synthesis on Apple Silicon. Features ⚡️Fast inference on Apple Silicon (M series chips) 🤖Multiple language support 🗣️Voice customization options 🚀Quantization support for optimized performance Supported models: 🪶Kokoro - A multilingual TTS model with 82M params that supports various languages and voice styles. With more models coming soon. Get started: > pip install mlx-audio Please leave us a star and send a PR :)
Prince Canuma123,478 次观看 • 1 年前

Introducing Marvis-TTS 🔥🚀 A new local-first TTS model Lucas Newman and I built for efficiency, accessibility, and real-time performance right on consumer devices like Apple Silicon, iPhones, iPads, and more. Traditional TTS models often demand full text inputs or sacrifice real-time capabilities, Marvis flips the script. It streams audio chunks as text is processed, creating a truly conversational experience. No more awkward pauses or unnatural breaks—Marvis handles the entire text context intelligently to deliver coherent, expressive speech. Get started today: > pip install -U mlx-audio
Prince Canuma81,222 次观看 • 9 个月前

Introducing MLX-Audio Studio 🚀 An open-source UI for audio gen. This new UI will allow you to easily generate and transcribe audio locally using MLX-Audio, Transformers or any other backend you prefer (i.e. OpenAI). We will be adding more tasks soon, stay tuned! Get started on our GH:
Prince Canuma51,071 次观看 • 6 个月前

Local grounded reasoning using MLX will power a whole new generation of use cases that were previously only available on the cloud! From satellite imagery analysis, security systems all the way to robotics. I’m really excited for the latter. I spoke at length about these during my talk at AI Engineer
Prince Canuma15,119 次观看 • 1 个月前

Maxed out M3 Ultra is here! 🚀 Huge thank you to the entire MLX community for making this possible. With your support, I can dedicate myself full-time to advancing MLX as an independent developer. For the first time, I have the hardware to match the ambition—no more bottlenecks holding back the biggest models and most challenging work. Excited for what’s ahead. Let’s build something incredible together.
Prince Canuma27,551 次观看 • 7 个月前

Day 2 of 3 MLX Releases: Introducing Local Computer-Use 🚀🔥 A powerful tool built with MLX that uses Vision Language models and Voice models to control your Mac through visual understanding, planning and reasoning. Features ⚡️Automate your workflow with natural language 😎 Control your computer “hands-free” This project now supports both: 🤖 Level 1 (GUI Agent) 🧠 Level 2 (Autonomous GUI Agent) Get started: > pip install -U mlx-vlm mlx-audio mlx-whisper Please leave us a star and send a PR :)
Prince Canuma45,867 次观看 • 1 年前

Sam-Audio by AI at Meta is now on MLX-Audio 🔊 You can now isolate voices or sounds from any audio locally in your Mac. I added this simple UI to the MLX-Audio studio app to help you get started Note: - The release is coming soon for now please install from source. - Vision part was excluded for now. Let me know if you need it. - PR for UI:
Prince Canuma19,027 次观看 • 5 个月前