
Angelos Katharopoulos
@angeloskath • 4,738 subscribers
Machine Learning Research @Apple. Previously PhD student at @idiap_ch and @EPFL. Interested in all things machine learnable
Videos

A long time coming, but the new mlx-lm is here, with better batching support in the server and Gemma 4.
pip install -U mlx-lm
Here is a video where a single M3 Ultra serves 5 opencode sessions with Gemma 4 26B that process ~130k tokens in ~1.5 minutes.
Angelos Katharopoulos • 66,095 views • 2 months ago

Qwen 3.5 397B prompt processing on M3 Ultra (with MLX distributed + JACCL):
- 3.4× speedup on 4 chips
- scaling improves as context increases
Really fun to use with opencode; it generated a playable Asteroids clone in ~4 minutes (real time, including me playing it a bit).
Angelos Katharopoulos • 23,552 views • 3 months ago