
Georgi Gerganov
@ggerganov • 50,809 subscribers
24th at the Electrica puzzle challenge | https://t.co/baTQS2bdia

Let me demonstrate the true power of llama.cpp:
- Running on Mac Studio M2 Ultra (3 years old)
- Gemma 4 26B A4B Q8_0 (full quality)
- Built-in WebUI (ships with llama.cpp)
- MCP support out of the box (web-search, HF, github, etc.)
- Prompt speculative decoding

The result: 300 t/s (realtime video)
Georgi Gerganov • 782,188 views • 2 months ago

GGUF My Repo by Hugging Face

Create quantized GGUF models fully online, quickly and securely. Thanks to Vaibhav (VB) Srivastav, Pedro Cuenca and team for creating this HF space!

In the video below I try it out by creating a quantized 8-bit model of Gemma 2B. It took about 60 seconds. The resulting model automatically becomes available in your HF profile and is ready to be used with llama.cpp
Georgi Gerganov • 64,486 views • 2 years ago
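
For readers wondering what an 8-bit quantized model like the one in the post above roughly involves, here is a minimal Python sketch of block-wise 8-bit quantization in the spirit of the Q8_0 type. It is an illustration under stated assumptions, not the llama.cpp implementation: the function names `quantize_q8_0` and `dequantize_q8_0` are hypothetical, the block size of 32 and the `max_abs / 127` scale are assumed here, the scale is kept as a plain Python float rather than a half-precision value, and Python's `round` handles exact ties differently from C's `roundf`.

```python
BLOCK_SIZE = 32  # assumption: number of weights that share one scale


def quantize_q8_0(values):
    """Split values into blocks; store one float scale plus int8-range codes per block."""
    if len(values) % BLOCK_SIZE != 0:
        raise ValueError("length must be a multiple of BLOCK_SIZE")
    blocks = []
    for start in range(0, len(values), BLOCK_SIZE):
        chunk = values[start:start + BLOCK_SIZE]
        max_abs = max(abs(v) for v in chunk)
        scale = max_abs / 127.0                  # largest magnitude maps to +/-127
        inv = 1.0 / scale if scale else 0.0      # an all-zero block keeps scale 0
        codes = [int(round(v * inv)) for v in chunk]
        blocks.append((scale, codes))
    return blocks


def dequantize_q8_0(blocks):
    """Reconstruct approximate floats: value is scale times code."""
    return [scale * code for scale, codes in blocks for code in codes]


if __name__ == "__main__":
    # Hypothetical weights, for illustration only.
    weights = [((i * 37) % 101 - 50) / 25.0 for i in range(64)]
    blocks = quantize_q8_0(weights)
    restored = dequantize_q8_0(blocks)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"{len(blocks)} blocks, worst absolute error {worst:.5f}")
```

Because each block carries its own scale, the rounding error of any weight is bounded by half that block's scale, which is one intuition for why 8-bit formats are often described as close to full quality.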