Video wird geladen...
Video konnte nicht geladen werden
🔥 Thrilled to have worked with Google AI Developers on day-0 MLX support for Gemma 3 QAT! QAT optimizes models during training by simulating low-precision operations, delivering similar performance to FP16 and dramatic memory savings when quantised: • Gemma 3 27B: 54GB → 14.1GB (74% reduction) • Gemma 3... show more
11,374 Aufrufe • vor 1 Jahr •via X (Twitter)
10 Kommentare

Stormbreaker Max CF design to production Get the next generation of high end performance gaming mice. Shop now:

Thanks @osanseviero @reach_vb and the teams behind this amazing release ❤️

@googleaidevs not seeing it in LM Studio, will it show up there too?

@googleaidevs Probably an update? @yagilb

@googleaidevs Thank you for the efforts, also the Kimi thinking was great and fast but had some registration problmes needed to be bypassed with custom code, if not wrong still exist. But cheers for the efforts and speed.

@googleaidevs Could you share more about it? Perhaps open an issue

Excellent stuff! About double the speed of a prior model of similar size, from recollection. 🤩Thanks for that! 🙏 Would you know if we are to expect Speculative Decoding in @lmstudio to work? I got the 27b, then downloaded the 1b and then 4b versions too. Trying to get them to show up in the "Speculative Decoding" "Draft Model" "Select a compatible draft model" dropdown. So far no luck, none of them show up in the dropdown in @lmstudio. (on m2 mbp 96gb ram) (pic of the exact models versions below)

@googleaidevs @lmstudio Not yet for VLMs only if you use them as text models

@googleaidevs Wish there was an intermediate model between 27B & 12B, lots of cards in that range!

@googleaidevs I can’t wait to check it out!


