Quantized Gemma 2B runs pretty fast on my iPhone 15 Pro in MLX Swift. code & docs: Comparable to GPT-3.5 Turbo and Mixtral 8x7B in LMSYS Org benchmarks but runs efficiently on an iPhone. Pretty wild.
79,702 views • 1 year ago • via X (Twitter)
10 comments

Logan Kilpatrick • 1 year ago
@lmsysorg Cost of intelligence takes another hit today : )

Christian Schoppe • 1 year ago
@lmsysorg I have the 6 bit quantized version running on my Pixel. Not quite as fast as yours but still quite usable. After a few initial tests, I still prefer Phi-3-mini.

Eric Hartford • 1 year ago
@lmsysorg that's awesome!

Kirito (e/acc) 🏴☠️ • 1 year ago
@lmsysorg Great work we all saw it coming - privacy and intelligence at the palm of your hand

Rami El-Masri • 1 year ago
@lmsysorg Running advanced models like Gemma 2B efficiently on mobile devices is a game-changing milestone.

Tris Warkentin • 1 year ago
@lmsysorg What an incredible demo -- speed and quality are very impressive. Now to work on accessibility =)

NFTPerks 🇵🇹 • 1 year ago
@lmsysorg awesome

Stavros Kassinos • 1 year ago
@lmsysorg 🚀🚀

Mani • 1 year ago
@lmsysorg Is it 4-bit quantized?

Awni Hannun • 1 year ago
@lmsysorg Yes
