Llama 3.1 on a Raspberry Pi. Thanks for Llamafile 😍

126,760 views • 1 year ago • via X (Twitter)

10 Comments

Mike Bird (Hiring) • 1 year ago

I used the @Raspberry_Pi 5 with the M.2 Hat with the Hailo AI module

Tom Dörr • 1 year ago

@JustineTunney Now do 405B

Mike Bird (Hiring) • 1 year ago

@JustineTunney

John T Davies 🇺🇦🇪🇺🌍 • 1 year ago

@JustineTunney My Raspi5 runs Llama3.1 8B (Q4) at just over 1.8 toks/sec, painful if you're waiting for an answer. Qwen2 1.5B (Q4) gave me a pretty good answer for the same example at over 8 toks/sec. We're getting there!

Mike Bird (Hiring) • 1 year ago

@JustineTunney Incremental progress!

Prince Canuma • 1 year ago

@JustineTunney That’s pretty cool! Btw, if you have a Mac, you can stream the same model (4-bit quant) much faster using fastMLX (+100 tokens/s for M3 Max 96GB). You can even connect multiple Pis to the same server and run parallel requests :)

Mike Bird (Hiring) • 1 year ago

@JustineTunney Definitely will experiment! Thanks Prince!

AshutoshShrivastava • 1 year ago

@JustineTunney Coolest thing on the internet today.

Mike Bird (Hiring) • 1 year ago

@JustineTunney 🫡

Chubby♨️ • 1 year ago

@JustineTunney I love to see that SLMs are getting more and more great possibilities. To think that an SLM as excellent as Llama 3.1 8B is already running on a Raspberry Pi, I can't imagine where we'll be in a year's time. Great work!
