Releasing moondream-zig! It is a fast implementation of moondream2 inference on the CPU, written from scratch in Zig :) moondream-zig provides 1.5-2x faster inference than huggingface on the same device. moondream vik

29,245 views • 1 year ago • via X (Twitter)

11 Comments

snow • 1 year ago

No GPU needed! moondream-zig is optimized for consumer CPUs using my tensor library, katana. It uses Zig's SIMD intrinsics and clever optimizations to do lightning-fast computations. Katana was born from moondream-zig and is now its own project:

snow • 1 year ago

I've packed moondream-zig with many features:
- an infinite chat client
- in-terminal image display
- intuitive navigation
- multiple samplers and a fast tokenizer
- a cat spinner animation

There's also a single-query client/API which you can use in your own projects.

snow • 1 year ago

The best part? Thanks to Zig's compiler, you can cross-compile it all for ARM, x86 and NEON directly. This means you can run moondream-zig on your phone or on a SLURM server if you want to, and you won't have to change a single line of code. This is why I love Zig so much.

snow • 1 year ago

I have organized this project from an educational standpoint. If you choose to go through the code, you will find multiple comments detailing what optimization decisions I make, and where. I highly recommend going through the GEMM and attn implementations.

snow • 1 year ago

moondream-zig is the first Zig project I started working on. It is also my biggest project yet, though I believe Katana will outgrow it. I wanted to challenge myself to learn Zig and put my inference optimization knowledge to the test.

snow • 1 year ago

I am happy to say that I feel I achieved both to some degree. This project is really close to my heart since I worked on it through many highs and lows. I hope you enjoy using it as much as I enjoyed writing it. If you love moondream, you will love this project.

snow • 1 year ago

@vikhyatk special thanks to @_vatsadev @httpslinus @k7agar @seatedro @dnbt777 @ParsaKhaz @AdjectiveAlli @sasuke___420 @Guudfit @Aryvyo @felix_red_panda @kalomaze and more on X for giving me ideas, testing my implementation, helping me bench my code and debugging!

snow • 1 year ago

@vikhyatk also, shoutout to the huggingface ML and Tokenizers team, specifically @alvarobartt and @art_zucker, for going through my code and testing my fast tokenizer. you guys are the best!

ronin • 1 year ago

@moondreamai @vikhyatk MY GOATTTTTTTTT

snow • 1 year ago

@moondreamai @vikhyatk I kneel thanks for testing my code my king
