Check out mistral.rs, our #Rust-based open source inference engine allowing for fast #LLM serving for a variety of architectures including X-LoRA mixture-of-expert (MoE) models, Llama-3, Mistral/Mixtral, Gemma & many others. Built on the Hugging Face #Candle framework for #Rust w/ custom CUDA kernels in the backend (as well as...

73,575 views • 2 years ago • via X (Twitter)

10 Comments

nicoarq • 2 years ago

@huggingface This is really cool! I thought it was limited to Mistral-related LLMs, though. Maybe it's not too late to change the name to something generic so it doesn't confuse people?

Markus J. Buehler • 2 years ago

@huggingface Yes, the tool works with many architectures, including Llama-3, Phi-3 (up to 128k context), Mistral/Mixtral, Gemma, X-LoRA mixture-of-expert (MoE) models, & many others. It also features in-situ quantization to any level.

Ingo Villnow (DM5DK) 🇺🇦🌻 • 2 years ago

@huggingface Awesome! More apps like this written in Rust 😍

Jonathan Eisenzopf • 2 years ago

@huggingface This looks really cool. I’ve been trying to figure out optimal branching in Rust Candle. Nice job.

Thomas Wolf • 2 years ago

@huggingface wow!

PΔBLØ ᄃΞ • 2 years ago

@huggingface 🦀

Alexander Ocsa • 2 years ago

@huggingface Does it support GPU hardware acceleration?

Nicolas Patry • 2 years ago

@huggingface Impressive work!

Markus J. Buehler • 2 years ago

@huggingface Thank you!

iandanforth 🦋 @iandanforth.bsky.social • 2 years ago

@_philschmid @huggingface Is this affiliated with Mistral AI? The name certainly implies it.
