Check out mistral.rs, our #Rust-based open source inference engine allowing for fast #LLM serving for a variety of architectures including X-LoRA mixture-of-expert (MoE) models, Llama-3, Mistral/Mixtral, Gemma & many others. Built on the Hugging Face #Candle framework for #Rust w/ custom CUDA kernels in the backend (as well as...
73,575 views • 2 years ago • via X (Twitter)
Comments: 10

@huggingface This is really cool! I thought it was limited to Mistral-related LLMs, though. Maybe it's not too late to change the name to something generic so it doesn't confuse people?

@huggingface Yes, the tool works with many architectures, including Llama-3, Phi-3 (up to 128k context), Mistral/Mixtral, Gemma, X-LoRA mixture-of-expert (MoE) models, & many others. Also features in-situ quantization to any level.

@huggingface Awesome! More apps like this written in Rust 😍

@huggingface This looks really cool. I’ve been trying to figure out optimal branching in rust candle. Nice job.

@huggingface wow!

@huggingface 🦀

@huggingface Does it support GPU hardware acceleration?

@huggingface Impressive work!

@huggingface Thank you!

@_philschmid @huggingface Is this affiliated with Mistral AI? The name certainly implies it.


