First step was getting llama.cpp to play nice with Electron so that the model I fine-tuned can run on the client. Hit a couple of errors, but eventually got it wired up with the node-llama-cpp bindings. This way the model + app can be shipped to the user...
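One way the wiring described above might look: the model runs in Electron's main process and the renderer asks for completions over IPC. This is only a sketch, not the author's code. The channel name `llm:prompt` and the function `registerPromptHandler` are hypothetical, and the two interfaces are minimal stand-ins shaped like Electron's `ipcMain.handle` and a node-llama-cpp chat session's `prompt` method, so the snippet type-checks without either package installed.

```typescript
// Minimal shapes so the sketch compiles on its own.
// In a real app these would be Electron's `ipcMain` and a node-llama-cpp chat session.
interface IpcLike {
  handle(channel: string, listener: (event: unknown, ...args: any[]) => unknown): void;
}

interface PromptSession {
  prompt(text: string): Promise<string>;
}

// Hypothetical channel name, chosen for illustration.
const PROMPT_CHANNEL = "llm:prompt";

// Registers a handler in the main process that forwards prompts to the local model.
function registerPromptHandler(ipc: IpcLike, session: PromptSession): void {
  ipc.handle(PROMPT_CHANNEL, async (_event, text: unknown) => {
    // Validate input coming from the renderer before it reaches the model.
    if (typeof text !== "string" || text.trim() === "") {
      throw new Error("prompt must be a non-empty string");
    }
    return session.prompt(text);
  });
}
```

On the renderer side, the matching call would be something like `ipcRenderer.invoke("llm:prompt", "hello")`, usually exposed through a preload script rather than called directly.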

16,499 views • 2 years ago • via X (Twitter)

10 Comments

anton • 2 years ago

The nice thing about llama.cpp is that the user will be able to run inference on CPU or GPU (CUDA, or Metal on Mac), depending on what they have.
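To make the CPU/GPU fallback concrete, here is a small sketch of how an app might decide which backend to request. Everything here is an assumption for illustration: the `pickBackend` function, the `HostInfo` shape, and the `hasNvidiaGpu` flag are hypothetical, and treating every Mac as Metal-capable is a simplification. How the result maps onto the bindings' actual options depends on the node-llama-cpp version in use.

```typescript
type Backend = "metal" | "cuda" | "cpu";

interface HostInfo {
  platform: string;      // e.g. the value of process.platform: "darwin", "win32", "linux"
  hasNvidiaGpu: boolean; // hypothetical flag; real GPU detection is out of scope here
}

// Prefer a GPU backend when one is plausible, otherwise fall back to CPU.
function pickBackend(host: HostInfo): Backend {
  if (host.platform === "darwin") return "metal"; // simplifying assumption: all Macs use Metal
  if (host.hasNvidiaGpu) return "cuda";
  return "cpu";
}
```

For example, `pickBackend({ platform: "linux", hasNvidiaGpu: false })` falls through to `"cpu"`, which is the case that keeps the app usable on machines with no supported GPU.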

anton • 2 years ago

Stack is electron-vite, React, and llama.cpp via the node-llama-cpp bindings. The model is still TBD, but I'm currently working with a fine-tuned Qwen2 500M.

Stocko 👊🤖 • 2 years ago

wow, that’s amazingly fast

Alloy🐍🍀 • 2 years ago

Isn't this going to be a massive download or are you downloading the model within the client app and then working "offline"?

anton • 2 years ago

The app will ship without the model, which will be downloaded after you install it. How big is the app without the model file? It's 227 MB (will work on bundle size later, honestly).
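The download-after-install flow could be sketched roughly like this. It is an illustration under assumptions, not the app's actual code: the file name `model.gguf`, the `models` subfolder, and the `ensureModel` and `Downloader` names are all hypothetical, and the download itself is passed in as a function so the sketch stays independent of any particular HTTP client.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical file name; the thread says the model choice is still TBD.
const MODEL_FILE = "model.gguf";

// Caller-supplied function that writes the model to `destination`.
type Downloader = (destination: string) => Promise<void>;

// Returns the local model path, downloading the file only if it is not there yet.
async function ensureModel(userDataDir: string, download: Downloader): Promise<string> {
  const modelPath = path.join(userDataDir, "models", MODEL_FILE);
  if (fs.existsSync(modelPath)) return modelPath; // already downloaded on an earlier run

  fs.mkdirSync(path.dirname(modelPath), { recursive: true });

  // Download under a temporary name first, so an interrupted download
  // is never mistaken for a complete model file.
  const partialPath = `${modelPath}.part`;
  await download(partialPath);
  fs.renameSync(partialPath, modelPath);
  return modelPath;
}
```

In an Electron app, `userDataDir` would typically come from `app.getPath("userData")`, so the model lives outside the app bundle and survives updates.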

Yam Peleg • 2 years ago

Very nice work! An integration like this, done well, has amazing potential.

nigh8w0lf • 2 years ago

Looks nice! Llamafile, but with a cleaner JS interface.

Caleb • 2 years ago

I’m really intrigued by using Transformers.js to do code autocomplete or something in the browser. Excited to follow along on this.

el • 2 years ago

What machine is this on?

Ravi Chandra Veeramachaneni • 2 years ago

@abacaj Have you tried or considered Swift for this purpose? Lately I've been seeing lots of apps coming out built on native Swift and its cross-platform bindings.
