first step was getting llama.cpp to play nice with Electron so that we can run the model I fine-tuned on the client. hit a couple of errors, but eventually got it wired up with the node-llama-cpp bindings. this way the model + app can be shipped to the user...
16,499 views • 1 year ago • via X (Twitter)
10 comments

the nice thing about llama.cpp is that the user will be able to run inference on CPU, or on GPU (CUDA, or Metal on Mac) if they have one
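To make the CPU/GPU fallback concrete, here is a minimal sketch of how an app might decide which backend to ask for. Everything in it is hypothetical: `pickBackend`, `HostInfo`, and the preference order are illustrative names and choices, not part of llama.cpp or node-llama-cpp, and GPU detection is assumed to happen elsewhere. As far as I know, node-llama-cpp exposes a `gpu` option on `getLlama` for this purpose, but check its documentation for the accepted values.

```typescript
// Hypothetical helper: the names and the preference order are illustrative only.
type Backend = "metal" | "cuda" | "cpu";

interface HostInfo {
  platform: string;      // same values as Node's process.platform, e.g. "darwin", "linux", "win32"
  hasNvidiaGpu: boolean; // assumed to be detected elsewhere (driver probe, user setting, ...)
}

function pickBackend(host: HostInfo): Backend {
  // On macOS, prefer Metal.
  if (host.platform === "darwin") return "metal";
  // Elsewhere, use CUDA only when an NVIDIA GPU is reported.
  if (host.hasNvidiaGpu) return "cuda";
  // Otherwise fall back to plain CPU inference.
  return "cpu";
}
```

The result would then be mapped onto whatever the bindings expect, with CPU as the safe default.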

stack is electron-vite, React, and llama.cpp via the node-llama-cpp bindings. model is still TBD, but currently working with a fine-tuned Qwen2 500M

wow, that’s amazingly fast

Isn't this going to be a massive download or are you downloading the model within the client app and then working "offline"?

the app will ship without the model, which will be downloaded after you install it. how big is the app (w/o the model file)? it is 227 MB (will work on bundle size later, honestly)
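The "ship without the model, download after install" flow boils down to a check at startup. A minimal sketch, assuming the model is stored under a per-user data directory; `modelPathFor`, `needsDownload`, the `models` subfolder, and the size check are all hypothetical choices for illustration, not the author's actual code:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical layout: <userDataDir>/models/<fileName>
function modelPathFor(userDataDir: string, fileName: string): string {
  return path.join(userDataDir, "models", fileName);
}

// True when the file is missing or smaller than expected
// (for example, after an interrupted download).
function needsDownload(modelPath: string, expectedBytes: number): boolean {
  if (!fs.existsSync(modelPath)) return true;
  return fs.statSync(modelPath).size < expectedBytes;
}
```

In an Electron main process, `userDataDir` would typically come from `app.getPath("userData")`, and the download itself would only start when `needsDownload` returns true.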

very nice work! an integration like this done well has amazing potential

Looks nice! llamafile, but with a cleaner JS interface.

I’m really intrigued by using Transformers.js to do code autocomplete or something in the browser. Excited to follow along on this

what machine is this on?

@abacaj Have you tried or considered Swift for this purpose? Lately I've been seeing lots of apps coming out of native Swift and its cross-platform bindings.
