First step was getting llama.cpp to play nice with Electron so that we can run the model I fine-tuned on the client. Hit a couple of errors, but eventually got it wired up with the node-llama-cpp bindings. This way the model + app can be shipped to the user...

16,499 views • 2 years ago • via X (Twitter)

10 comments

anton • 2 years ago

The nice thing about llama.cpp is that the user will be able to run inference on CPU or GPU (CUDA, plus Metal on Mac), depending on what they have.
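To make the CPU/GPU point concrete, here is a minimal sketch of how an app might decide which backend to ask for. Everything in it is an assumption for illustration: `pickBackend`, `HostInfo`, and `hasNvidiaGpu` are hypothetical names, not part of llama.cpp or node-llama-cpp, and the rule "Metal only on Apple Silicon" is a simplification.

```typescript
// Hypothetical helper, not an API of llama.cpp or node-llama-cpp.
type Backend = "metal" | "cuda" | "cpu";

interface HostInfo {
  platform: string;      // e.g. the value of process.platform: "darwin", "win32", "linux"
  arch: string;          // e.g. the value of process.arch: "arm64", "x64"
  hasNvidiaGpu: boolean; // assumed to be detected elsewhere in the app
}

function pickBackend(host: HostInfo): Backend {
  // Simplifying assumption: treat Apple Silicon Macs as the Metal case.
  if (host.platform === "darwin" && host.arch === "arm64") return "metal";
  // CUDA needs an NVIDIA GPU and is not available on macOS.
  if (host.platform !== "darwin" && host.hasNvidiaGpu) return "cuda";
  // Everything else falls back to CPU inference.
  return "cpu";
}
```

The result would then be handed to whatever option the bindings expose for GPU selection; falling back to `"cpu"` keeps the app usable on machines with no supported GPU.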

anton • 2 years ago

Stack is electron-vite, React, and llama.cpp via the node-llama-cpp bindings. The model is still TBD, but I'm currently working with a fine-tuned Qwen2 500M.

Stocko 👊🤖 • 2 years ago

wow, that’s amazingly fast

Alloy🐍🍀 • 2 years ago

Isn't this going to be a massive download or are you downloading the model within the client app and then working "offline"?

anton • 2 years ago

The app will ship without the model, which will be downloaded after you install it. How big is the app without the model file? It's 227 MB (will work on bundle size later, honestly).
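A download-after-install flow like this could be sketched roughly as follows. This is not the author's code: `needsDownload`, `ensureModel`, and the `Downloader` type are hypothetical names, the expected file size is assumed to be known ahead of time, and the actual network transfer is injected as a function so the sketch stays independent of any particular HTTP client.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical signature: writes the model bytes to `dest` (e.g. via an HTTP download).
type Downloader = (dest: string) => Promise<void>;

// True when the file is missing, unreadable, or not the expected size.
function needsDownload(modelPath: string, expectedBytes: number): boolean {
  try {
    return fs.statSync(modelPath).size !== expectedBytes;
  } catch {
    return true;
  }
}

// Returns the model path, downloading the file first if it is not already in place.
async function ensureModel(
  modelPath: string,
  expectedBytes: number,
  download: Downloader
): Promise<string> {
  if (!needsDownload(modelPath, expectedBytes)) return modelPath;

  await fs.promises.mkdir(path.dirname(modelPath), { recursive: true });

  // Download to a temporary name so an interrupted transfer never looks complete.
  const partial = `${modelPath}.part`;
  await download(partial);

  if (needsDownload(partial, expectedBytes)) {
    await fs.promises.rm(partial, { force: true });
    throw new Error("model download incomplete or wrong size");
  }

  await fs.promises.rename(partial, modelPath);
  return modelPath;
}
```

After the first successful run the file is already on disk, so later launches skip the download and the app can work offline. A size check is only a cheap sanity test; a checksum would be a stronger guarantee.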

Yam Peleg • 2 years ago

very nice work! an integration like this done well has amazing potential

nigh8w0lf • 2 years ago

Looks nice! Llamafile but with a cleaner JS interface.

Caleb • 2 years ago

I’m really intrigued by the idea of using Transformers.js to do code autocomplete or something in the browser. Excited to follow along on this.

el • 2 years ago

what machine is this on?

Ravi Chandra Veeramachaneni • 2 years ago

@abacaj Have you tried or considered Swift for this purpose? Lately I've been seeing lots of apps built with native Swift and its cross-platform bindings.
