正在加载视频...
视频加载失败
LET'S GO! Cursor using local 🤗 transformers models! You can now test ANY transformers-compatible LLM against your codebase. From hacking to production, it takes only a few minutes: anything `transformers` does, you can serve into your app 🔥 Here's a demo with Qwen3 4B:
11 条评论

We've iterated on our local `transformers serve`, a server with `transformers` backend, and it now supports more advanced requests -- including the requests from Cursor. Testing new models, quantization methods, KV caches, decoding methods, (...) should be much easier now 🫶

👉5-minute instructions to replicate this demo: (this link will die at some point, and the following will work: 👉The PR where it happened:

If this works as well as it sounds, it’s really going to make so many things possible.

Spent much of the day working with LM Studio and Ollama, Mistral 7b. They are getting a lot better. I'm doing a serious build for local AI next month.

This has been possible for a while but Cursor still makes calls out to their servers. What happens if you turn off your internet connection?

To go fully offline, a different IDE has to be used :(

How well does this fully work? IIRC last I checked @cursor_ai strongly advised against doing self-hosted models? @srush_nlp has that changed?

@ClementDelangue Does that mean I can run cursor fully offline, and point at a local endpoint on my network?

@ClementDelangue Sadly no -- Cursor makes requests through their server (i.e. your request + codebase -> cursor server -> llm -> cursor server -> your cursor app) The best would be to use a different IDE.

It’s either any open-ai compatible endpoint and token+model settings and no data being sent to cursor’s servers or nothing

sadly data still goes to cursor :( see my comment here







