I’ve been exploring the idea of using a locally installed Large Language Model (LLM) with Cursor instead of relying on cloud-based services. I’m particularly interested in using a Llama LLM for coding in the future. Has anyone else considered this or know if it’s feasible?
Current Situation
From what I understand, Cursor is designed to work with cloud-based AI services, specifically OpenAI’s API and Claude AI. This means we’re currently limited to using these cloud-based LLMs.
Potential Benefits of Local LLMs (like Llama)
Privacy: Keep sensitive code data local
Customization: Potentially fine-tune models for specific coding styles or domains
Offline Use: Work without an internet connection
Cost: Possibly reduce long-term costs for heavy users
Flexibility: Use open-source models like Llama that can be adapted for specific needs
Questions for the Community
Is there any way to integrate a local LLM (such as Llama) with Cursor currently?
If not, is this a feature the Cursor team has considered implementing?
What challenges might prevent this integration?
Would you be interested in using a local LLM with Cursor if it were possible?
Has anyone experimented with Llama or other open-source LLMs for coding tasks? What was your experience?
I’m planning to install a Llama LLM for coding in the future, and I’m curious if others have similar plans or experiences. If anyone from the Cursor development team is reading, we’d love to get your perspective on potentially supporting local LLMs alongside the current cloud-based options.
I have used a custom API with Cursor. You can use any API that follows the OpenAI API schema, including local ones. One small problem though: Cursor does not recognize localhost addresses. I've worked around this by tunnelling the local address through a cloud tunnel such as ngrok, then putting the URL ngrok gives me into the custom API endpoint in Cursor's model settings. If it helps, there's a quick sanity check sketched below.
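Here's a minimal sketch of that sanity check, assuming a local OpenAI-compatible server (for example Ollama or a llama.cpp server) exposed with `ngrok http <port>`; the tunnel URL and model name below are placeholders for whatever your setup actually reports. It just confirms the tunnelled endpoint speaks the OpenAI chat-completions schema before you paste it into Cursor's settings:

```python
# Verify the tunnelled endpoint answers OpenAI-style chat-completion requests.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-tunnel.ngrok-free.app/v1",  # placeholder: the URL ngrok prints
    api_key="not-needed-locally",  # most local servers ignore the key, but the client requires one
)

resp = client.chat.completions.create(
    model="llama3.1:70b",  # placeholder: whatever name your local server registers
    messages=[{"role": "user", "content": "Say hello from my local model."}],
)
print(resp.choices[0].message.content)
```

If this prints a completion, the same base URL and model name should work in Cursor's custom API endpoint settings.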
I'm not using a local LLM directly, but I am running a LiteLLM proxy server locally. I use it to reroute custom models to several LLM providers such as TogetherAI, OpenRouter, Infermatic, etc. I frequently use Llama 3.1 70B Instruct and WizardLM-2 8x22B MoE. It can be used for cmd+K and chat, but unfortunately not for Composer or tab completion, as far as I know.
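For anyone curious what that rerouting looks like in code, here's a rough sketch using LiteLLM's Python SDK rather than the proxy server itself; the exact model identifiers and environment-variable names are best-effort assumptions, so check LiteLLM's docs for the providers you actually use:

```python
# One OpenAI-style call, routed to different providers by the model prefix.
import litellm

# Routed to TogetherAI (assumes a TogetherAI API key is set in the environment).
resp = litellm.completion(
    model="together_ai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",  # illustrative model ID
    messages=[{"role": "user", "content": "Write a Python one-liner to reverse a string."}],
)
print(resp.choices[0].message.content)

# Routed to OpenRouter instead (assumes an OpenRouter API key is set).
resp = litellm.completion(
    model="openrouter/microsoft/wizardlm-2-8x22b",  # illustrative model ID
    messages=[{"role": "user", "content": "Same task, different provider."}],
)
print(resp.choices[0].message.content)
```

The proxy server does essentially this behind an OpenAI-compatible endpoint, which is what Cursor's custom API setting talks to.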
Thanks for providing this information. I'm trying to add our LLM (llama3-70b), hosted on our on-prem server, to Cursor through the custom API. I can add the model to Cursor's model list by specifying an OpenAI base URL, but requests fail with "the model does not work with your current plan or api key", which is confusing. So it seems I need to expose it through ngrok as well?
Thanks for sharing. I have a follow-up question for @deanrie and the Cursor team: if we use a custom LLM with Cursor, do we still need a Cursor account or a paid plan? It looks like Cursor is the one actually making the requests to the LLMs.
To use your own API key, you don't need a PRO subscription, but you do need an account on the free plan. You can enter your API keys in the Cursor settings and use them. However, in this case you'll only have access to the main chat and the inline chat (cmd+K), and you won't be able to use Composer or Cursor Tab.
The actual call to the LLM is not made from the Cursor app but from their server (this is my guess, because each query goes through the Cursor server). There might be some RAG optimization or other prompt engineering that only happens there? Or perhaps result parsing to accommodate the git-like change visuals?
I think you're on the right path. I watched an interview with the founders, and they mentioned that codebase indexing happens on their side and is stored in their database. I also think they do some magic with prompts, call caching, etc. on their side, not in your local application. They've also said something about methods to index and transfer only encrypted data, but that doesn't seem to be how it's done right now.
So using Cursor with local LLMs diminishes the main points of using local LLMs?
I don't think offline use works with Cursor. But you can still use a local LLM while online, so that Cursor still uses their API for indexing, apply, etc., while your local LLM handles the main inference.
But the point of this is not to keep things offline; rather, it's to use your own model. For example, say I have a model extensively trained on a modding language for Crusader Kings 3. I'd want to use that model rather than the public GPT/Claude because I've fully fine-tuned it for my use case (points 2 and 5). Or you might have a cheaper, faster model, like the Infermatic ones I use often (point 4).
But yes, point 3 is impossible, and for point 1 we still need to trust Cursor's data policy.