I’ve been exploring the idea of using a locally installed Large Language Model (LLM) with Cursor instead of relying on cloud-based services. I’m particularly interested in using a Llama LLM for coding in the future. Has anyone else considered this or know if it’s feasible?
Current Situation
From what I understand, Cursor is designed to work with cloud-based AI services, specifically OpenAI’s API and Anthropic’s Claude models. This means we’re currently limited to using these cloud-based LLMs.
Potential Benefits of Local LLMs (like Llama)
Privacy: Keep sensitive code data local
Customization: Potentially fine-tune models for specific coding styles or domains
Offline Use: Work without an internet connection
Cost: Possibly reduce long-term costs for heavy users
Flexibility: Use open-source models like Llama that can be adapted for specific needs
Questions for the Community
Is there any way to integrate a local LLM (such as Llama) with Cursor currently?
If not, is this a feature the Cursor team has considered implementing?
What challenges might prevent this integration?
Would you be interested in using a local LLM with Cursor if it were possible?
Has anyone experimented with Llama or other open-source LLMs for coding tasks? What was your experience?
I’m planning to set up a local Llama model for coding in the future, and I’m curious whether others have similar plans or experiences. If anyone from the Cursor development team is reading, we’d love to get your perspective on potentially supporting local LLMs alongside the current cloud-based options.
I have used a custom API with Cursor. You can use any API that follows the OpenAI API schema, including local ones. One small problem, though: Cursor does not recognize localhost addresses. I’ve worked around this by tunnelling the local address through a cloud tunnel such as ngrok, then putting the URL ngrok gives you into the custom API endpoint in Cursor’s model settings.
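For anyone trying this, here’s a minimal sketch of how you might verify the tunnelled endpoint before pointing Cursor at it, using the official `openai` Python client. The ngrok URL and model name below are placeholders for whatever your local server actually exposes.

```python
# Minimal sketch: verify an OpenAI-compatible local server exposed through
# ngrok before entering the URL in Cursor's model settings.
# The base_url and model name below are placeholders, not real values.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-subdomain.ngrok-free.app/v1",  # URL printed by ngrok
    api_key="not-needed-locally",  # many local servers ignore the key entirely
)

response = client.chat.completions.create(
    model="llama3",  # whatever name your local server registers the model under
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)
```

If this round trip works, the same base URL should work as the custom OpenAI base URL in Cursor’s settings.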
I’m not using a local LLM directly; instead, I’m running a LiteLLM proxy server locally. I use it to reroute custom models to several LLM providers such as TogetherAI, OpenRouter, Infermatic, etc. I frequently use Llama 3.1 70B Instruct and WizardLM-2 8x22B MoE. They can be used for Cmd+K and chat, but unfortunately not for Composer or tab completion, as far as I know.
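Since the proxy speaks the OpenAI schema, you can talk to it with the same client. This is only a sketch, assuming the proxy runs on LiteLLM’s default port 4000 and that a model alias like the one below has been configured in its config file:

```python
# Sketch: query a locally running LiteLLM proxy through the OpenAI client.
# Assumes the proxy was started with something like `litellm --config config.yaml`
# and listens on its default port 4000; the model alias below is illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000/v1",  # LiteLLM proxy's default address
    api_key="sk-anything",  # the proxy can run without key verification
)

response = client.chat.completions.create(
    model="llama-3.1-70b-instruct",  # alias the proxy reroutes to a provider
    messages=[{"role": "user", "content": "Write a binary search in Python."}],
)
print(response.choices[0].message.content)
```

As noted above, Cursor can’t reach localhost directly, so the proxy still needs to be tunnelled before Cursor can use it.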
Thanks for providing this information. I’m trying to add our LLM (llama3-70b), hosted on our on-prem server, to Cursor through the custom API option. I can add the model to Cursor’s model list by specifying the OpenAI base URL, but requests fail with "the model does not work with your current plan or api key", which confuses me. So it seems I also need to expose it via ngrok?
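One hedged suggestion: before assuming it’s a plan issue, it may help to confirm the on-prem server really speaks the OpenAI chat-completions schema with a raw request (the hostname, port, and model name below are placeholders). Also, if Cursor’s backend is the one making the request, as discussed below, a server that is only reachable inside your network will fail no matter what, which would explain why the ngrok tunnel is needed.

```python
# Sanity check (hostname, port, and model name are placeholders): confirm an
# on-prem server speaks the OpenAI chat-completions schema, independently of
# Cursor's error message.
import requests

response = requests.post(
    "http://llm.internal.example:8000/v1/chat/completions",  # on-prem base URL
    headers={"Authorization": "Bearer dummy-key"},  # self-hosted servers often accept any key
    json={
        "model": "llama3-70b",
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```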
Thanks for sharing. I have a follow-up question for @deanrie and the Cursor team: if we use a custom LLM with Cursor, do we still need a Cursor account or payment? It looks like Cursor is the one making the requests to the LLMs.
To use your own API key, you don’t need a Pro subscription, but you do need an account on the free plan. You can enter your API keys in the Cursor settings and use them. However, in this case you’ll only have access to the main chat and the inline chat (Cmd+K); you won’t be able to use Composer or Cursor Tab.