Concerns about Privacy Mode and Data Storage

Hi everyone,

I’m currently using Cursor.sh to assist with my development work on a client project. My client has raised concerns about potential NDA violations and data protection, even though I’ve been using Privacy Mode to prevent data collection.

Specifically, my client wants assurance that enabling Privacy Mode ensures that no data is saved or stored in Cursor.sh’s database when scanning files on my local computer. I’ve reached out to the Cursor.sh team for documentation or evidence to verify this, but I was wondering if anyone here has dealt with similar concerns or has any insights.

Have any of you had to prove to your clients that Privacy Mode effectively prevents data storage? How did you go about getting this assurance? Any documentation, experiences, or advice would be greatly appreciated.

Thank you in advance for your help!


You can read the new privacy page.
Click the [ Full Policy ] toggle.

In addition, the Business plan includes OpenAI’s zero-data retention policy.

OpenAI’s zero-data retention policy refers to a practice where OpenAI does not retain any personal data or conversation history provided by users through its services. This policy ensures that:

  1. No Data Storage: Any data sent to OpenAI’s servers is not stored permanently. Once the interaction is completed, the data is deleted and not saved in any database or storage system.

  2. Privacy Protection: This policy enhances user privacy by ensuring that their information is not retained or accessible for future use. It minimizes the risk of data breaches and unauthorized access.

  3. Compliance: It helps in complying with various privacy laws and regulations that require minimizing data retention and protecting user privacy.

  4. Security: By not retaining data, OpenAI reduces potential security vulnerabilities associated with data storage.

Overall, OpenAI’s zero-data retention policy is designed to protect user privacy and data security by ensuring that personal data is not stored or retained beyond the duration of the interaction.


Thanks @kinopee, but I need proper evidence to prove that when I turn Privacy Mode on, Cursor won’t store any of my code in its database. The privacy page alone isn’t enough to convince my client :cry:

This is a privacy nightmare. Under no circumstances should you use Cursor without getting your client’s okay. Not only will they pass the prompt to other servers, they will also proxy requests through their own servers, so you cannot guarantee data residency will be honoured. This is the major limitation of Cursor. I doubt anyone in the EU doing any kind of business with customer data can use this.

Just adding the recent tl;dr version of the privacy policy at:

https://www.cursor.com/privacy

TLDR

  • If you enable “Privacy Mode” in Cursor’s settings, none of your code will ever be stored by us or any third-party (except for OpenAI and Anthropic, which persist the prompts we send to them for 30 days for trust and safety. Business plan users’ data will not be retained at all by OpenAI or Anthropic.)

  • If you choose to keep Privacy Mode off, we may save prompts / collect telemetry data to improve the product. If you use autocomplete, Fireworks (our inference provider) may also collect prompts to improve inference speed.

Other notes

  • Even if you use your API key, your requests will still go through our backend! That’s where we do our final prompt building.

  • If you choose to index your codebase, Cursor will upload your codebase in small chunks to our server to compute embeddings, but all plaintext code ceases to exist after the life of the request. The embeddings and metadata about your codebase (hashes, file names) may be stored in our database, but none of your code is.
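To make that indexing flow concrete, here’s a rough sketch of what it could look like (purely illustrative, not Cursor’s actual code; `chunk_file`, `embed_chunk`, and the in-memory `store` are made-up stand-ins): the plaintext chunk is used only to compute a vector and a hash, and only the vector plus metadata is persisted.

```python
# Hypothetical sketch of the indexing flow described above -- NOT Cursor's
# actual code. embed_chunk() and the in-memory store are stand-ins.
import hashlib
from typing import Dict, List

def chunk_file(text: str, size: int = 512) -> List[str]:
    """Split a source file into small chunks for embedding."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed_chunk(chunk: str) -> List[float]:
    """Stand-in for the embedding call; returns a short dummy vector."""
    return [float(b) / 255 for b in hashlib.sha256(chunk.encode()).digest()[:8]]

def index_file(path: str, text: str, store: Dict[str, dict]) -> None:
    """Persist only embeddings + metadata (hash, file name), never plaintext."""
    for chunk in chunk_file(text):
        vector = embed_chunk(chunk)                         # plaintext used here...
        chunk_hash = hashlib.sha256(chunk.encode()).hexdigest()
        store[chunk_hash] = {"file": path, "embedding": vector}
        # ...and dropped after the request: only the vector and metadata remain.

store: Dict[str, dict] = {}
index_file("main.py", "print('hello world')\n", store)
print(list(store.values())[0]["file"])   # metadata is kept; the chunk text is not
```

The point for a nervous client is the last lines: what remains after indexing is a vector and a file name, not source text.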

I don’t get the sense that the Cursor team are ‘bad guys’; they always seem to be open and transparent about what they are building, how they are doing it, and how their services work. But for their benefit, and that of users, I’d love to see a published checklist of business security ‘must haves’ to facilitate adoption and onboarding and save everyone’s time in these discussions.

To be honest, I find even the big players aren’t very good at explaining their security practices precisely; I’ve already had too many discussions essentially speculating about what, exactly, is occurring, because the docs aren’t clear enough.


I’m interested in the Enterprise package if they get Claude back online.

One of the questions I have is whether there is a way to remove the index / embeddings on the server. I know there is the delete-index option in the editor, but this does not delete the embeddings on the server (the resync happens immediately).

Was also wondering what happens when I delete my account in the profile settings. Do the embeddings then get removed from the server?

Did this change? Does anyone know if they removed this clause:

(except for OpenAI and Anthropic, which persist the prompts we send to them for 30 days for trust and safety. Business plan users’ data will not be retained at all by OpenAI or Anthropic.)

Correct! Zero data retention with OpenAI and Anthropic is now available for all users who have Privacy Mode enabled, regardless of whether they are on the Business plan or not.


Is it possible to update this as well? It seems like it is still exclusive to the Business plan.


You have to have more than one user, so at least 2 people for $80 per month.

It comes with the free tier:

[Screenshot of the Cursor settings]

Don’t forget the fact that Cursor is SOC 2 Certified.


The computed embeddings from your code are stored on Cursor’s servers and contain a lot of information about your code. They currently don’t seem to address this concern and don’t treat these embeddings as their customers’ data. That is problematic; they need to fix it and give their customers more control over it.

Here is the section from their current privacy policy (Privacy Policy | Cursor - The AI-first Code Editor):

If you choose to index your codebase, Cursor will upload your codebase in small chunks to our server to compute embeddings, but all plaintext code ceases to exist after the life of the request. The embeddings and metadata about your codebase (hashes, file names) may be stored in our database, but none of your code is.

Until they do so, this is not really enterprise ready.

Here are some more questions that need clarification before our security folks can allow its usage at work:

That said, they have done a very good job on the security side; their explanation of their security story is one of the best I have seen.


Expectations of privacy are just that: expectations. You especially have to wonder about OpenAI. Not only was the string of senior leadership that quit earlier this fall strange, along with Sam’s weird identity crisis as the leader, but the former head of the NSA also joined the OpenAI board of directors.

When sending anything to Anthropic or OpenAI, I think you can be assured that, no matter what their respective corporate policies say, your prompts become the property of the NSA. Unless you are writing rules for deep-packet DNS inspection, you won’t be affected by it.

The calls to the model APIs are less concerning.

For enterprise usage, you can use the Azure versions of the OpenAI models, or the Google Cloud Vertex AI versions of the Google and Anthropic models, which come with top-level security and data handling protections.

That should be good enough for most enterprise customers for that part.
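For example, here’s a minimal sketch of calling a model through Azure OpenAI with the official openai Python SDK (assuming you’ve already provisioned an Azure OpenAI resource and a deployment; the endpoint, deployment name, and environment variable names below are placeholders):

```python
# Minimal sketch: calling a model via Azure OpenAI instead of the public API.
# The endpoint, deployment name, and environment variables are placeholders
# for whatever your organisation has provisioned.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://<resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="my-gpt4o-deployment",  # your Azure deployment name, not the public model name
    messages=[{"role": "user", "content": "Summarise our data-retention options."}],
)
print(response.choices[0].message.content)
```

The same idea applies to Vertex AI for the Google and Anthropic models: requests go through your own cloud project and its data handling agreements rather than the public APIs.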