Data Retention in the Business Plan

osmiumOs76 · August 28, 2024, 12:21pm

I’m thinking about using your business plan but I would like to clarify the data retention practice.

You say that in privacy mode you won’t retain any data from my repository for longer than the request duration, and that OpenAI/Anthropic will do the same.

Is that also true when I let you index all of my code?

And what does “enforcing” of the privacy mode mean? Is the privacy mode also available in other plans?

Thank you for your info

msc · August 28, 2024, 1:42pm

I’m interested in this as well for the Enterprise package.

osmiumOs76 · August 28, 2024, 3:58pm

What is the Enterprise package? There are only Free, Pro and Business plans on their Pricing page.

msc · August 28, 2024, 4:43pm

I meant Business plan.

osmiumOs76 · August 28, 2024, 4:59pm

OK, thx for the info

litecode · August 28, 2024, 5:45pm

I recently referenced Cursor’s current tl;dr privacy policy in this post:

And I often wonder what the relationship is between these settings and actions:

01) Privacy Mode Enabled (which I toggle ON)

02) Codebase Indexing (which I also toggle ON)

Is item 01 negated/overridden if I enable item 02?

My, possibly incorrect, understanding after writing this post is that:

Item 02 does not negate item 01
Item 02 does involve storing the vector embeddings of your code in a vector database but not the code itself

I’m still not sure why the vector embeddings aren’t considered as important as the code - aren’t they just multidimensional numerical representations of the code text, and therefore could be ‘un-embedded’ by using the same embedding model that was used to embed them?

I googled this:

Can vector embeddings be converted back?

And it led to things like:

One OpenAI user’s guess about how embeddings work
Another OpenAI user’s reference to a paper that seemingly shows that vector embeddings can be inverted
Microsoft’s article on What are Vector Embeddings?

I’ll leave it to someone with more knowledge than me to provide an authoritative answer.

In regard to your question about enforcing Privacy Mode, I am a Pro user and I can toggle that setting on or off, so I am assuming the Business plan enables admins to enforce this setting for all their users.

Links for additional reference:

For better and more accurate codebase answers…you can index your codebase. Behind the scenes, Cursor computes embeddings for each file in your codebase, and will use these to improve the accuracy of your codebase answers.

https://docs.cursor.com/context/codebase-indexing

Does indexing the codebase require storing code?

It does not! If you choose to index your codebase, Cursor will upload your codebase in small chunks to our server to compute embeddings, but all plaintext code ceases to exist after the life of the request.

The embeddings and metadata about your codebase (hashes, file names) are stored in our database, but none of your code is.

https://docs.cursor.com/miscellaneous/privacy#does-indexing-the-codebase-require-storing-code
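
To make the quoted description more concrete, here is a minimal Python sketch of what "store embeddings and metadata, but not the code" could look like. It is only an illustration under my own assumptions, not Cursor's actual implementation: `toy_embed`, `index_chunks`, the record fields and the `chunk_size` value are all hypothetical, and the "embedding" is a crude trigram hash rather than a real model.

```python
import hashlib


def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Hypothetical stand-in for an embedding model (NOT a real one).

    Hashes character trigrams into `dims` buckets and L2-normalizes the counts.
    """
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        gram = text[i:i + 3]
        bucket = int(hashlib.md5(gram.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec


def index_chunks(path: str, source: str, chunk_size: int = 200) -> list[dict]:
    """Split `source` into chunks and keep only derived data for each chunk.

    Assumption: the stored record holds a file name, a hash and an embedding,
    mirroring the "embeddings and metadata (hashes, file names)" wording in the docs.
    """
    records = []
    for start in range(0, len(source), chunk_size):
        chunk = source[start:start + chunk_size]
        records.append({
            "file": path,
            "chunk_hash": hashlib.sha256(chunk.encode()).hexdigest(),
            "embedding": toy_embed(chunk),
        })
        # The plaintext chunk is never added to the record.
    return records


if __name__ == "__main__":
    code = "def add(a, b):\n    return a + b\n" * 20
    for record in index_chunks("a.py", code):
        print(record["file"], record["chunk_hash"][:12], len(record["embedding"]))
```

The sketch only shows that the stored records contain no plaintext. It says nothing about whether embeddings from a real model could be inverted, which is the open question raised earlier in this post.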

With Privacy Mode, none of your code will ever be stored by us or any third-party (except for OpenAI which persists the prompts we send to them for 30 days for trust and safety, unless you’re on the business plan). Otherwise, we may save prompts / collect telemetry data to improve Cursor.

https://docs.cursor.com/miscellaneous/privacy#what-is-privacy-mode

msc · August 29, 2024, 5:18am

Thank you! I’ll be in touch about the enterprise.

osmiumOs76 · August 29, 2024, 2:36pm

Ah, OK. So “enforcing privacy mode” is a central admin feature to enforce it for all users, got it.

Thank you for your comprehensive analysis, I get quite a good picture from this.

I think this is a sufficient level of security for me. These guys are a startup, so I’m a bit afraid that they are not very experienced in securing their systems.

But I’m not too concerned that someone would put in the time and effort to reconstruct our code from the embeddings, even if there is a breach and someone gets their hands on the data.

It would be a different story if our code were stored in plain text somewhere on their servers.

Thanks a lot
