I recently referenced Cursor’s current tl;dr privacy policy in this post:
And I often wonder what the relationship is between these settings and actions:
01) Privacy Mode Enabled (which I toggle ON)
02) Codebase Indexing (which I also toggle ON)
Is item 01 negated/overridden if I enable item 02?
My, possibly incorrect, understanding after writing this post is that:
- Item 02 does not negate item 01
- Item 02 does involve storing the vector embeddings of your code in a vector database, but not the code itself
I’m still not sure why the vector embeddings aren’t considered as important as the code - aren’t they just multidimensional numerical representations of the code text, and therefore could be ‘un-embedded’ by using the same embedding model that was used to embed them?
I googled this:
Can vector embeddings be converted back?
And it led to things like:
- One OpenAI user’s guess about how embeddings work
- Another OpenAI user’s reference to a paper that seemingly shows that vector embeddings can be inverted
- Microsoft’s article on What are Vector Embeddings?
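To make the question a bit more concrete, here is a toy Python sketch. It is not Cursor's embedding model or anything like a production one: `toy_embed`, `best_guess`, the 64-dimension size and the sample snippets are all made up for illustration. It shows two things: an embedding is a fixed-size list of numbers regardless of input length, so you cannot simply run the model "backwards" to get the text out, but someone holding the stored vector and the same model can still score candidate texts against it and pick the closest match.

```python
import hashlib
import math

DIM = 64  # toy dimensionality, chosen arbitrarily for this sketch


def toy_embed(text: str, dim: int = DIM) -> list[float]:
    """Hypothetical stand-in for an embedding model.

    Hashes each character trigram into one of `dim` buckets, then
    L2-normalises the counts. Real models are learned neural networks.
    """
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        trigram = text[i:i + 3]
        digest = hashlib.sha256(trigram.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity for vectors that are already L2-normalised."""
    return sum(x * y for x, y in zip(a, b))


def best_guess(stored_vector: list[float], candidates: list[str]) -> str:
    """Return the candidate whose embedding is closest to the stored vector."""
    return max(candidates, key=lambda c: cosine(stored_vector, toy_embed(c)))


# Only the vector is "stored"; the original text is not kept alongside it.
secret = "def charge(card, amount): return gateway.pay(card, amount)"
stored = toy_embed(secret)

# Someone with the same model and a pool of guesses can still match it.
candidates = [
    "def add(a, b): return a + b",
    "def charge(card, amount): return gateway.pay(card, amount)",
    "print('hello world')",
]
guess = best_guess(stored, candidates)
```

This only demonstrates matching against known candidates. Whether real embeddings can be inverted without any candidates is the open part of the question, and this sketch does not answer it.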
I’ll leave it to someone with more knowledge than me to provide an authoritative answer.
In regard to your question about enforcing Privacy Mode, I am a Pro user and I can toggle that setting on or off, so I am assuming the Business plan enables admins to enforce this setting for all their users.
Links for additional reference:
For better and more accurate codebase answers…you can index your codebase. Behind the scenes, Cursor computes embeddings for each file in your codebase, and will use these to improve the accuracy of your codebase answers.
https://docs.cursor.com/context/codebase-indexing
Does indexing the codebase require storing code?
It does not! If you choose to index your codebase, Cursor will upload your codebase in small chunks to our server to compute embeddings, but all plaintext code ceases to exist after the life of the request.
The embeddings and metadata about your codebase (hashes, file names) are stored in our database, but none of your code is.
https://docs.cursor.com/miscellaneous/privacy#does-indexing-the-codebase-require-storing-code
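As a rough mental model of what that answer describes, here is a minimal Python sketch. It is an assumption-laden illustration, not Cursor's actual pipeline: the chunk size, the record fields other than hashes and file names, and the `placeholder_embed` function are all hypothetical. The point is only that a stored record can hold a hash, a file name and a vector without holding the plaintext chunk.

```python
import hashlib


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into fixed-size chunks (the size is an arbitrary assumption)."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def placeholder_embed(chunk: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for a real embedding model: digest bytes as floats."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def index_file(file_name: str, text: str) -> list[dict]:
    """Build the records that would be persisted; the plaintext is never added."""
    records = []
    for position, chunk in enumerate(chunk_text(text)):
        records.append({
            "file_name": file_name,
            "chunk_index": position,
            "chunk_hash": hashlib.sha256(chunk.encode("utf-8")).hexdigest(),
            "embedding": placeholder_embed(chunk),
        })
        # The chunk itself is dropped here, mirroring "plaintext code ceases
        # to exist after the life of the request".
    return records


source = "def greet(name):\n    return f'hello {name}'\n" * 10
records = index_file("greet.py", source)
```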
With Privacy Mode, none of your code will ever be stored by us or any third-party (except for OpenAI which persists the prompts we send to them for 30 days for trust and safety, unless you’re on the business plan). Otherwise, we may save prompts / collect telemetry data to improve Cursor.
https://docs.cursor.com/miscellaneous/privacy#what-is-privacy-mode
Posts for additional reference: