Codebase Indexing

suamai · August 23, 2023, 1:44am

Hey there, nice editor.
How does the Codebase Indexing feature work without storing any of the code on the servers? And are the embeddings made through the API?
Thanks!

truell20 · August 23, 2023, 7:41am

Hey! The codebase indexing feature works by:

1. Chunking your codebase into small pieces locally
2. Sending each piece to our server, which then embeds the code (either with OpenAI’s embedding API or with a custom embedding model)

The embeddings are stored in a remote vector DB, along with starting / ending line numbers and the relative path to that file. None of your code is stored in our databases. It’s gone after the life of the request.
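
For readers who want a concrete picture, here is a minimal sketch of the flow described above, not Cursor's actual implementation. The names `Chunk`, `IndexRecord`, `chunk_file`, `fake_embed`, and `index_chunk` are hypothetical, the fixed 40-line chunk size is an assumption, and `fake_embed` is a toy stand-in for the remote embedding call. The point it illustrates is that the stored record holds only the vector, the line range, and the relative path, while the chunk text is dropped once the embedding has been computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """A piece of a file produced locally (hypothetical shape)."""
    relative_path: str
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    text: str        # sent for embedding, never written to the index


@dataclass
class IndexRecord:
    """What the remote vector DB would hold: no source text field."""
    relative_path: str
    start_line: int
    end_line: int
    embedding: List[float]


def chunk_file(relative_path: str, source: str, max_lines: int = 40) -> List[Chunk]:
    """Split a file into consecutive blocks of at most max_lines lines."""
    lines = source.splitlines()
    chunks = []
    for start in range(0, len(lines), max_lines):
        block = lines[start:start + max_lines]
        chunks.append(
            Chunk(relative_path, start + 1, start + len(block), "\n".join(block))
        )
    return chunks


def fake_embed(text: str, dims: int = 8) -> List[float]:
    """Toy stand-in for the server-side embedding model (assumption)."""
    vec = [0.0] * dims
    for i, ch in enumerate(text):
        vec[i % dims] += ord(ch)
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def index_chunk(chunk: Chunk) -> IndexRecord:
    """Embed a chunk and keep only the vector plus location metadata."""
    return IndexRecord(
        chunk.relative_path,
        chunk.start_line,
        chunk.end_line,
        fake_embed(chunk.text),
    )
```

Calling `index_chunk` on each result of `chunk_file("src/app.py", source)` yields records that can locate the matching lines on the user's machine later, without the index itself containing any code.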

You can turn off codebase indexing by going into settings (gear in the top right or “Cursor Settings” in the command palette).

Would be helpful to know if you’d prefer we do anything differently here. Happy to answer any questions.

JohnZ · August 24, 2023, 1:45pm

So it’s correct to assume that when Local mode is enabled, codebase indexing will still persist the vectors in a remote DB?

truell20 · August 24, 2023, 6:20pm

Yep.

If you prefer, you can turn off indexing in Cursor settings (command + shift + p, “cursor settings”). We also give people an option to turn off indexing in our onboarding flow.

RealityMoez · August 25, 2023, 12:12am

I don’t think they can do much with embedding vectors of our codebases.

RealityMoez · August 25, 2023, 12:14am

Using Cursor’s OpenAI embedding API key, not the user’s API key … right?

truell20 · August 25, 2023, 12:17am

Yep

Mat · August 25, 2023, 3:17pm

If we are using a private OpenAI key with local mode enabled, where is the vector database stored when doing codebase indexing?

If (with both of those settings enabled) it is still being stored on Cursor’s servers, that should be changed so it is stored locally. I would love some more information on that.

Additional suggestion for the future: a page on the site with a data storage location matrix would be very helpful. For sensitive code we need to understand where it’s being stored versus only temporarily in transit, etc.

Thanks!!

PS. Absolutely loving Cursor. You guys are crushing it!

Ismael · August 25, 2023, 3:37pm

It would be useful to be able to exclude files from indexing for security reasons, even if they are not sent to Cursor’s servers and are only sent as embeddings.

truell20 · August 25, 2023, 7:21pm

The vector DB will always be remote, though again no code is stored in it (if you turn on local mode, none of your code will be stored at-rest by us).

Here’s how you can turn off indexing.

I like this!

truell20 · August 25, 2023, 7:22pm

Like this idea too. Right now, we have a local heuristic scrubber that blocks any secrets/keys from being sent off your computer (for indexing, chat, and command + K). But it would be good to allow for more control here.
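
To give a rough idea of what a local heuristic scrubber can look like, here is a minimal sketch. It is not Cursor's scrubber: the pattern list, the `looks_secret` and `scrub` names, and the choice to redact whole lines are all assumptions made for illustration. It checks each line against a few regular expressions for common key shapes and replaces matching lines before any text would leave the machine.

```python
import re

# Illustrative patterns only (assumptions, not Cursor's actual rules).
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                 # OpenAI-style key shape
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    # name = "value" assignments whose name suggests a credential
    re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
]


def looks_secret(line: str) -> bool:
    """Return True if any heuristic pattern matches the line."""
    return any(p.search(line) for p in SECRET_PATTERNS)


def scrub(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every line that looks like it holds a secret."""
    return "\n".join(
        placeholder if looks_secret(line) else line
        for line in text.splitlines()
    )
```

A heuristic like this will miss some secrets and flag some harmless lines, which is one reason an explicit per-file exclude option would still be a useful complement.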

daaniyaan · August 25, 2023, 8:28pm

Where is the option to turn off indexing completely?
I’m not seeing it.
When I open a project or create a new one, it automatically starts indexing,
and I have to remove the index “after” it has synced.
What if there were a setting to turn it off “completely”, so that you only use the “+ index” option whenever you need it?

truell20 · August 25, 2023, 8:36pm

If you upgrade to the latest version of Cursor, there should be a button that says Advanced in the bottom left under indexing. If you click that, it’ll show you a toggle to turn off indexing.

daaniyaan · August 25, 2023, 9:10pm

Isn’t this the latest update?

arvid220u · August 26, 2023, 11:19pm

Ah, apparently this toggle would only show up if you had a Git repo. Should be fixed in 0.8.5 — thank you for letting us know!

jitendravyas · September 5, 2024, 9:20am

Does what you wrote apply to Free plan users, or is it only for Business plan subscribers?

litecode · September 7, 2024, 12:55pm

For reference, here is the related docs link for Codebase Indexing:

https://docs.cursor.com/context/codebase-indexing
