Hey! The codebase indexing feature works by:
- Chunking your codebase into small pieces locally
- Sending each piece to our server, which then embeds the code (either with OpenAI’s embedding API or with a custom embedding model)
The embeddings are stored in a remote vector DB, along with the starting / ending line numbers and the relative path to the file. None of your code is stored in our databases. It’s discarded once the request completes.
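
To make the flow above a bit more concrete, here's a rough sketch in Python. It is only an illustration, not the actual implementation: the names `Chunk`, `StoredRecord`, `chunk_lines`, `fake_embed` and `index_chunk` are all made up, the fixed 40-line chunk size is an assumption, and `fake_embed` is a hash-based stand-in for whatever embedding model the server really calls. The point it tries to show is that the stored record has a vector, line numbers and a relative path, but no code text.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass
class Chunk:
    """A piece of a file, built locally (hypothetical shape)."""
    rel_path: str
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    text: str


@dataclass
class StoredRecord:
    """What ends up in the vector DB in this sketch: no source text."""
    rel_path: str
    start_line: int
    end_line: int
    embedding: list[float]


def chunk_lines(rel_path: str, source: str, max_lines: int = 40) -> list[Chunk]:
    """Split a file into fixed-size line ranges (chunk size is an assumption)."""
    lines = source.splitlines()
    chunks = []
    for start in range(0, len(lines), max_lines):
        piece = lines[start:start + max_lines]
        chunks.append(
            Chunk(rel_path, start + 1, start + len(piece), "\n".join(piece))
        )
    return chunks


def fake_embed(text: str, dims: int = 8) -> list[float]:
    """Stand-in for a real embedding model: derives floats from a SHA-256 hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dims]]


def index_chunk(chunk: Chunk) -> StoredRecord:
    """Server-side step in this sketch: embed the text, keep only metadata."""
    vector = fake_embed(chunk.text)
    # chunk.text is not copied into the record; it goes away with the request.
    return StoredRecord(chunk.rel_path, chunk.start_line, chunk.end_line, vector)
```

Calling `chunk_lines("src/app.py", source)` and then `index_chunk` on each result gives one record per chunk, each holding only the path, the line range and the vector.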
You can turn off codebase indexing by going into settings (gear in the top right or “Cursor Settings” in the command palette).
Would be helpful to know if you’d prefer we do anything differently here. Happy to answer any questions.