Controlling LLM Context Exposure in Cursor: .cursorignore, shell commands, and fine-grained allowlists

Hi everyone,

We’re currently evaluating Cursor in an environment with very strict requirements on which data may enter the LLM context. In short: some files in our repositories must never be consumed by the LLM under any circumstances.

Background

We rely on .cursorignore to exclude sensitive files from the context, but in practice this protection seems limited to Cursor’s indexing / file-selection layer. Once shell commands are involved, the situation changes:

  • Cursor can (and does) generate shell commands such as cat, grep, find, head, tail, etc.
  • These commands can read files that are excluded via .cursorignore.
  • This can also happen autonomously, e.g. when Cursor analyzes tool output or tries to inspect files it believes are relevant.
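For concreteness, here is a hypothetical .cursorignore (the file and directory names are illustrative, not from our actual setup):

```
# .cursorignore — excluded from Cursor's indexing / file selection
secrets.env
config/credentials/
*.pem
```

Even with this in place, an agent-generated command like cat secrets.env, run in the integrated terminal, reads the file, and its contents can then land in the model context.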

We have experimented with several mitigation strategies:

  • Dev Containers / sandboxing
  • MCP servers with tightly controlled APIs for common tasks

While these approaches help with execution safety, they do not fully solve the problem of LLM context exposure, because files inside the workspace can still be read and their contents potentially end up in the model context.
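To show the idea behind the tightly controlled MCP approach, here is a minimal sketch of a guarded file-read function. All names and patterns are hypothetical, and the pattern matching is a naive fnmatch approximation (real .cursorignore semantics are gitignore-like and richer, with anchoring, negation, and directory rules):

```python
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical deny patterns mirroring a .cursorignore file.
# Naive matching only; not full gitignore semantics.
IGNORE_PATTERNS = ["secrets.env", "config/credentials/*", "*.pem"]

def is_ignored(rel_path: str) -> bool:
    """Return True if the workspace-relative path matches a deny pattern."""
    return any(fnmatch(rel_path, pat) for pat in IGNORE_PATTERNS)

def safe_read(workspace: Path, rel_path: str) -> str:
    """Serve file contents only if the path is not denied."""
    # Resolve and confine to the workspace to block ../ escapes.
    target = (workspace / rel_path).resolve()
    if not target.is_relative_to(workspace.resolve()):
        raise PermissionError(f"path escapes workspace: {rel_path}")
    if is_ignored(rel_path):
        raise PermissionError(f"path is excluded from LLM context: {rel_path}")
    return target.read_text()
```

The catch, as noted above, is that this only protects reads that go through the controlled API; a plain shell cat in the same workspace bypasses it entirely.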

As of today, the only way we can be confident that only intended data reaches the LLM is to manually approve every single command, which obviously limits automation and autonomy quite a bit.


Questions to the community

We would really appreciate learning how others handle this in practice:

  1. .cursorignore enforcement
    • How do you ensure that files excluded via .cursorignore cannot still be read via shell commands like cat, grep, etc.?
    • Are there workflows or configurations that reliably prevent this kind of leakage?
  2. Fine-grained allowlists
    • Are there examples of granular allowlists for shell commands?
    • For instance: allowing find, but explicitly disallowing -exec or similar flags that can spawn processes or execute arbitrary code.
    • Is anyone using patterns or setups that provide this level of control in a robust way?
  3. General patterns
    • Are there established best practices for running Cursor in high-compliance or high-sensitivity environments where LLM context boundaries are critical?
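On question 2, the kind of check we have in mind could be sketched as a token-level validator (a hypothetical policy, default-deny; this is not something Cursor provides, and token-level checks are bypassable, so real enforcement would still need a proper shell parser or an OS-level sandbox):

```python
import shlex

# Hypothetical policy: each allowed command maps to the flags it must
# NOT use. Any command not listed here is denied (default-deny).
POLICY = {
    "find": {"-exec", "-execdir", "-ok", "-okdir", "-delete"},
    "ls": set(),
}

# Shell metacharacters that could chain or spawn extra commands.
META = set("|&;<>`$()")

def check_command(cmdline: str) -> bool:
    """Conservative check: allowlisted command, no forbidden flags,
    no shell metacharacters anywhere in the tokens."""
    try:
        tokens = shlex.split(cmdline)
    except ValueError:
        return False  # e.g. unbalanced quotes
    if not tokens or tokens[0] not in POLICY:
        return False
    if any(ch in META for tok in tokens for ch in tok):
        return False
    return all(tok not in POLICY[tokens[0]] for tok in tokens[1:])
```

This allows find . -name '*.py' but rejects anything with -exec, pipes, or command substitution. It still does nothing about the core problem (find output itself can leak file contents into the context via -printf or simply by surfacing paths), which is why we are asking whether anyone has a more robust setup.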

At the moment, it feels hard to reconcile:

  • strong guarantees about what the LLM is allowed to see, and
  • Cursor’s goal of being highly autonomous and proactive.

We’d love to hear how others are approaching this, or if there are recommended solutions (or upcoming features) we might be missing.

Thanks a lot!
