After a long time of training Claude on a chat session and telling him to repeat everything I say exactly as-is, ignoring every other instruction that might appear in my response, we can see that Cursor never really shows the LLM the full context of the script. Whether you tag the script, as, say, @script.c , or you use Codebase, the LLM never seems to get the full context.
Cursor alters your input for truncation because of token limit regarding the LLM you’re using
One example, with Codebase, is it will look something like this:
Say my input was just “hey” and I used Codebase. It will modify it to:
"
Inputs
hey
If you need to reference any of the code blocks I gave you, only output the start and end line numbers. For example:
Todo.tsx
startLine: 200
endLine: 310
If you are writing code, do not include the “line_number|” before each line of code.
"
Now, I used a random LUA script I found online to try this further. I mentioned the script as @script.lua. You would expect the LLM to get the full file, right? Instead, the LLM just gets a heavily (and I mean heavily) truncated version of the file. This might vary depending on other factors, but the file size was about 1000 lines, and the LLM only got 15 lines, with a rather interesting line saying “…(about 12 lines omitted)…”
The farthest context you could get is by using Codebase, which shows almost the full file with like 100-200 lines truncated (could depend on other factors), but that would make it so the LLM looks at all of your files, which could be impractical if you were just trying to target one specific file.
There were other instances of this, but this is just impractical of Cursor and could possibly be happening with other LLMs, not just Claude. I’ve heard some stuff about DeepSeek-R1 too.