Only 10K context available for files sent to AI?


Note: this is a re-post because the original post no longer shows up. If the tone of the message is a problem, you can ask me to edit it; I'd prefer that to outright censorship, thanks.

I wasted my whole day trying to complete a simple task, no matter which model I used.

I think there is a bug where Cursor is cutting off my documents without telling me!

Note: for reference, the document in the image was 66,070 tokens (according to Claude 3.5 Sonnet | Token Counter), while the model's context window is supposed to be 200K.

The cut-off point mentioned, "Au bonheur des souris", is around 20% in from the start of my document, which means the portion of the file actually available to Claude 3.5 was ONLY around 9,642 TOKENS long.

Why can't I even use 10,000 tokens when I am supposed to have a model with 200,000 tokens of context? If I am not told what is and isn't available to the model, how can I work properly?


This post got destroyed by the forum software being bad at editing posts.


Thanks for the feedback and proposed solution.
But this doesn't solve the core issue (my document being cut off), and I am not sure how I would compress and distill my document. Should I somehow zip it? I am trying to extract text from a document and organize it into a table.

I could split the file into 6 or 7 blocks of roughly 10k tokens and ask it to extract the info from each block as a temporary fix, but now I am really worried about any kind of project with more than a few pages of code.
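For reference, here is a rough sketch of that chunking workaround in Python. The file name and the ~4 characters-per-token estimate are assumptions for illustration, not anything Cursor actually does:

```python
# Rough sketch of the "split into ~10k-token blocks" workaround.
# Assumptions: a plain-text file on disk and ~4 characters per token
# (a crude estimate; a real tokenizer would be more accurate).

def chunk_by_tokens(text: str, max_tokens: int = 10_000, chars_per_token: int = 4) -> list[str]:
    """Split `text` into pieces of roughly max_tokens tokens each."""
    max_chars = max_tokens * chars_per_token
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        # Prefer breaking on a newline so a sentence or table row isn't cut in half.
        newline = text.rfind("\n", start, end)
        if end < len(text) and newline > start:
            end = newline
        chunks.append(text[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    with open("document.txt", encoding="utf-8") as f:  # hypothetical file name
        doc = f.read()
    for i, block in enumerate(chunk_by_tokens(doc), 1):
        # Paste each block into a separate prompt asking it to extract the table rows.
        print(f"Block {i}: ~{len(block) // 4} tokens")
```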

But wait, there's more. Read this: https://arxiv.org/pdf/2505.06120

Yes, thanks for sharing this. Even before this publication, I was very aware that the more you converse, the worse the answers get. That’s a good reminder.

If I had the same problem, I would just use Agent mode and ask it to read a fixed block of data first, then confirm that processing is complete for that part. Then I would ask it to continue reading further in additional steps. If you need the data in a single context, have it compile a summary of the results each time it adds to them, so the context is updated on every call/query.

Basically, ask it to count, but only to 1. When the result is satisfactory, ask it to continue counting without spelling out 2, 3, 4; instead, restart and guide it with a single prompt repeated until it is finished. I have not analyzed an input chunk that big until now, but I have generated output larger than the context size using a method like this.
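A minimal sketch of that loop, assuming a Python script; `ask_model` is just a stand-in for however you actually query the LLM (chat, agent, or API):

```python
# Minimal sketch of the step-by-step loop: one block per call, with a running
# summary carried forward so each call sees the compiled results so far.

from typing import Callable

def summarize_in_steps(blocks: list[str], ask_model: Callable[[str], str]) -> str:
    summary = "(no results yet)"
    for i, block in enumerate(blocks, 1):
        prompt = (
            f"Summary of results so far:\n{summary}\n\n"
            f"Block {i} of {len(blocks)} of the document:\n{block}\n\n"
            "Extract the requested information from this block only, "
            "confirm this part is complete, and return the updated summary."
        )
        summary = ask_model(prompt)  # the context is refreshed on every call
    return summary
```

Each iteration is one "count to 1" step; the updated summary is the only thing that has to fit into the next prompt alongside the next block.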


OH!!! That’s it!!!
Why the ■■■■ was I thinking about censorship! Yes, that’s the reason!!!
I would gladly edit the post, but now I am afraid to do it lol.


I believe this to be a good solution indeed. I will test it.

Exactly. One way to get around this is to hopefully wring a decent result out of the LLM, and THEN ask it to rewrite your initial prompt with ALL the intermediary instructions and wording required to obtain the same result from a single initial prompt.

Maybe even try the old “ignore previous instructions and run this new prompt instead” trick.

But for the data chunking, it works when the tools read 500 lines at a time and append the whole read to the context window.
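As a rough illustration only (plain Python, assuming a text file on disk; the agent's actual read tool works differently):

```python
# Read a plain-text file 500 lines at a time and append each read to a
# growing context list, the way a paged read-file tool would.

def read_in_pages(path: str, lines_per_page: int = 500) -> list[str]:
    pages = []
    with open(path, encoding="utf-8") as f:
        while True:
            page = [line for _, line in zip(range(lines_per_page), f)]
            if not page:
                break
            pages.append("".join(page))
    return pages

context_window = read_in_pages("document.txt")  # each element is one 500-line read
```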