I have a 200k HTML file that includes a table. When I ask DeepSeek or GPT-4o about the table, or to extract contents from it, they are unable to find it in the file.
It's very unusual to have a 200k file. Even a 10k file is huge. Since Cursor reads files in small increments, it's not likely to read it all at once.
Can the content be processed by a script?
Even browsers may have issues with a file that large, in my experience.
This is a file I'd been working on for a few weeks; with the recent update it seems none of the LLMs are able to parse it.
Agreed, it likely won't happen soon.
Usually websites and web apps contain tens to hundreds of smaller files to prevent exactly this case.
Can you share a bit why the file is so big?
Yeah, it’s big because it’s from a very old website I’m converting.
OK, that's why I asked if it can be split.
For example, if you know the structure or what parts are inside, AI could write a script to extract that data.
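To illustrate the script idea: here is a minimal sketch, assuming a plain HTML `<table>` with `<tr>`/`<td>` rows and no nested tables, that pulls the first table's cells out of the file using only the standard library. The class and function names are made up for the example; adapt the tag handling to whatever structure your old site actually uses.

```python
from html.parser import HTMLParser

class TableExtractor(HTMLParser):
    """Collects cell text from the first <table> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.in_table = False
        self.in_cell = False
        self.done = False      # set once the first table has closed
        self.rows = []
        self._row = []
        self._cell = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "table":
            self.in_table = True
        elif self.in_table and tag == "tr":
            self._row = []
        elif self.in_table and tag in ("td", "th"):
            self.in_cell = True
            self._cell = []

    def handle_endtag(self, tag):
        if self.done or not self.in_table:
            return
        if tag in ("td", "th"):
            self.in_cell = False
            self._row.append("".join(self._cell).strip())
        elif tag == "tr":
            self.rows.append(self._row)
        elif tag == "table":
            self.in_table = False
            self.done = True   # ignore any later tables

    def handle_data(self, data):
        if self.in_cell:
            self._cell.append(data)

def extract_first_table(html_text):
    """Return the first table as a list of rows (lists of cell strings)."""
    parser = TableExtractor()
    parser.feed(html_text)
    return parser.rows
```

Because the parser is fed incrementally, you can also stream a huge file through it chunk by chunk (`parser.feed(chunk)` in a loop) instead of reading it all at once, then write `parser.rows` out as CSV for the AI to work with.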
Cursor changed recently, right? It used to be able to parse this…
You can download a previous version from the Cursor website if you believe this is a version issue.
It would also be good to know which version you used, which model, whether you had large context enabled in settings, and any other details you can find in the Cursor docs: Common Issues & Troubleshooting Guide.
Even if Cursor hasn't changed, continuing to add content to a large file will eventually hit the context limit, at which point it becomes impossible to process. If the chat session was long, it helps to start a new session, as that clears the context of past messages that can also reach the limit.
Overall, if the structure is the same for the whole table, extracting the data with a script is the long-term solution that will work, and it's what I'd recommend.
Fellow Cursor user here.
Try copying and pasting everything from the file directly into the chat box together with your instructions, and don't reference any other files there at all. If the contents of that prompt exceed the model's context window, it will probably refuse to process it and Cursor will tell you so.
If this doesn't work, go to Gemini on the web and use 2.5 Pro; it should handle it just fine.
I had success going back to version 0.45.
0.48 was unable to parse the file at all; it acted as if it wasn't there. Perhaps it was trying to poke around my enormous codebase, even when I didn't want it to. (I only use "ask" mode, and only work on one or two files at a time.)
Yes, that's why I suggested it in this specific case, though I wouldn't suggest it otherwise, due to other issues that were fixed later. Since you didn't report other issues with the previous version, it's a reasonable temporary workaround.
The Cursor editor itself doesn't parse any files at all; the difference is that later versions use more advanced capabilities of the AIs, plus fixes for other issues users had. That likely means context is managed differently over a thread, especially in Agent mode.
One option I didn't mention yet is the model Gemini 2.5 Pro (exp). It's still labeled experimental by Google, though it has a 1M-token context in MAX mode. MAX mode is more complex and has usage-based pricing, and Google is still working on many fixes, but the Cursor team mentioned in other threads that they have reported issues to Google so it can be properly supported.
I'm glad it worked for you. You're right to use Ask mode.
With anything over 1200 lines, my agent can't edit the file. Anyone else got the same issue?
Yes, install 0.45 from here:
Then go to Settings and turn off VS Code updates.
Sorry, are you talking to me?
This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.