I have noticed that whenever I prompt for editing or creating multiple files, each with, say, 1000 or more lines of code, the model does the first few files as expected and then starts trying to condense the code: collapsing multi-line statements into single lines, cutting corners, and dropping functions. This happens even for heavily templated code with clear instructions, where the job is mostly copying a file, changing the file name and formatting, and following the same template.
Most of the time it does the first few files, then stops and tells me to follow the same pattern and do the rest of the files myself. I have to fight it a couple of times to get all the files completed as instructed. It is pretty much telling me it is only here for a few files and basic tasks. I have tried this with basically all of the models and they all do the same thing, so it could be a Cursor thing.
So my question to you, agentic coding and background coding hype machines: how are you using Cursor or these models to perform tasks in the background for you when, even under supervision, these things can barely do things right for a few minutes?
Models tried: the ones from the big 3 (OpenAI, Anthropic and Google)
Programming language: Golang
Prompt used: the same prompt I use for regular tasks, and the same prompt that works fine on the first few files before the model goes rogue on its own. The prompt clearly says to complete all files and follow the same format. It only gets there after some back and forth, plus a couple more prompts to fix the mess it created along the way.
Project complexity: as I mentioned, each file is about 1000 to 2000 lines of code, but mostly templated and in the same format. The direction is pretty clear: mostly copy and paste with a few changes between files. I would call it medium complexity, given the clean codebase and the clear direction of what to do.
From my experience, it struggles a lot more as the number of lines of code gets larger, and it also struggles more when the number of files is large.
So with, say, 5 files of about 500 lines of code each, it is fine. When you increase that to 50 files of 500 lines each, it starts doing what I described: giving up after a few files or cutting corners on the next ones. Sometimes it will even stop and tell you to continue or finish up yourself.
At something like 50 files with 3000 lines of code each, you might as well forget about things being done right.
Try it with 25 new files of 1000 lines of code each. You could just generate some sample code to test this, then report back and let me know whether it completed all 25 files cleanly, without stopping to tell you to finish up and without cutting corners after the first 5 files.
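
For anyone who wants to try this, here is a rough sketch in Go of how the sample code and a basic check could look. It is not what I actually ran; `generateFile`, `countFuncs`, `countLines`, `looksComplete`, the `UserHandler` naming, and the 90% line threshold are all made up for illustration. The idea is to build one templated reference file of about 1000 lines, ask the model to produce the other files from it, and then compare function counts and line counts to catch dropped functions or condensed code.

```go
package main

import (
	"fmt"
	"strings"
)

// generateFile returns the source of one templated Go file containing
// `funcs` near-identical functions (5 lines each, plus a 2-line header).
// The entity/handler naming is hypothetical sample content.
func generateFile(entity string, funcs int) string {
	var b strings.Builder
	b.WriteString("package sample\n\n")
	for i := 0; i < funcs; i++ {
		fmt.Fprintf(&b, "func %sHandler%d(x int) int {\n\tx++\n\treturn x\n}\n\n", entity, i)
	}
	return b.String()
}

// countFuncs counts top-level function declarations by looking for
// lines that start with "func ". Good enough for templated code.
func countFuncs(src string) int {
	n := 0
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(line, "func ") {
			n++
		}
	}
	return n
}

// countLines counts newline-terminated lines in the source.
func countLines(src string) int {
	return strings.Count(src, "\n")
}

// looksComplete reports whether a produced file kept every function and
// at least 90% of the reference line count (an arbitrary threshold that
// flags multi-line code squeezed into single lines).
func looksComplete(reference, produced string) bool {
	sameFuncs := countFuncs(produced) == countFuncs(reference)
	enoughLines := countLines(produced)*10 >= countLines(reference)*9
	return sameFuncs && enoughLines
}

func main() {
	reference := generateFile("User", 200)
	// Stand-in for a file where the model gave up partway through.
	truncated := generateFile("User", 120)

	fmt.Println("reference funcs:", countFuncs(reference), "lines:", countLines(reference))
	fmt.Println("truncated file complete?", looksComplete(reference, truncated))
}
```

In a real run you would read the 25 files the model wrote from disk and pass each one to `looksComplete` against the reference, instead of using the simulated truncated file.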