(Disclaimer: I don’t work for Cursor)
Are you using a large prompt? I was seeing similar symptoms, with New AI Project stalling or giving weird results, which I talk about here: AI Project "Cognitive Computing" bork
Maybe try dropping your project prompt into the OpenAI Tokenizer to measure its size. I found that things got a lot more reliable when I kept it under about 1.5k tokens. The Cursor devs are aware, so no doubt they’re looking at improving the handling of larger prompts.
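If you want a quick sanity check before pasting into the tokenizer, here is a rough sketch in Python. It assumes about 4 characters per token, which is only a rule of thumb for English text and not an exact count, and the names `estimate_tokens` and `fits_budget` are made up for illustration. The 1.5k budget is just the number that worked for me, not an official limit.

```python
import math

# Assumption: ~4 characters per token, a rough rule of thumb for English text.
# A real tokenizer will give a different (exact) number.
CHARS_PER_TOKEN = 4

# Rough threshold from my own experience, not an official Cursor limit.
TOKEN_BUDGET = 1500


def estimate_tokens(prompt: str) -> int:
    """Return a rough token estimate based on character count."""
    return math.ceil(len(prompt) / CHARS_PER_TOKEN)


def fits_budget(prompt: str, budget: int = TOKEN_BUDGET) -> bool:
    """Return True if the estimated token count is within the budget."""
    return estimate_tokens(prompt) <= budget


if __name__ == "__main__":
    # Hypothetical project prompt, for illustration only.
    prompt = "Build a small CLI todo app with tests."
    print(estimate_tokens(prompt), fits_budget(prompt))
```

If the estimate comes out anywhere near the budget, it is worth checking the real count in the tokenizer, since the heuristic can be off by a fair margin for code or non-English text.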