I feel that once the LLM starts to answer a question, the answer should run to completion. The chatbot should never freeze (at worst, a message should be printed to the user stating what's going on).
Steps to Reproduce
I doubt this can be reliably reproduced, since it seems strongly dependent on token usage, context, OS, etc.
Hi, I am using "auto". It is possible that this happens once I have exceeded my allocation (which I am not warned about). If I quit Cursor and get back in, I can get another 1-2 messages, and then it happens again. I then changed to Sonnet 4, and the problem keeps occurring, so I suspect it would happen with most models. It has never happened in thinking or Max mode.