Anyone else notice that the composer seems to have gotten a lot less intelligent (using claude-20241022) in the past day or 2? It was reaching near-insane levels of good, and for the last day or so it just goes off the deep end into randomland all the time. Wondering if the context window was shortened or something?
Hey, does this also happen in new Composer sessions? If your session lasts for a long time, the context window might overflow, and the model may start hallucinating. You should start a new session occasionally to keep the work fresh, and you’ll also need to add the necessary context in the new session.
Yes, most recently (today) it’s been off the charts making things up even in short sessions, whereas before even very long sessions used to remain fairly coherent.
Yes… for about 2 weeks (after the December release) it was insanely good and could solve most complex tasks end-to-end with little supervision. Now it loses track and struggles to do even the most basic things. It’s super frustrating: since it’s not possible to know what is being sent to the LLM, it’s impossible to know when the agent will lose track of what it was doing and start going on a rampage/become completely useless. I imagine the devs realized that the agent mode was causing extremely high costs (which I can understand) and just decided to drastically reduce context size as a stopgap… please don’t do that and just be transparent on pricing. I’d gladly pay 2x (or 4x really) more for Cursor, but the fact that reliability and performance are so hit-and-miss is making me seriously reconsider whether it’s a good tool for me and my team.
Also… you guys should monitor curse words in user prompts and plot them on a graph. You might spot regressions quickly that way.
I suspect it wasn’t cost. It was probably the bug they’re trying to solve, where some chats would get long and just hang forever.
This is a huge problem. It’s actually been doing the dumbest stuff I’ve ever seen in the last week. It must be falling back to a super low performance model, because it completely broke my website. 70% of the work gone. Have to start over. I’ve tried everything: context, work docs, pre-boilerplate instructions, and it simply ignores EVERYTHING. It doesn’t even reflect on the past 30 minutes of work. Whatever you guys did on the backend, please restore it to when it was actually good!!!
I’m at a point where I’m about to cancel the sub and wait for a better agentic tool to surface. Literally poured 100+ hours into trying to fix these issues.
hahaha it pretty much admitted it can’t follow the instructions/context etc.
Hey, the best recommendation right now is to start a new Composer session every so often, especially when jumping to a new area of your codebase or starting a new “task”.
We’ve found that the performance of the LLM falls off sharply once the chat history and/or the context provided to the AI gets too long: the outputs start to become generic or don’t actually fix the issue you are trying to solve!
It was handling this better before… 200k windows? Perhaps the composer window itself could suggest when the history has grown too large,
e.g.:
(!) The composer history is now past the point where the LLM can reason effectively. Please consider starting a new chat session.
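To make the idea concrete, here is a rough sketch of how such a warning could be triggered. It is not how Cursor works internally: the function names, the 4-characters-per-token estimate, and the 50% threshold are all assumptions made up for illustration. Only the 200k window figure comes from the post above.

```python
from typing import List, Optional

# Assumption: roughly 4 characters per token. Real tokenizers differ,
# so this is only a coarse estimate.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def context_warning(
    messages: List[str],
    window_tokens: int = 200_000,  # window size mentioned in the thread
    warn_fraction: float = 0.5,    # hypothetical threshold, tune to taste
) -> Optional[str]:
    """Return a warning string once the history passes the threshold, else None."""
    used = sum(estimate_tokens(m) for m in messages)
    if used >= window_tokens * warn_fraction:
        return (
            "(!) The composer history is roughly "
            f"{used} tokens long. Consider starting a new chat session."
        )
    return None
```

A client could call `context_warning(history)` after every turn and show the message in the composer window whenever the result is not `None`.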
There could definitely be more reporting here to make it clear when things may be degrading!