Agent lost some IQ points last few days?

charles · January 13, 2025, 3:34am

Anyone else notice that the composer seems to have gotten a lot less intelligent (using claude-20241022) in the past day or two? It was reaching a near-insane level of good, and for the last day or so it just goes off the deep end into randomland all the time. Wondering if a context window was shortened or something?

deanrie · January 13, 2025, 5:50pm

Hey, does this also happen in new Composer sessions? If your session lasts for a long time, the context window might overflow, and the model may start hallucinating. You should start a new session occasionally to keep the work fresh, and you’ll also need to add the necessary context in the new session.

charles · January 23, 2025, 6:44pm

Yes, most recently (today) it's been off the charts making things up even in short sessions, whereas before even very long sessions used to remain fairly coherent.

ldorigo1 · January 23, 2025, 7:08pm

Yes… for about 2 weeks (after the December release) it was insanely good and could solve most complex tasks end-to-end with little supervision. Now it loses track and struggles to do even the most basic things. It’s super frustrating because, since it’s not possible to know what is being sent to the LLM, it’s impossible to know when the agent will lose track of what it was doing and start going on a rampage/become completely useless. I imagine the devs realized that the agent mode was causing extremely high costs (which I can understand) and just decided to drastically reduce context size as a stopgap… please don’t do that and just be transparent on pricing. I’d gladly pay 2x (or 4x really) more for Cursor, but the fact that reliability and performance are so hit-and-miss is making me seriously reconsider whether it’s a good tool for me and my team.

charles · January 23, 2025, 7:14pm

Also… you guys should monitor curse words in user prompts and plot them on a graph. You may spot changes quickly that way.

charles · January 23, 2025, 7:15pm

I suspect it wasn't cost. It was the bug they’re trying to solve, where some chats would get long and just hang forever.

well-this-sucks · February 1, 2025, 9:55am

This is a huge problem. It's actually doing the dumbest stuff I've ever seen in the last week. Must be falling back to a super low performance model because it completely broke my website. 70% of the work gone. Have to start over. I've tried everything: context, work docs, pre-boilerplate instructions, and it simply ignores EVERYTHING. It doesn't even reflect on the past 30 minutes of work. Whatever you guys did on the backend, please restore it to when it was actually good!!!

I'm at a point where I'm about to cancel the sub and wait for a better agentic tool to surface. Literally poured 100+ hours into trying to fix these issues.

hahaha it pretty much admitted it can't follow the instructions/context etc.

danperks · February 4, 2025, 1:47am

Hey, best recommendation right now would be to routinely start new Composer sessions every so often, especially when jumping to new areas of your codebase or starting a new “task”.

We’ve found the performance of the LLM to fall off sharply once the chat history and/or the context provided to the AI gets too long, and the outputs start to become generic or not actually fix the issue you are trying to solve!

charles · February 5, 2025, 11:00pm

It was handling this better before… 200k windows? Perhaps the composer window itself could suggest when the window is now too large
e.g.:

(!) The composer history is now exceeding where the LLM can reason effectively, please consider starting a new chat session
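
Roughly what I have in mind, as a minimal sketch and not how Cursor actually works: estimate the size of the chat history and show that message once it passes some fraction of the window. Everything here is an assumption for illustration. The function names are made up, the 4-characters-per-token estimate is a crude heuristic rather than a real tokenizer, and the 50% threshold is a placeholder, not a measured point where quality drops.

```python
from typing import Optional

CONTEXT_WINDOW_TOKENS = 200_000  # the "200k" figure mentioned above
WARN_FRACTION = 0.5              # hypothetical threshold, not a measured value
CHARS_PER_TOKEN = 4              # rough heuristic, not a real tokenizer

WARNING = (
    "(!) The composer history is now exceeding where the LLM can reason "
    "effectively, please consider starting a new chat session"
)


def estimate_tokens(messages: list[str]) -> int:
    """Crudely estimate the token count of a chat history from its length."""
    total_chars = sum(len(message) for message in messages)
    return total_chars // CHARS_PER_TOKEN


def context_warning(messages: list[str]) -> Optional[str]:
    """Return the warning once the estimate passes the threshold, else None."""
    if estimate_tokens(messages) > CONTEXT_WINDOW_TOKENS * WARN_FRACTION:
        return WARNING
    return None
```

A real version would need the model's actual tokenizer and would have to count attached files and other context, not just the chat messages.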

danperks · February 7, 2025, 2:26am

There definitely could be more reporting on this to make it clear when things may be degrading!
