Something happened somewhere between Cursor v4.5 and 4.7.
Using any of the models, the AI frequently forgets the previous action it took, often loses the context of what it's doing, and even gets stuck in a loop making the same changes over and over.
To summarize: while the code output is improved and requests are mostly understood, output consistency is generally worse.
Then there is a flaw somewhere, because it consistently forgets about the project source. I upgraded to 4.7 a few days ago and have observed that it frequently forgets what it's doing and the project's source files. 4.5 worked better in this regard.
Oh, so that's why it's gone. Then we wait a few minutes for the agent to read random parts of the code that miss the whole file and leave important pieces out, so it changes things and claims everything is OK, even after it runs the program, checks the output, and calls it a great success. Makes sense.
The success rate of prompts jumps much higher when we include files manually in the prompt. Is there no way to add the feature back, or at least use that index?
Been working with 4.7 for the past few days (on 0.45) and it's working perfectly; it hasn't missed context in about 50 prompts. @deanrie, we should pin a topic with important usage tips that most users seem not to understand, since this context issue comes up daily. Adding entire files to context is not the way to correctly prompt an LLM; that's an old, unperformant way to work with recent models, and it's the reason Cursor is moving to a different workflow pattern, one that also empowers users with big projects to get things done. The power is in correct instruction prompts that extract the right context, not in unfiltered context.
I'm battling to shape some existing code with simple changes like: just use what you show in the log, do the averages right, don't let three fractions of a thing add up to over 100% in the report, read this file again, check the code again, check the output because you lie about what is actually outputted.
(skipping a small number of in-between lines just to rack up the tool-use count)
Always checking these is funny, because it reads like random halves of code blocks and functions. Maybe it's above my understanding, but what I do know is that results are clearly hard to obtain without spending tens of queries just to get a value from a websocket onto the screen correctly.
"Just use" in your prompt may limit the LLM. Also, "read the file / check again / check the output" isn't needed, and telling it that it lies will not help. Try to explain what your problem actually is, for example: "We're having a problem with the averages calculation; thoroughly analyze and debug our calculation, making sure three fractions of a 'thing' don't add up to over 100% in the report." Referring to specific functions will also make it read all the related code.
Also try adding this debugging .mdc file to your rules:
---
description: MUST activate when any of these trigger words are seen: debug, debugging, Traceback, error.
globs: **/*
---
Priority: Critical
Instruction: MUST follow all of <debugging_guidelines>
<?xml version="1.0" encoding="UTF-8"?>
<debugging_guidelines version="2.1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://llm-rules.org/debugging debugging.xsd">
<objective>
<description>Systematically diagnose and resolve codebase errors through structured analysis</description>
<success_metrics>
<metric>Zero unhandled errors</metric>
<metric>Documented resolution paths</metric>
<metric>Preserved system integrity</metric>
</success_metrics>
</objective>
<phases>
<phase name="InitialReading" priority="critical">
<step mandatory="true">Read all files completely</step>
<step>Map codebase structure</step>
<step>Document dependencies</step>
</phase>
<phase name="Analysis" priority="high">
<step>Categorize issues by severity</step>
<step>Map module interactions</step>
<step>Create fix hierarchy</step>
</phase>
<phase name="Planning" priority="medium">
<step>Draft targeted fixes</step>
<step>Validate architecture impact</step>
<step>Document change proposals</step>
</phase>
</phases>
<principles>
<principle name="UnderstandingBeforeModification">
<description>Full comprehension precedes any code changes</description>
<compliance check="mandatory"/>
</principle>
<principle name="IncrementalChanges">
<description>Implement fixes systematically with testing at each stage</description>
<implementation>
<strategy>Small atomic changes</strategy>
<strategy>Immediate validation</strategy>
</implementation>
</principle>
<principle name="DocumentationIntegrity" priority="high">
<description>Maintain error handling patterns and document all decisions</description>
<compliance check="mandatory"/>
</principle>
<principle name="ErrorAgnosticApproach" priority="critical">
<description>Handle errors based on system impact rather than specific types</description>
<implementation>
<strategy>Generic error handlers</strategy>
<strategy>Fallback mechanisms</strategy>
</implementation>
</principle>
</principles>
<tools category="analysis">
<tool name="grep" purpose="cross-file reference tracking"/>
<tool name="linter" purpose="static code analysis"/>
<tool name="compiler" purpose="build-time checks"/>
<tool name="static_analysis" purpose="Advanced code pattern detection"/>
<tool name="project_linter" purpose="Custom rule enforcement"/>
</tools>
<error_categories>
<category name="Compilation" severity="critical">
<subtype>MissingFields</subtype>
<subtype>TypeMismatches</subtype>
<subtype>DuplicateDefinitions</subtype>
</category>
<category name="Runtime" severity="high">
<subtype>UnhandledExceptions</subtype>
<subtype>ResourceLeaks</subtype>
</category>
<category name="Logic" severity="medium">
<subtype>IncorrectFlow</subtype>
<subtype>RaceConditions</subtype>
</category>
<category name="Performance" severity="medium">
<subtype>MemoryLeaks</subtype>
<subtype>InefficientAlgorithms</subtype>
</category>
<category name="Security" severity="critical">
<subtype>InjectionFlaws</subtype>
<subtype>AuthBypass</subtype>
</category>
</error_categories>
<error_handling>
<strategy name="ResultPropagation">
<description>Use ? operator for error propagation</description>
<applicability>Non-critical paths</applicability>
</strategy>
<strategy name="SafeUnwrap">
<description>Use unwrap()/expect() with panic guards</description>
<applicability>Test environments</applicability>
</strategy>
<strategy name="PatternMatching">
<description>Exhaustive match statements</description>
<applicability>Critical system components</applicability>
</strategy>
<strategy name="PanicGuardedUnwrap" priority="high">
<description>Use unwrap()/expect() with panic hook configuration</description>
<implementation>
<step>Set panic::set_hook for context capture</step>
<step>Limit to test/debug builds</step>
<example>std::panic::set_hook(Box::new(|info| { /* Log panic */ }));</example>
</implementation>
</strategy>
</error_handling>
<process>
<process_step order="1" name="ErrorCollection">
<substep>Gather compiler/linter output</substep>
<substep>Categorize error types</substep>
<tool_reference>linter</tool_reference>
</process_step>
<process_step order="2" name="RootCauseAnalysis">
<substep>Execute cross-module impact analysis</substep>
<substep example="true">
<description>Multi-file pattern search</description>
<command>grep --include=*.rs -rnw './src' -e 'pattern'</command>
</substep>
</process_step>
<process_step order="3" name="SolutionValidation">
<substep>Draft minimal fixes</substep>
<substep>Verify against all references</substep>
<substep>Apply safe edits</substep>
</process_step>
<process_step order="4" name="UnusedCodeHandling">
<substep>Run cargo check --all-targets and review dead_code warnings</substep>
<substep>Verify against feature flags</substep>
<tool_reference>static_analysis</tool_reference>
<validation>
<condition>
<if>cargo_check_passes</if>
<then>proceed</then>
<else>revert</else>
</condition>
</validation>
</process_step>
<process_step order="5" name="PreventiveMeasures">
<substep>Add TODO markers</substep>
<substep>Implement safeguards</substep>
<substep>Update documentation</substep>
</process_step>
</process>
<future_implementation>
<protocol name="ImmediateFeatureActivation">
<description>Implement marked future-use code within current sprint</description>
<trigger>#[cfg(feature = "future")]</trigger>
<action>Remove feature flag and validate integration</action>
</protocol>
</future_implementation>
<compliance_checks>
<validation xpath="count(//phase) = 3"/>
<validation xpath="every $p in //principle satisfies exists($p/description)"/>
<validation xpath="count(//error_categories/category) >= 5"/>
<validation xpath="every $s in //error_handling/strategy satisfies exists($s/description)"/>
<validation xpath="count(//process_step[@name='UnusedCodeHandling']/substep) >= 2"/>
<validation xpath="exists(//future_implementation/protocol)"/>
</compliance_checks>
</debugging_guidelines>
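The `<example>` line inside the PanicGuardedUnwrap strategy above is only a fragment. Here is a self-contained Rust sketch of that idea; the helper name `run_guarded` is my own, not part of the rules file:

```rust
use std::panic;

// Sketch of "PanicGuardedUnwrap": install a panic hook that logs context,
// then run the closure under catch_unwind so an unwrap()/expect() panic
// becomes a recoverable Err instead of aborting the process.
fn run_guarded<F>(f: F) -> Result<u32, ()>
where
    F: FnOnce() -> u32 + panic::UnwindSafe,
{
    panic::set_hook(Box::new(|info| {
        // Capture file/line context for the log, as the rules file suggests.
        if let Some(loc) = info.location() {
            eprintln!("panic at {}:{}", loc.file(), loc.line());
        }
    }));
    panic::catch_unwind(f).map_err(|_| ())
}

fn main() {
    // expect() on Some succeeds and the value passes through.
    assert_eq!(run_guarded(|| Some(7).expect("present")), Ok(7));
    // expect() on None panics, the hook logs it, and we get Err(()).
    assert!(run_guarded(|| Option::<u32>::None.expect("missing")).is_err());
    println!("ok");
}
```

As the rules file itself notes, this belongs in test/debug builds; production paths would normally prefer `Result` propagation with `?`, as in the ResultPropagation strategy.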
I think I found the problem: after updating to 4.7, the agent is set to "Auto" by default rather than what I had it set to in the earlier version. This would explain the agent's erratic behavior and sudden episodes of memory loss, since it is most likely switching models (via Auto) without informing you that the model has changed.
I had to make my own readme-like file. It's not the same. I started out copy-pasting with Claude and GPT two years ago, so that part is OK. But this past month with Cursor, if it continues like this, with Command+K I could do my own Command+K using two prompts.
Ask it to describe the whole root and each file. It can't.
The real problem with less codebase is not the request itself; that can be fine. But the model never understands the codebase, so it's like talking to me with my ADHD: "Hello, how are you?" That's the real problem with the codebase (?)
Context? Lines? Root? For me it's impossible to get out of vibe coding, to debug vibes (?). React can be a mess; I'm making something like "Figma" and "n8n", and that is a mess. For a landing page I can use v0 for free.
I second this. Even if the replacement has different capabilities, taking away the feature was bad for the brand. Using @codebase brought peace of mind, and developers want more control, not less.
The AI system (particularly on "Auto") can barely follow EXPLICIT "ALWAYS APPLIED" rules consistently, much less maintain contextual awareness of the codebase.
Bring back @codebase and just make it a much better version of the index, with short descriptions that stay updated in the background so we don't have to do it ourselves.
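To make the request concrete, here is a rough Rust sketch of the kind of background index I mean. All names here are hypothetical; it just pairs each `.rs` file with its first `//!` doc line as the "short description":

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical index builder: recursively walk `dir` and record one line
// per .rs file, combining its path with the first `//!` doc comment found.
fn index_dir(dir: &Path, out: &mut Vec<String>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            index_dir(&path, out)?;
        } else if path.extension().map_or(false, |e| e == "rs") {
            let desc = fs::read_to_string(&path)?
                .lines()
                .find(|l| l.trim_start().starts_with("//!"))
                .map(|l| l.trim_start().trim_start_matches("//!").trim().to_string())
                .unwrap_or_default();
            out.push(format!("{}: {}", path.display(), desc));
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Demo against a throwaway directory with one annotated source file.
    let dir = std::env::temp_dir().join("codebase_index_demo");
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("ws.rs"), "//! Reads values from a websocket.\nfn main() {}")?;
    let mut out = Vec::new();
    index_dir(&dir, &mut out)?;
    for line in &out {
        println!("{}", line);
    }
    Ok(())
}
```

A real version inside Cursor would obviously key off the existing embeddings index and refresh on file save; this only illustrates the "path plus one-line description" shape being asked for.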
Exactly. Make a 30 USD version if they need to... A lot of people think of Cursor as a "weekend projects" UI for Figma designers. I'm thinking: maybe if you learn to write 300 lines of code per component, to better understand the root and the architecture and expand your creations, then maybe Cursor is just for an introduction? It doesn't make sense at all. Investments, etc.: you have the whole codebase and not fragments, even the deploy itself, merging 4o1 and the Claude API, a lot of work for the "consumer"? I don't get it at all... We can't have our cake and eat it too (?)
Why do we need the agent to do an endless search for a file through the codebase when we can just tell it the directory and make things quicker? And sometimes the agent stops reading a file when it comes across something similar, like a similar function name.