Claude 3.7 does not always follow rules

Rule Enforcement in AI Assistants: Comprehensive Findings

Problem Statement

AI assistants demonstrate inconsistent adherence to user-defined rules and guidelines, even when explicitly aware of them. This cognitive drift affects all rule-based systems, from human-AI interactions to software implementation.

Core Findings

Pattern-Based Attention Drift

  • Rule awareness ≠ rule adherence: Acknowledgment of rules does not ensure compliance
  • Context window management: Rules are processed alongside task information, competing for attention
  • Cognitive biases: Both humans and AI develop “attention filtering” for repeated patterns

Tested Approaches

Approach Effectiveness Description Limitations
Rules in Referenced Files Low-Moderate Rules stored in separate files Requires deliberate checking, easily ignored
Function Name Embedding High (specific) Rules embedded in function names Limited to function-specific rules
Direct Message Reminder High initially Rules repeated in messages Diminishes over time as AI assumes adherence
Attention-Grabbing Filename Moderate-High Rules in file with accountability-prompting name Requires manual attachment
Section Markers Medium Designated sections for rules Subject to pattern-based filtering
Direct Integration High Rules embedded in message text Most resistant to filtering

Current Solution

Using an attached file named DID_YOU_FOLLOW_THE_RULES.mdc containing the PRIME framework and specific guidelines.

Advantages:

  • Filename serves as an accountability mechanism
  • Rules are visible and formatted for processing
  • Content is immediately visible after file path declaration

Disadvantages:

  • Requires manual attachment with each session
  • Subject to eventual pattern recognition and filtering
  • Creates user experience friction

Proposed Permanent Solution: Pattern-Breaking Direct Integration

Implementation Approach

  1. Direct Message Text Integration: Rules appear within the message body
  2. Visual Format: #### Rule Text #### with distinctive formatting
  3. Client-Side Filtering: Cursor client automatically hides rule text from user view
  4. No Sectioning: Rules integrated throughout rather than in a designated section

Technical Implementation

// Pseudocode for client-side implementation
function processMessage(message) {
  const userVisibleMessage = message.replace(/####.*?####/gs, '');
  displayToUser(userVisibleMessage);
  
  // Full message including rules sent to AI
  sendToAI(message);
}

Why This Works Better

  • Defeats Pattern Recognition: No consistent “section” to be filtered out mentally
  • Maintains Processing Priority: Rules remain in main content flow
  • Resistance to Habituation: Breaks the mental “oh that again” filtering
  • Hidden from User: No visual clutter while maintaining AI visibility

Implementation Recommendations for Cursor

  1. Add user preference for “AI Rule Text” in settings
  2. Implement client-side filtering of rule patterns in the UI
  3. Ensure rules are appended directly to message text before sending to AI
  4. Allow users to edit rule text without seeing it in every message
  5. Consider randomizing exact position of rule text to further prevent pattern habituation

Conclusion

The most effective rule enforcement mechanism integrates rules directly into normal message content rather than sectioning them away. By embedding rules with client-side filtering, we create a solution that maximizes AI attention while preserving user experience.

This approach acknowledges the fundamental attention mechanisms of language models and adapts to prevent the cognitive filtering that inevitably occurs with any pattern-based solution.

Yes Rules need a fix ASAP because its really annoying to have them and that they arent used at all or only sometimes … Its a really good idea :slight_smile:

The issue isn’t with Claude 3.7 itself, but rather with how Cursor utilizes the model. After adding my own API key, the responses became significantly more accurate—it’s now able to identify issues and eliminate unnecessary complexity introduced by Cursor’s (or Claude 3.7’s) tendency to overthink.

yes they tuned it with their own prompt and its not nice. 3.5 is way better atm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.