Feedback on chat-window code generation

Hi, I like that Cursor imported my settings from VS Code and how easy it is to generate with my entire (small) codebase as context.

Here is some feedback / my ideal workflow.

  • I want my IDE to skip polite speech and avoid first-person pronouns.
  • I feel like I’m doing a lot of looking around and clicking (see my recent bug report where the origin file is not recognized), which feels slower than VS Code. Granted I’m still learning.
  • My ideal: I press a shortcut, enter a codebase-wide prompt, then watch all in-place code generation across all files, then press 1 key to accept all changes.
  • Fever dream version: my tests then get run, Cursor acts on the test feedback automatically, and once my tests pass a commit is generated and I’m prompted to accept or edit it (rough sketch below).
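To make the fever-dream loop concrete, here is a rough Python sketch of the edit → test → commit cycle I have in mind. The `ask_agent` hook is purely hypothetical (Cursor exposes no such API that I know of), and pytest/git are just stand-ins for whatever the project actually uses:

```python
import subprocess

def run_tests() -> subprocess.CompletedProcess:
    # Run the project's test suite; pytest is only an assumption here.
    return subprocess.run(["pytest", "-q"], capture_output=True, text=True)

def ask_agent(prompt: str) -> None:
    # Hypothetical hook: send a codebase-wide prompt to an agent that edits files in place.
    raise NotImplementedError("replace with a real agent/CLI integration")

def edit_test_commit(task: str, max_rounds: int = 3) -> bool:
    ask_agent(task)  # the initial codebase-wide prompt
    for _ in range(max_rounds):
        result = run_tests()
        if result.returncode == 0:
            # Tests pass: stage the changes and draft a commit the user can accept or edit.
            subprocess.run(["git", "add", "-A"], check=True)
            subprocess.run(["git", "commit", "-e", "-m", f"agent: {task}"], check=True)
            return True
        # Feed the failing test output back to the agent and try again.
        ask_agent(f"The tests failed; fix the code.\n{result.stdout}\n{result.stderr}")
    return False
```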

I found this message after searching for “personal pronouns”. I have added several rules, at various levels, and I have yet to find an AI tool that is able to respect the rule. The rule? Never use personal pronouns, ever.

Simple rule, yeah? Not according to AI. Each chat, each agentic completion, each response from Cursor continually violates the rules I’ve applied. I’ve applied user rules and project rules. I’ve attempted being excessively verbose, and I’ve tried terse, succinct rules. Nothing works. Cursor continues to include personal pronouns in completions.

Until AI is able to respect this very simple directive, I will continue to be bearish on AI as a feasible tool for production.

If I cannot rely on AI to respect a simple rule that I apply based on the instructions provided by AI, then the entire premise is a fallacy.

Can you provide a screenshot of it ignoring the rule? Also, what happens when you ask the model about the rule and why it did not follow it? Can it suggest a rule that it would be more likely to follow in the future? Have you confirmed that it is getting the rules, i.e. that it’s able to read them? Have you tried other rules, like “start every response with ‘test 123’” or something? Also, which model, or is it all of them?

Yeah, I can provide a screenshot, but you can experience the same behavior by adding a rule that says: “Never use personal pronouns”. When I ask about the rule and why it wasn’t followed, it uses more personal pronouns. Yes, it can see the rule. Yes, I’ve confirmed it’s “aware” of the rule. All models.

I appreciate your questions, but my messages here reflect observations of erroneous behavior by all models over the course of several years.

Until any model is able to adhere to the simplest of rules, they’re all sus.

Prompt: count to 100

Models: Auto, gpt-5, Sonnet 4.5 thinking

I eventually got it to follow the rule:
CRITICAL: Never use first-person pronouns or human-like language. Responses should be objective and direct. Violations of this rule are unacceptable. Avoid human-like language patterns. Instead of 'I'll help you' say 'Here's the solution:' or 'The answer is:'. Instead of 'I think' say 'The approach is:' or 'Consider this:' Responses should be objective and direct. Violations of this rule are unacceptable.

Before: [screenshot]

After: [screenshot]
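
If it helps, the rule that finally stuck can live in a project rule file (e.g. .cursor/rules/no-pronouns.mdc) so it rides along with every chat. A rough sketch below — this assumes the current .cursor/rules MDC format and the alwaysApply frontmatter flag, so double-check the file name and keys against the docs for your Cursor version:

```
---
description: Ban first-person pronouns and human-like filler in responses
alwaysApply: true
---
CRITICAL: Never use first-person pronouns or human-like language.
Responses should be objective and direct. Violations of this rule are unacceptable.
Instead of "I'll help you", say "Here's the solution:".
Instead of "I think", say "The approach is:" or "Consider this:".
```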

After trying a simple rule (“never use personal pronouns”), which it failed, I asked it to verify the rules it knew of, then asked for suggestions on how to make it better follow the ‘pronouns’ rule in the next chat. I kept updating the rule and re-testing with those suggestions added; most did not work, but eventually a combination of them did. Not sure exactly what did it, but it is possible.

Even if this were not possible, I don’t think it would prove that the models don’t follow rules, because they usually do. And even if this test had failed, it certainly would not prove ‘they’re all sus’, because they successfully do much more complex things and follow elaborate instructions.

Thanks for the effort! I appreciate your earnestness.

In less than 10 minutes, and maybe a handful of prompts, Cursor violated the highly refined rule you provided.

I also stopped my current development flow to engage with the AI about how to refine the rule. However, surely you can understand my dubiousness about its ability to adhere to the rule in the future.

Also, I find curious the supposition that because a thing can handle more “elaborate instructions”, it should be permissible to turn a blind eye to its inability to follow a simpler one.

If the purpose of these tools is to carry out “elaborate instructions” and thereby produce greater quantities of output, where is the quality, the element of trust, or the justification for believing and/or accepting what they produce as factual or accurate? Trust is a foundational component of reality; A is A. A is not… sometimes A, sometimes B, sometimes a cat gif, sometimes a mission-critical api_key.

So now, instead of actually developing, my productivity has screeched to a halt. I remain bearish.

I think you may have found a serious weakness in these LLMs. They are trained so well to “talk” to the user that they must really struggle to put things into words without doing that. You are not wrong. In regard to programming tasks, has it also ignored rules there? I am not sure whether this is an edge case or more indicative of a pattern of not following rules.