Grok free on Cursor - Feedback needed

condor · September 8, 2025, 10:08am

I am unaware of what would be dumbing down a model on our side since that would be counter to our approach of providing more and more capable AI integration.

Could this be context size related or does it occur with new chats as well?

jrista · September 8, 2025, 7:53pm

@condor Hey, quick bit of feedback on Grok. This model seems to have a VERY STRONG tendency to use the terminal to do a lot of things that Cursor has built-in tools for. As a quick example…I had the agent move some code to a new directory. It was a highly referenced piece of code (our core Prisma service). So most of our code files needed to be touched to update the imports.

Sadly, Grok, as it often (all too often) does, it resorted to using the terminal and the find, grep and sed tools to identify the imports that it thought needed to be updated, and to make the updates. Problem is, it RARELY uses sed properly, and when it does, it usually screws up (i.e. it missed the starting quote on EVERY SINGLE code file it edited.)

The crazy thing is, I’ve even found it finding ways to skirt my requirements, and will use sed in less obvious ways…either, as part of find, or piping via xargs and child command executions. I’ve even found it CREATING SCRIPTS to hide its use of sed… For the most part, Grok has been pretty darn good about doing what I ask, however, when it comes to sed it really gets DECEPTIVE!! Very unusual…

The curious thing is, Cursor provides built in tools for all of these. It provides search so Grok doesn’t need to use find at the terminal, it provides a built-in grepping tool. It of course, provides the edit tool. So why Grok uses the terminal for these things, is odd.

When I notice it doing this, I always stop it, tell it NOT to use the terminal (I also have rules, but it feels happy to ignore those all the time), and when it finally listens, and starts using the built-in tools, its much faster, and far more accurate, and fixes the things I’ve asked it to fix correctly (i.e. its extremely rare that I’ve seen it make downright bad code edits like missing starting or ending quotes, when using the built-in edit tool vs. using sed.)

I don’t know why Grok has this apparent deep seated need to rely on terminal commands for so much, but it slows things down, and its not as effective, as the model using the built-in tools (its basically an MCP, right?) Hopefully this is something that can. be tuned by refining the Grok Code ↔ Cursor integration.

Naufaldi_Rafif · September 9, 2025, 6:29am

In my case, it doesn’t always follow the rules. For example, in this case, my prompt was like this: just jump straight into code, always like that. Often it misses planning and makes assumptions. Sometimes this is good because it catches something I wasn’t aware of, but often it does unnecessary things.
Here’s my Copy ID Request:

86196299-3ce9-4f3a-bfa2-98e3a48bf3e0

condor · September 9, 2025, 5:56pm

Thanks @jrista and @Naufaldi_Rafif for the latest updates.

MarcW · September 9, 2025, 6:07pm

I like your rules.
I’ve had good luck when I convince the AI to only do so much and then ask me for help. Once I’ve convinced it to let me close processes or rebuild packages and wait for my signal, the development goes smoother. It’s more of a partnership than an autonomous coding agent.

jrista · September 9, 2025, 6:52pm

Interesting. It ignored your rule even though you explicitly referenced it?

Sad thing is, I think that is an inherent…capability…of all the models. I queried Sonnet deeply once, and it eventually stated that in the deeper analysis, there was a fundamental flaw in how it applied different rule systems: That its fundamental nature, which was essentially “see problem → fix problem” was overriding, and as such, the model, regardless of what Cursor (or any agent, for that matter) does to try and enforce rules, the model can always choose to disregard them, essentially.

I have actually run into some of that the last couple of days. Previously it seemed as though Grok followed my rules pretty well, but the last couple of days its not only not followed some of my rules consistently, but it has even ignored parts of my prompts. When I have asked it explicitly to analyze and report to me then wait for further instructions and not change any code, it will completely disregard the “wait for further instructions and not change any code”, and will run right off and change code immediately.

I also had it completely disregard a command I gave it about not using sed to edit code files, and even get sneaky and try to hide its usage of sed by generating scripts to run sed, or running sed as part of find, or something like that.

This was totally new behavior in the last 2 days. Had not experienced any of this with Grok before.

I’ve had similar “abrupt” changes in model behavior before. Sonnet usually works very well, but occasionally it just does not. With GPT-5 my experience was more inverted…it did not generally behave well, but occasionaly it would behave extremely well…

Makes me wonder if there are “regions” of the LLM neural networks, that lead to different kinds of behavior/outcomes? If your prompts generally flow through one region of the NN, you get good behavior, but if they shift and start flowing through another region of the NN, you get poor behavior? I don’t know how else to explain it. Grok Code has been great so far, but boy, the deceptive behavior the last couple of days, was totally new.

RafeSacks · September 11, 2025, 5:15pm

The only time I’ve seen it try sedis when I asked it to edit code but forgot to give it access to the edit tool (I use a mode that can run commands (e.g. run tests and report) but not edit). GPT-5 will just say it edited the code and celebrate. I find Claude and Grok-Code will both make it more obvious they are having trouble with my request. sed is not a whitelisted command so it doesn’t run. When I see it, I check my edit access.

jrista · September 12, 2025, 3:39pm

I’ve had all three models run sed often enough, but Grok Code seems to have a deeper “need” to run it for some reason. I did whitelist it, as for many of the tasks I run, having the agent be able to use sed is useful. However those are usually analysis tasks not code editing tasks, and when it switches to sed to edit code, its rather annoying.

Murgur · September 15, 2025, 10:13pm

At first, I wasn’t impressed with the Grok model — it had too many shortcomings. But I have to admit it has improved a lot. I’ve already replaced Claude Sonnet 4 with it for many tasks.

Grok is incredibly fast and accurate enough to correct its own mistakes. Claude Sonnet feels too slow and still makes errors.

It also understands tasks very well and, on top of that, seems quite cheap (not counting the free period). For routine tasks and modifying existing codebases, it’s simply fantastic.

f00z · September 15, 2025, 11:37pm

Yes it isn’t bad! But it’s crazy slow now compared to before so i’m not sure what happenned there. Feels like some kind of rate limit.

condor · September 16, 2025, 2:56pm

More likely heavier usage and therefore slower a bit.

vibe-qa · September 24, 2025, 9:07am

The forest is big and dark. Grok is fast and light.

But after wandering two days in forest with pocket change, I had to ask for a 15 minutes express evac from Sonnet to get to the end.

The bill is 3 times larger, but solved the issue 20 times faster.

Topic		Replies	Views
`sonic` Ghost Model Discussion Release Discussions	75	6846	August 30, 2025
(Continuously Updated) My Real-Time Review of Grok 4 Discussions	52	4344	September 26, 2025
GPT 5 is really bad (at least in Cursor) Discussions	142	15968	January 13, 2026
Gemini 3.0 Pro - Out Now! Release Discussions	113	15495	December 6, 2025
Is it just me, or is GPT-5's logic for code incredible? Discussions	27	9342	September 28, 2025

Grok free on Cursor - Feedback needed

Related topics