Hi, Cursor’s new Pro plan is really awesome because, with this tool, I can use Claude 4 Sonnet MAX for days without hitting my rate limit.
This tool creates a chat channel through the tool call itself, instead of chatting in the live chat panel. That actually consumes fewer tokens than sending a real message in the main chat channel; you can read more about this on Anthropic’s page.
The tool is for interaction only, with rules set up for Claude to follow strictly. The tool’s rules reinforce its user guide, so Claude is more confident about following them, for example: always restart the tool so the user can continue chatting, and always show a thinking block with the antml:thinking tag.
By applying these rules, each of Claude’s responses is a reasoned response, which keeps the quality of the result high. There is no API or other AI model inside this tool; it relies entirely on the rule context (mode_specific_rule) in Claude’s system prompt to operate. The better the agent is at following rules, the more effective AI-interaction becomes.
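To make the mechanism concrete, here is a minimal sketch of how such an interaction tool could be wired up as an MCP server. This is only my own illustration, assuming the official `mcp` Python SDK and a plain tkinter dialog as the side channel; the server name, the `ask_user` tool name, and the prompt handling are placeholders, not the actual AI-interaction implementation.

```python
# Minimal sketch of an "ask the user inside the agent run" MCP tool.
# Assumptions: the official `mcp` Python SDK (FastMCP) and a tkinter
# dialog as the side channel; the real AI-interaction tool may differ.
from tkinter import Tk, simpledialog

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ai-interaction-sketch")  # hypothetical server name


@mcp.tool()
def ask_user(question: str) -> str:
    """Show the agent's question in a dialog and return the user's reply.

    Because the reply comes back as a tool result inside the same agent
    request, the follow-up never becomes a new message in the main chat,
    which is where the token saving described above comes from.
    """
    root = Tk()
    root.withdraw()  # no main window, just the input dialog
    reply = simpledialog.askstring("AI-interaction", question)
    root.destroy()
    # An empty reply signals the agent to stop instead of restarting the tool.
    return reply or "(user closed the dialog - stop the loop)"


if __name__ == "__main__":
    # Cursor launches this over stdio; stdin/stdout carry the MCP protocol,
    # which is why user input goes through a dialog instead of input().
    mcp.run()
```

The rule context (mode_specific_rule) then only needs to tell Claude to call the tool again after every answer; the loop depends on rule-following, not on any extra model.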
Regarding forgetting context: I have used AI-interaction for many days, across many long conversations, to the point that Claude-4-sonnet-thinking forgot the earlier context, yet it still did not forget to restart the chat channel so I could continue my request. This shows that even when Claude forgets the context, it still follows the rule very well.
One thing I forgot: there is a bug that often happens with Claude when I use this tool. I have to press “resume” to continue chatting with Claude, and after that the thinking blocks are no longer used properly. Everything still works, but this bug is annoying.
The reason I use <.a.n.t.m.l.:.t.h.i.n.k.i.n.g.> is that when I write the tag out explicitly, Claude does not seem to be able to see “<.antml:thinking.>” in my rule file at all. I asked Claude, and it told me that what it sees in the rule I set up is an empty “” tag rather than “<.antml:thinking.>”; and if Claude deliberately tries to talk about the “<.antml:thinking.>” tag, it immediately hits a very bad bug (either the conversation stops, or it writes nothing instead of <.antml:thinking.>). I tested this many times and I am sure Claude cannot print out the string “<.antml:thinking.>” if you ask it to. So using <.a.n.t.m.l.:.t.h.i.n.k.i.n.g.> is the only way to avoid this bug.
This error is definitely not because my rule is too long or repetitive; it comes from the thinking block tag <.antml:thinking.>.
If I remove the part related to <.antml:thinking.>, I never get this error again, but then Claude does not apply the thinking block to reason while executing my request, which reduces the quality of its responses. As for the bug I encountered, I think it is just a UI bug; Claude itself is fine.
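As a small illustration (my own sketch, not part of the tool or the rule file), the dot-separated form carries the same information as the real tag while never containing it as one contiguous string, which is what lets it survive in the rule file:

```python
# The rule file only ever stores the dot-separated form of the tag name,
# so the literal tag never appears as one contiguous string in the rule.
obfuscated = "<.a.n.t.m.l.:.t.h.i.n.k.i.n.g.>"

# Removing the dots reconstructs the real tag; the expected value is built
# from pieces so the literal stays out of this file as well.
reconstructed = obfuscated.replace(".", "")
expected = "<" + "antml" + ":" + "thinking" + ">"
assert reconstructed == expected
```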
Pros: permits multiple question-answer iterations
Cons: quality degrades as the reasoning token budget shrinks with each iteration
My experience: perfect for quick fixes after a refactor, mostly good for debugging back-and-forth; I avoid it for planning or reasoning-intensive tasks
When using AI-interaction and requiring the agent to always reason before responding, I don’t feel that Claude-4-sonnet-thinking gets weaker as the number of chats increases. On the contrary, thanks to “thinking”, Claude-4-sonnet-thinking even detects errors on its own, stops, and switches to another solution. For example: “Wait! I see that this method is not good because it still causes errors A, B, C. So method D is the best; I will implement method D.”
Degradation caused by the reasoning token budget is not debatable. You’re right that it works in most cases, but not where complexity arises, and each iteration degrades the result. In your example the model gives you 4 options, which means it has already reasoned over those 4 options; once you start making it reason about other solutions, you’ll see it degrade. Check this image from a scientific paper:
Does that mean that the longer the chat, the lower the quality of the responses? Or that the same request, after too many replies through AI-interaction, gradually loses quality? In the end, the article is just an article; it is not necessarily true unless it comes from Anthropic, which would confirm it 100% because they are the creators of Claude-4-sonnet-thinking. The chart you are looking at is for other models, so it is not certain that Claude-4-sonnet-thinking behaves the same way. And even if it does, it is still Claude-4-sonnet-thinking, an extremely powerful model, so even with degraded quality it is still more than enough to surpass Claude-3.7-sonnet-thinking and other models.
And in my case, Claude doesn’t work in an “I already have a reason to choose D” way; it reasoned while it was answering the user, not by thinking the whole thing through and then answering all at once. Here is a fuller example: Claude is implementing option A, and while doing so it says “WAIT! This still causes error B. I’ll change the code to implement option B.” I can confirm that option A really does cause error B, because I fully understand the code Claude is implementing; it’s my code. Claude really reasoned while answering; it did not pre-compute its answer, as if it could not predict what its own next sentence would be.
Hi @danperks, I don’t know if there is a problem with o3, but its ability to follow rules is really poor, completely inferior to Claude-4-sonnet-thinking and Gemini 2.5 Pro. Is it because of the agent or because of Cursor?