What does the thinking toggle actually do? And why the hell does resuming a thinking prompt also cost 2 fast requests?

Hello

I tried and failed to find a topic discussing this already, so sorry if this is a duplicate post.

I am a semi-pro Cursor user, so my question may be stupid, but…
What does the thinking toggle do?
Besides the little "thought process" at the start? ("Thought for 5 seconds")

I genuinely don't understand the point of it, since the model CAN and will output the same planning/summarizing of your request, just not inside a thought process,
if you ask it to summarize your request and make a detailed execution plan, which we probably all should do in basically every complex/big prompt.

Is it really just the short 'thought process' justifying the x2 cost of a thinking request??
I get that for some requests the thought process can help the AI plan better and more extensively, but if that's what you need for your task, you could just tell it to do that in the first place.
And most importantly, that thinking output is most often quite short, except for overly complicated math problems, which Claude is not even good at handling in my experience (compared to R1, for instance).
And even in those cases, it will not think much, and does most of the 'thinking' outside the little italic thought process.
And since it's relatively short, it can't be a significant cost justifying x2 request usage, compared to the tons and tons of token exchange that must go on in the backend for the code editing itself.

I am not seeing a noticeable difference in code/edit quality otherwise…
So, am I missing something? Does it in fact do more 'backend thinking' that we don't see, improving result quality, and am I just too dumb/inexperienced to notice it?

If that is the case, I would like some explanation, and/or to read others' opinions on the situation.

But assuming it's ONLY the little thought process that justifies the x2 cost, and that is all the thinking toggle does:
WHY THE HELL does RESUMING a request (after the 25 tool calls) COST 2 REQUESTS, just like the initial request?
I end up with a request having made like 30 tool calls in the end, COSTING 4 CREDITS.
Does that seem sensible to you?

Well, sorry for going a bit overboard, and I may be wrong anyway, so yeah…
Hoping to get y'all's answers and opinions.

Have a great day.

Resuming is sending a new prompt.

In the past we had to write 'Continue from last task' or similar into the prompt. Back then it was clear we were submitting a new prompt, but people asked for this to be simplified.
The Cursor team improved usability so that it's now a simple click instead.

As the original request is limited to 25 tool calls total in non-Max mode, it has to become a new prompt. You can't really expect it to be endless.

It's a very reasonable price, and perhaps not really profitable for Cursor, considering how much one prompt with 25 tool calls can do.

Cursor is very open and clear about this.


That's not really what I am annoyed about.
I am annoyed because that 'new request' via the resume button also costs 2 credits instead of one, even though it seems to not benefit in any way from the 2-credit cost, since the only thing the thinking toggle does is produce a little babbling at the start of the INITIAL request.

I do sometimes use the 'continue' method to just keep going after the 25 tool calls, if I know I am near the end and want to start working on something else immediately, to save a fast request.

For instance, I will say:
"Continue the ongoing task, and once done, start working on this next thing, of which the details are such […]"

But yeah, my point is not at all about the 25 tool calls, which, by the way, I think are very fair,
but about what the point of the thinking toggle is, and why the resume on a thinking request costs 2 credits if it's literally doing nothing different from a resume on a normal request. (Or, the 2-credit cost of the resume could be because the thinking toggle was still enabled; either way, I got no bang for the buck with the resume, and I won't test this further.)

Yes, whatever model, settings and so on you chose is continued on resume.

That could be 2 USD / request, or 1 / 2 / 4 / 8 requests per prompt or more, depending on the model.

Thinking is another issue. Are you using a prompt that correctly prompts for thinking mode, as required by Anthropic?

Yeah, I am not. Maybe that is the issue; I didn't even know that was a thing (to prompt specifically for thinking),
and I expect/expected the thinking toggle/mode to just make the model reflect on itself more, and so be """smarter"""

That IS the experience I had with DeepSeek R1, which was able to provide VERY impressive results on some complicated problems I gave it.
In those instances, the model wrote like a ton of text over nearly 10 minutes, but the result was, as I said, impressive.
As such, I am very unimpressed with the thinking performance of Claude in Cursor.

I will do some research on what you call "prompts for thinking mode as required by Anthropic", as it didn't even cross my mind that such lengths were required for the thinking toggle to do anything useful/noticeable,
and see if I find anything interesting.
Thank you for the knowledge, I will check that out.

If the thinking really does not do anything useful besides the babbling at the start of the initial request, it would only be fair not to make the subsequent resume request cost double.
And that means that using the manual "continue" method, without thinking enabled, should probably be considered instead, making the "quality of life" addition that is the resume button kinda, or rather, objectively bad.
But I do understand that they need to make more money, and so why they won't fix this, or allow users to use slow requests or fast requests at will depending on their needs/time constraints.

Here is perhaps more info on that prompting.

https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/extended-thinking-tips
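For context on what those docs describe: extended thinking is an explicit feature of the Anthropic Messages API, enabled per request with a `thinking` block and a token budget. Here is a minimal sketch of what such a request looks like, assuming the `anthropic` Python SDK; the model name is illustrative, and Cursor builds this for you behind the toggle, so this is only to show the knobs involved:

```python
def build_thinking_request(prompt: str, budget_tokens: int = 2048) -> dict:
    """Build Messages API parameters with extended thinking enabled.

    The "thinking" block opts in to the feature and caps how many tokens
    the model may spend reasoning before it writes the final answer.
    """
    if budget_tokens < 1024:
        # Anthropic documents a minimum thinking budget of 1024 tokens.
        raise ValueError("budget_tokens must be at least 1024")
    return {
        "model": "claude-3-7-sonnet-latest",  # illustrative model name
        "max_tokens": budget_tokens + 2048,   # must exceed the thinking budget
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


params = build_thinking_request("Plan the refactor before editing any files.")
# With the SDK this would be sent as:
#   client = anthropic.Anthropic()
#   response = client.messages.create(**params)
# and the response contains "thinking" content blocks before the final text.
```

The practical upshot of the linked tips is that the budget, not a magic phrase, controls how much visible reasoning you get, which may explain why a short prompt produces only a few seconds of "thought".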


Thank you very much :folded_hands:


If the expected output is less than 400 lines of code, probably don't use thinking. That's a decent enough rule of thumb right now.
