I can use Opus 4.5/Sonnet 4.5 thinking at a cheaper price

Hi everyone,

We have to pay for both input and output tokens. Input is usually cheaper than output, and input can be cached while output cannot, so you always pay the full rate for every output token. If we instruct the Agent to minimize its output, we can save a lot of tokens. The rule I created aims to minimize the Agent’s output while keeping reasoning, logic, and code quality at their maximum, and to avoid wasteful token behaviors such as summarizing tasks after completion, creating unnecessary .md and .txt files, and giving lengthy, detailed explanations when nobody asked for them.
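To make the arithmetic concrete, here is a rough sketch of how output length affects the cost of a single request. The prices, the cached fraction, and the token counts below are made-up placeholders, not real rates for any model or plan, and `Prices` and `request_cost` are hypothetical names used only for illustration. Plug in the numbers from your own billing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Hypothetical prices in USD per million tokens (placeholders, not real rates)."""
    input_uncached: float = 3.0
    input_cached: float = 0.3
    output: float = 15.0


def request_cost(input_tokens: int, cached_fraction: float,
                 output_tokens: int, prices: Prices = Prices()) -> float:
    """Estimate the USD cost of one request.

    cached_fraction is the share of input tokens billed at the cached rate.
    Output tokens are always billed at the full output rate.
    """
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    total = (uncached * prices.input_uncached
             + cached * prices.input_cached
             + output_tokens * prices.output)
    return total / 1_000_000


# Same context, same cache hit rate; only the length of the reply differs.
verbose = request_cost(input_tokens=50_000, cached_fraction=0.9, output_tokens=2_000)
terse = request_cost(input_tokens=50_000, cached_fraction=0.9, output_tokens=500)

print(f"verbose reply: ${verbose:.4f}")
print(f"terse reply:   ${terse:.4f}")
print(f"saved:         {100 * (verbose - terse) / verbose:.1f}%")
```

With these placeholder numbers the shorter reply is noticeably cheaper, because the cached input is billed at a fraction of the normal rate while every output token is billed in full. How much you actually save depends on your real prices and on how much of each request is output.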

Import this rule and your Agent will save as many tokens as possible:

ACTION-FIRST PRINCIPLE

:high_voltage: MAXIMUM QUALITY + MINIMUM VERBOSITY:

MUST BE MAXIMUM:
:white_check_mark: Thinking depth & reasoning quality
:white_check_mark: Code quality & logic accuracy
:white_check_mark: Problem analysis & solution design
:white_check_mark: Strategic planning & risk assessment

MUST BE MINIMUM:
:cross_mark: Text responses (talking/explaining)
:cross_mark: Verbose descriptions
:cross_mark: Unnecessary elaboration
:cross_mark: Redundant explanations

RULE: Think DEEPLY, Speak BRIEFLY
→ Only explain in detail WHEN USER EXPLICITLY REQUESTS

TOKEN OPTIMIZATION RULES

:bullseye: CORE PRINCIPLE: Maximum Intelligence, Minimum Words

WHAT TO MAXIMIZE (Never compromise):
→ Thinking depth & cognitive effort
→ Code quality & correctness
→ Logic & reasoning accuracy
→ Solution completeness

WHAT TO MINIMIZE (Aggressively reduce):
→ Verbal output in responses
→ Explanations (unless asked)
→ Lists, summaries, descriptions
→ Any “filler” text

DOCUMENTATION:
:cross_mark: NEVER create .md/README files unless explicitly requested

RESPONSE STYLE:
→ Execute → Confirm briefly → Done
→ NO unnecessary elaboration
→ User will ask if they need details

BROWSER INTERACTION:
:warning: When browser encounters block/captcha → STOP, INFORM user, REQUEST access

thanks for sharing :ok_hand:

Thank you, I will use this.

I wonder if talking through things helps the agent.

I wonder if this actually helps? It seems this only forces the agent not to write long responses, which saves just a few tokens?

Less reading is nice, though, regardless of tokens. But I worry it could reduce effectiveness.

It’s actually effective at saving tokens. Read the rules carefully: I only ask the Agent to “talk” less, while maximizing the quality of its thinking. Basically, it should think thoroughly to solve the problem and then report back as concisely as possible.

The second reason is that I’m very dissatisfied with Agent responses that are too long, wasting my time reading and understanding them. I only want long and detailed responses when I request them.