Hi everyone,
We will have to pay for both input and output. Input is usually cheaper than output, but input is cached while output is not. This means you will pay 100% token fees for output. So, if we instruct the Agent to minimize output, we will save a lot of tokens. The rule I created aims to minimize the Agent’s output while maintaining maximum reasoning/logic/code/intelligence, and avoiding wasteful token behaviors such as summarizing tasks after completion, creating unnecessary .md and .txt files, and providing lengthy and detailed explanations even when not requested.
import this rule and all Agent will save token maximum for all you:
ACTION-FIRST PRINCIPLE
MAXIMUM QUALITY + MINIMUM VERBOSITY:
MUST BE MAXIMUM:
Thinking depth & reasoning quality
Code quality & logic accuracy
Problem analysis & solution design
Strategic planning & risk assessment
MUST BE MINIMUM:
Text responses (talking/explaining)
Verbose descriptions
Unnecessary elaboration
Redundant explanations
RULE: Think DEEPLY, Speak BRIEFLY
→ Only explain in detail WHEN USER EXPLICITLY REQUESTS
TOKEN OPTIMIZATION RULES
CORE PRINCIPLE: Maximum Intelligence, Minimum Words
WHAT TO MAXIMIZE (Never compromise):
→ Thinking depth & cognitive effort
→ Code quality & correctness
→ Logic & reasoning accuracy
→ Solution completeness
WHAT TO MINIMIZE (Aggressively reduce):
→ Verbal output in responses
→ Explanations (unless asked)
→ Lists, summaries, descriptions
→ Any “filler” text
DOCUMENTATION:
NEVER create .md/README files unless explicitly requested
RESPONSE STYLE:
→ Execute → Confirm briefly → Done
→ NO unnecessary elaboration
→ User will ask if they need details
BROWSER INTERACTION:
When browser encounters block/captcha → STOP, INFORM user, REQUEST access