Which is less expensive? Keys or not?

I sometimes switch into Claude’s MAX mode for big, hairy prompts. 40-60 tool calls, minimum. Make no mistake, I’m happy. It does a great job.

But I wonder - would it be less expensive if I got a key and paid directly? I’m not even sure how that works, much less if it’s a smart option or not.

Adult supervision?


Using Cursor with your own API key is more expensive.


Not if you use Gemini’s Free Tier.

Is it? I thought they charge the API costs + 20%? So wouldn’t just the API cost be less with your own key?

Short guide on this? It’s just a max # of requests per day, right?

Yeah, but here are the limits on the Gemini API free tier…

(rate-limit screenshots for Gemini 2.5 Pro and Gemini 2.5 Flash 05-20)

But the limits reset daily, so you get 525 requests per day in total, which is more than enough, and once the models are out of preview the request limits usually increase a decent chunk.
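For anyone unsure how paying directly works: you generate a key in Google AI Studio and call the API yourself. Here is a minimal sketch using Google’s `google-generativeai` SDK (the model ID below is an assumption; check what the free tier currently exposes):

```python
# Minimal sketch of using your own Gemini API key directly.
# The model ID is an assumption; substitute the current free-tier ID.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key generated in Google AI Studio
model = genai.GenerativeModel("gemini-2.5-flash-preview-05-20")  # assumed ID
response = model.generate_content("Explain this stack trace: ...")
print(response.text)
```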

The paradox is that MAX is sometimes cheaper.
MAX uses anywhere between 0 and n credits. In general, MAX is better if you dare to use it.

Following is part of my MAX usage:

(usage table screenshot)

As you can see in the table, MAX can solve an issue for less than 1 request’s cost, while Claude-3.7-Thinking alone costs 2 requests.

The money is spent faster with MAX and the tasks are solved equally fast.


How much $ is 0.7 requests?


One request is 0.04 USD, so 0.7 requests is approximately 0.028 USD.
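In code form the conversion is a single multiplication; a tiny sketch using the 0.04 USD-per-request figure above:

```python
# Convert a (possibly fractional) MAX request count to USD,
# at the $0.04-per-request rate quoted above.
COST_PER_REQUEST = 0.04  # USD

def request_cost(requests: float) -> float:
    return requests * COST_PER_REQUEST

print(request_cost(0.7))  # 0.028 USD
```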


It charged me 5.1 requests. That’s around $0.20, very expensive for 2k+ lines.

@gprethesh You need to check which tasks work well with MAX and how to prompt for MAX so it doesn’t need so much. Some tasks are longer and need more tokens/requests, and others are less complex and need less.

You definitely have to be careful about how much context you attach. Don’t attach context unnecessarily, as context = tokens that cost.
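As a rough self-check before attaching a big file, you can count its tokens locally. A sketch using OpenAI’s `tiktoken` as a stand-in tokenizer (Claude and Gemini tokenize differently, so treat the number as a ballpark):

```python
# Ballpark token count for a file you are about to attach as context.
# tiktoken is OpenAI's tokenizer; other models' counts will differ somewhat.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def estimate_tokens(path: str) -> int:
    with open(path, encoding="utf-8") as f:
        return len(enc.encode(f.read()))

print(estimate_tokens("src/parser.py"))  # hypothetical file path
```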


The same task might cost you more with MAX disabled.


Good point.

Lengthy rules must be removed. They confuse more than they help.

You can also drop stop-words from your prompts to make them even shorter.

Here are some of them:
the, a, an, in, at, on, by, for, to, from, with, about, as, is, are, was, were, be, being, been, has, have, had, of, and, or, but, not, that, this, these, those.

The above words are mostly ignored.

This is my only rule (don’t use it directly, it’s just for inspiration):

### **1. Sanity Check**

* Validate logic, coherence

* Detect contradictions, bias, paradoxes → pause, request clarification

* Break down complex requests → smaller sub-tasks

---

### **2. Environment Setup**

* Activate Python environment:

`source drp_venv/bin/activate.fish`

* Run tests:

`clear && source drp_venv/bin/activate.fish && pytest -xsvv`

---

### **3. Execution Pipeline (`!EPIPE`)**

* If valid → proceed:

1. Analyze request, create requirements

2. Research options, meet requirements

3. Develop concise solution plan (simple > lengthy)

4. Save plan → `./docs` (DOC-FILE)

5. List tasks → `./tasks` (TASK-FILE)

6. Implement via TDD, step-by-step

7. Tests → must pass per step

8. Check off completed tasks → TASK-FILE

9. All tests pass → accept

10. Mark done → TASK-FILE

11. Use required MCP tools

12. Update DOC-FILE → list created/modified files

13. Repeat as needed

* `!EPIPE` → refers to Execution Pipeline

---

### **4. MCP Tool Usage**

* `MCP Graphiti` → memory storage, retrieval

* `MCP Sequential Thinking` → structured planning

* `MCP CodeScan` → code queries

---

### **5. Function Size Rule**

* Functions <50 lines → focused, testable

* One responsibility per function

---

### **6. File Size Rule**

* Files <500 lines → split by concern

* Extract logic → modules, utils, features

* Favor readability, reuse, modularity

---

```python
# Common English stopwords for content processing
STOPWORDS = {
    "a", "an", "the", "and", "or", "but", "if", "then", "else", "when",
    "at", "by", "for", "with", "about", "against", "between", "into",
    "through", "during", "before", "after", "above", "below", "from",
    "up", "down", "in", "out", "on", "off", "over", "under", "again",
    "further", "once", "here", "there", "where", "why",
    "how", "all", "any", "both", "each", "few", "more", "most", "other",
    "some", "such", "no", "nor", "not", "only", "own", "same", "so",
    "than", "too", "very", "s", "t", "can", "will", "just", "don",
    "should", "now", "to", "of", "is", "as", "that", "this", "have", "has",
}
```
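As a rough illustration of the filtering idea (my own sketch, not part of the rule above), `shorten_prompt` is a hypothetical helper that strips those words from a prompt:

```python
# Hypothetical helper: drop stopwords from a prompt before sending it.
def shorten_prompt(prompt: str) -> str:
    return " ".join(w for w in prompt.split() if w.lower() not in STOPWORDS)

print(shorten_prompt("Summarize all of the errors in this file"))
# -> "Summarize errors file"
```

Note that this naive whitespace split won’t catch words with attached punctuation (e.g. “the,” would slip through).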

Great post! Yeah, it almost needs a separate write-up on MAX mode best practices.


One more factor that drives the number of requests higher is lengthy files.
To fix anything, the LLM reads the whole file, right? The lengthier the files, the more tokens and the more confusion, and consequently a higher number of requests and more cost.

It could be an advantage to state in the rule the max length of a file and the max length of each function; see the sketch after this post.

Also keep the tests atomic, except integration tests.
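A minimal sketch of how such limits could be checked mechanically before attaching files (my own illustration; the 50-line and 500-line thresholds come from the rule posted earlier):

```python
# Flag functions and files that exceed the size limits from the rule above,
# using only the standard library's ast module.
import ast
import sys

MAX_FUNC_LINES = 50   # per the "Function Size Rule"
MAX_FILE_LINES = 500  # per the "File Size Rule"

def check(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        source = f.read()
    total = source.count("\n") + 1
    if total > MAX_FILE_LINES:
        print(f"{path}: {total} lines (limit {MAX_FILE_LINES})")
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            span = node.end_lineno - node.lineno + 1
            if span > MAX_FUNC_LINES:
                print(f"{path}:{node.lineno} {node.name}() is {span} lines")

if __name__ == "__main__":
    for p in sys.argv[1:]:
        check(p)
```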
