Which is less expensive? Keys or not?

I sometimes switch into Claude’s MAX mode for big, hairy prompts. 40-60 tool calls, minimum. Make no mistake, I’m happy. It does a great job.

But I wonder - would it be less expensive if I got a key and paid directly? I’m not even sure how that works, much less if it’s a smart option or not.

Adult supervision?


Using Cursor with your own API key is more expensive.


Not if you use Gemini’s Free Tier.

Is it? I thought they charge the API costs + 20%? So wouldn’t just the API cost be less with your own key?

Short guide on this? It’s just a max # of requests per day, right?

Yeah, but here are the limits on the Gemini API free tier…

(rate-limit screenshots for Gemini 2.5 Pro and Gemini 2.5 Flash 05-20)

But the limits reset daily, so you get 525 requests per day in total, which is more than enough, and once the models are out of preview the request limits usually increase a decent chunk.
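For anyone unsure how paying directly works: you generate a key in Google AI Studio and call the API yourself. Here is a minimal sketch using Google’s `google-generativeai` SDK (the model ID below is an assumption; check what the free tier currently exposes):

```python
# Minimal sketch of using your own Gemini API key directly.
# The model ID is an assumption; substitute the current free-tier ID.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key generated in Google AI Studio
model = genai.GenerativeModel("gemini-2.5-flash-preview-05-20")  # assumed ID
response = model.generate_content("Explain this stack trace: ...")
print(response.text)
```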

The paradox is that MAX is sometimes cheaper.
MAX uses anywhere between 0 and n credits. In general, MAX is better if you dare to use it.

Following is part of my MAX usage:

(usage table screenshot)

As you can see in the table, MAX can solve an issue for less than 1 request’s cost, while Claude-3.7-Thinking alone costs 2 requests.

The money is spent faster with MAX and the tasks are solved equally fast.


How much $ is 0.7 requests?


One request is 0.04 USD, so 0.7 requests is approximately 0.028 USD.
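In code form the conversion is a single multiplication; a tiny sketch using the 0.04 USD-per-request figure above:

```python
# Convert a (possibly fractional) MAX request count to USD,
# at the $0.04-per-request rate quoted above.
COST_PER_REQUEST = 0.04  # USD

def request_cost(requests: float) -> float:
    return requests * COST_PER_REQUEST

print(request_cost(0.7))  # 0.028 USD
```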


It charged me 5.1 requests. That’s around $0.20, very expensive for 2k+ lines.

@gprethesh You need to check which tasks work well with MAX and how to prompt for MAX so it doesn’t need so much. Some tasks are longer and need more tokens/requests, and others are less complex and need less.

You definitely have to be careful about how much context you attach. Don’t attach context unnecessarily, as context = tokens that cost.
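As a rough self-check before attaching a big file, you can count its tokens locally. A sketch using OpenAI’s `tiktoken` as a stand-in tokenizer (Claude and Gemini tokenize differently, so treat the number as a ballpark):

```python
# Ballpark token count for a file you are about to attach as context.
# tiktoken is OpenAI's tokenizer; other models' counts will differ somewhat.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def estimate_tokens(path: str) -> int:
    with open(path, encoding="utf-8") as f:
        return len(enc.encode(f.read()))

print(estimate_tokens("src/parser.py"))  # hypothetical file path
```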


The same task might cost you more with MAX disabled.


Good point.

Lengthy rules must be removed. They confuse more than they help.

You can also drop stop-words from your prompts to make them even shorter.

Here are some of them:
the, a, an, in, at, on, by, for, to, from, with, about, as, is, are, was, were, be, being, been, has, have, had, of, and, or, but, not, that, this, these, those.

The above words are mostly ignored.

This is my only rule (don’t use it directly, it’s just for inspiration):

### **1. Sanity Check**

* Validate logic, coherence

* Detect contradictions, bias, paradoxes → pause, request clarification

* Break down complex requests → smaller sub-tasks

---

### **2. Environment Setup**

* Activate Python environment:

`source drp_venv/bin/activate.fish`

* Run tests:

`clear && source drp_venv/bin/activate.fish && pytest -xsvv`

---

### **3. Execution Pipeline (`!EPIPE`)**

* If valid → proceed:

1. Analyze request, create requirements

2. Research options, meet requirements

3. Develop concise solution plan (simple > lengthy)

4. Save plan → `./docs` (DOC-FILE)

5. List tasks → `./tasks` (TASK-FILE)

6. Implement via TDD, step-by-step

7. Tests → must pass per step

8. Check off completed tasks → TASK-FILE

9. All tests pass → accept

10. Mark done → TASK-FILE

11. Use required MCP tools

12. Update DOC-FILE → list created/modified files

13. Repeat as needed

* `!EPIPE` → refers to Execution Pipeline

---

### **4. MCP Tool Usage**

* `MCP Graphiti` → memory storage, retrieval

* `MCP Sequential Thinking` → structured planning

* `MCP CodeScan` → code queries

---

### **5. Function Size Rule**

* Functions <50 lines → focused, testable

* One responsibility per function

---

### **6. File Size Rule**

* Files <500 lines → split by concern

* Extract logic → modules, utils, features

* Favor readability, reuse, modularity

---

```python
# Common English stopwords for content processing
STOPWORDS = {
    "a", "an", "the", "and", "or", "but", "if", "then", "else", "when",
    "at", "by", "for", "with", "about", "against", "between", "into",
    "through", "during", "before", "after", "above", "below", "from",
    "up", "down", "in", "out", "on", "off", "over", "under", "again",
    "further", "once", "here", "there", "where", "why",
    "how", "all", "any", "both", "each", "few", "more", "most", "other",
    "some", "such", "no", "nor", "not", "only", "own", "same", "so",
    "than", "too", "very", "s", "t", "can", "will", "just", "don",
    "should", "now", "to", "of", "is", "as", "that", "this", "have", "has",
}
```
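As a rough illustration of the filtering idea (my own sketch, not part of the rule above), `shorten_prompt` is a hypothetical helper that strips those words from a prompt:

```python
# Hypothetical helper: drop stopwords from a prompt before sending it.
def shorten_prompt(prompt: str) -> str:
    return " ".join(w for w in prompt.split() if w.lower() not in STOPWORDS)

print(shorten_prompt("Summarize all of the errors in this file"))
# -> "Summarize errors file"
```

Note that this naive whitespace split won’t catch words with attached punctuation (e.g. “the,” would slip through).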

Great post! Yeah, it almost needs a separate write-up on MAX mode best practices.


One more factor that drives the number of requests higher is lengthy files.
To fix anything, the LLM reads the whole file, right? The lengthier the files, the more tokens and the more confusion, and consequently a higher number of requests and more cost.

It could be an advantage to state in the rule the max length of a file and the max length of each function; see the sketch after this post.

Also keep the tests atomic, except integration tests.
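A minimal sketch of how such limits could be checked mechanically before attaching files (my own illustration; the 50-line and 500-line thresholds come from the rule posted earlier):

```python
# Flag functions and files that exceed the size limits from the rule above,
# using only the standard library's ast module.
import ast
import sys

MAX_FUNC_LINES = 50   # per the "Function Size Rule"
MAX_FILE_LINES = 500  # per the "File Size Rule"

def check(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        source = f.read()
    total = source.count("\n") + 1
    if total > MAX_FILE_LINES:
        print(f"{path}: {total} lines (limit {MAX_FILE_LINES})")
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            span = node.end_lineno - node.lineno + 1
            if span > MAX_FUNC_LINES:
                print(f"{path}:{node.lineno} {node.name}() is {span} lines")

if __name__ == "__main__":
    for p in sys.argv[1:]:
        check(p)
```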
