When using Claude with my own API key, how do I increase or control the max tokens in the response?

I keep getting truncated responses, and typing “continue” leaves the output all jumbled up.

You can’t.

Is this worth a feature request, @truell20? It seems like a pretty simple change.

I have the same request. It would be great to control context length and/or max tokens when using our own API key.

The devs are getting greedy. Instead of just letting us change the damn max response token count, they limit it to useless levels: 1024 tokens. That means if you have 500 tokens of instructions, you get roughly a 500-token response. With a 1000-token system prompt, Cursor does not work.
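The arithmetic in this post can be sketched in a few lines of Python. This is only an illustration of the poster's claim: it assumes the 1024-token cap is shared between the instructions and the response, which is not confirmed anywhere in this thread, and `remaining_response_tokens` is a hypothetical helper, not part of Cursor or any API.

```python
# Sketch of the budget described in the post above.
# ASSUMPTION: the cap is shared by prompt and response, as the poster claims.
CAP = 1024  # figure quoted in the post, not confirmed by Cursor


def remaining_response_tokens(cap: int, prompt_tokens: int) -> int:
    """Tokens left for the reply once the prompt has used its share of the cap."""
    return max(cap - prompt_tokens, 0)


for prompt_tokens in (500, 1000, 1200):
    left = remaining_response_tokens(CAP, prompt_tokens)
    print(f"{prompt_tokens} prompt tokens -> {left} tokens left for the response")
```

Under that assumption a 500-token prompt leaves 524 tokens for the reply, and a 1000-token system prompt leaves only 24, which matches the "does not work" complaint.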

Why these guys did this is beyond me. They never used to limit tokens, and for whatever reason they put a hardcoded cap at 1024. It’s hidden somewhere, either on their server end or compiled into a DLL, so you CANNOT CHANGE IT. And they charge $20 a month for this broken shit.

+1 here, it would be really helpful to be able to specify a custom limit on our own API key or via the invoice system (like it currently works with more than 10 Claude requests). Sometimes certain problems need a lot of context in large applications for the AI to implement everything correctly, and it would be very helpful in those cases.

Or they could give the full 200k token limit on Haiku. Cost-wise it would still be 4 times cheaper than one Opus request with a 10k token limit.

I think it should allow control of context size and of what data gets sent to the AI endpoint when it runs in “my own API keys” mode. But for the ‘all inclusive’ service with the monthly subscription, they’ll go broke if people start running 10 queries a day with Opus and 100-200k context :see_no_evil: It costs me $0.60 for a single query with 40k context on OpenRouter. You literally deplete $20 in an hour, not in a day.
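For anyone who wants to check the numbers in this post, here is a rough sketch. It assumes an Opus input price of $15 per million tokens (the Claude 3 Opus list price as I understand it; verify against current pricing) and ignores output tokens, so real costs would be somewhat higher. `input_cost_usd` is a hypothetical helper.

```python
import math

# ASSUMPTION: Claude 3 Opus input price in USD per million tokens.
OPUS_INPUT_USD_PER_MTOK = 15.00


def input_cost_usd(tokens: int, usd_per_mtok: float) -> float:
    """Cost of the input (context) tokens only; output tokens are ignored."""
    return tokens * usd_per_mtok / 1_000_000


query_cost = input_cost_usd(40_000, OPUS_INPUT_USD_PER_MTOK)
queries_for_20_usd = math.floor(20 / query_cost)

print(f"One 40k-context query: ${query_cost:.2f}")
print(f"Queries covered by $20: {queries_for_20_usd}")
```

With those assumptions a 40k-context query comes to about $0.60, so $20 covers around 33 such queries, which is easy to burn through in an hour of active use.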

By the way, this feature request is not new. They know we need it, so let’s wait for updates. But yeah, the limitations of ‘as a service’ mode are part of the service price: for $100 a month they could afford 10 queries a day with large context, but nobody would subscribe. The Haiku and Sonnet modes should help. I’d guess this is all in development or being considered. I like the product so far.

It’s basically useless without being able to get longer replies. The Cursor AI window gives significantly worse quality than just using claude.ai.

Have you tried the long context chat? It’s supposed to have no token limits