When using claude w/ api key, howto increase/control max tokens in response?

unicomp21 · March 27, 2024, 2:28pm

I keep getting truncated responses, and doing “continue” gets all jumbled up.

debian3 · March 27, 2024, 2:53pm

You can’t

unicomp21 · March 27, 2024, 2:56pm

is this worth a feature request @truell20 ? seems like a pretty simple change?

l4time · March 27, 2024, 3:40pm

I have the same request, it would be great to control context length and/or max tokens when using our own API key

Cytranic · March 29, 2024, 12:43pm

The dev’s are getting greedy. Instead of just allowing us to change the dam max_response token count, they limit it first to useless levels. 1024 response tokens. This means if you have 500 tokens in instructions, you get a 500 token repsonse. 1000 instruction token system prompt and cursor does not work.

Why these guys did this is beyond me, they never used to limit tokens and for whatever reason they put a hardcoded cap at 1024, and its hidden somewhere either on their server end or compiled in a DLL so you CANNOT CHANGE IT. And they charge $20 a month for this broken shit./

fun_strange · March 30, 2024, 10:46am

+1 here, it would be really helpful to be able to specify a custom limit on our own API key or via the invoice system (like it currently works with more than 10 Claude requests). Sometimes certain problems need a lot of context in large applications for the AI to implement everything correctly, and it would be very helpful in those cases.

debian3 · March 30, 2024, 12:54pm

Or they could give the full 200k token limit on Haiku, cost wise it would still be 4 times cheaper than 1 opus request with 10k token limit.

HappyQuokka · April 1, 2024, 10:29pm

I think it should allow control of context size and what data gets sent to the ai endpoint, when it runs in “my own api keys” mode. but for ‘all inclusive’ as a service, with the monthly subscription, they’ll go broke if people start using 10 queries a day with Opus and up to 100-200k context it costs 0.6$ for a single query with 40k context for me on openrouter. you literally deplete 20$ in an hour, not in a day.

HappyQuokka · April 1, 2024, 10:32pm

btw this feature request is not new, they know we need it, let’s wait for updates. but yeah the limitations of ‘as a service’ mode are part of the service price, for 100$ a month they could afford 10 queries a day with large context, but nobody will subscribe. the Haiku and Sonnet modes should help, I’d guess this is all in development or being considered, I like the product so far.

ccccc · August 14, 2024, 7:11pm

Its basically useless without being able to have longer replies. The cursor AI window has significantly worse quality that just using the claude.ai.

fun_strange · August 14, 2024, 7:28pm

Have you tried the long context chat? It’s supposed to have no token limits

Topic		Replies	Views
[Feature Request] Add support for new Claude Sonnet 3.5 8K output token limit Feature Requests	2	347	July 18, 2024
Claude 3 Haiku with a larger context window Feature Requests	8	1825	April 10, 2024
Stop Limiting output tokens for my own API key Feature Requests	1	233	September 29, 2024
Please allow max number of tokens supported by the models Feature Requests	2	2190	August 5, 2024
Custom API Response Discussions	0	239	March 25, 2024

When using claude w/ api key, howto increase/control max tokens in response?

Related topics