- Open the AI panel
- Toggle the model to o1-mini
- Ask for help
The o1-mini model returns the full response at once instead of streaming it gradually.
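For anyone who wants to verify this outside the editor, here is a minimal sketch using the OpenAI Python SDK (assuming an `OPENAI_API_KEY` with access to o1-mini; depending on when you run it, the API may reject `stream=True` for this model entirely). It timestamps each streamed chunk so you can see whether tokens arrive gradually or in one burst:

```python
import os
import time

from openai import OpenAI

# Assumes OPENAI_API_KEY is set in the environment.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

start = time.monotonic()
stream = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "Explain recursion in one paragraph."}],
    stream=True,
)

# If tokens stream normally, the printed times spread out over the response;
# if everything arrives at once, they all cluster at the same instant.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(f"{time.monotonic() - start:6.2f}s  {chunk.choices[0].delta.content!r}")
```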
I suspect this may be due to using OpenRouter under the hood.
That's a big conclusion jump - however, I use OpenRouter for my products, and they announced that while streaming is supported for o1-preview and o1-mini specifically, all the tokens come in at one time.
This is probably a wise decision, as the 20 RPM rate limit on their standard OpenAI account would not serve all of us!
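To see OpenRouter's behavior directly, you can point the same SDK at their OpenAI-compatible endpoint. A sketch, assuming an `OPENROUTER_API_KEY` and the `openai/o1-mini` model id:

```python
import os

from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API at this base URL.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# stream=True is accepted, but per OpenRouter's announcement the tokens
# for o1-preview/o1-mini arrive in a single burst rather than gradually.
stream = client.chat.completions.create(
    model="openai/o1-mini",
    messages=[{"role": "user", "content": "Say hello."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```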
I have a question: it seems that rate limits are designed to relieve server pressure by delaying request processing. For instance, with a 20 RPM limit, I would expect the server to process my request after a delay and then return tokens starting from the first one, rather than sending all tokens at once.
If I’m misunderstanding how RPM works, please correct me.
OpenAI has not yet enabled streaming for those models through the API.
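Until that changes, the safe client-side pattern is a plain non-streaming request, rendering the whole completion at once. A minimal sketch under the same assumptions as above:

```python
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# No stream=True: wait for the full completion, then render it in one go.
response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "Summarize streaming vs. non-streaming."}],
)
print(response.choices[0].message.content)
```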