I recently came across an experimental version of GPT-4o called GPT-4o Long Output, offered by OpenAI. This model can generate up to 64K tokens of output per request, significantly more than the standard models allow, which could open up exciting new use cases for longer, more detailed completions.
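For anyone wanting to try it, here is a minimal sketch of building such a request. Note the assumptions: the model ID `gpt-4o-64k-output-alpha` is the one the Long Output alpha was announced under, so verify it against the models actually available to your account, and the prompt is just a placeholder.

```python
import json

def build_long_output_request(prompt: str, max_tokens: int = 64_000) -> dict:
    """Build a chat-completions request body.

    max_tokens caps the *generated output*, which is a separate limit
    from the model's overall context window.
    """
    return {
        # Assumed model ID for the Long Output alpha; confirm before use.
        "model": "gpt-4o-64k-output-alpha",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Serialize the body; POST it to https://api.openai.com/v1/chat/completions
# with your API key (e.g. via the official openai client or urllib).
body = json.dumps(build_long_output_request("Write a very detailed report."))
print(body)
```

The key point is that `max_tokens` here governs output length only, which is why the 128K context window figure alone doesn't tell you how long a completion can be.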
Just adding 4o pricing for comparison in case it saves anyone some time, as I found myself researching related details to educate myself.
I couldn't find specific official docs on the maximum output tokens; the models page only lists a 128,000-token "context window". Various unofficial, anecdotal posts I have come across suggest the output limit is 4,096.