I read “We largely rely on 8k but sometimes use 32k” (OpenAI's API pricing - #4 by truell20) in August 2024.
Has the 32k context window been discontinued?
Also, what exactly happens when my prompt contains more tokens than the context window allows?
I noticed that no error is raised, so I presume the API uses embeddings to select only chunks of my prompt. If that is the case, I’d like the API to inform me (the user) when it happens.
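For reference, this is roughly how I’m checking token counts on my side with tiktoken before sending a request. The 8192-token limit below is just my assumption based on the quote above, not something I’ve confirmed in the docs:

```python
import tiktoken

CONTEXT_WINDOW = 8192  # assumed 8k limit, per the quote above


def check_prompt(prompt: str, model: str = "gpt-4") -> int:
    """Count the prompt's tokens locally and warn if it exceeds the assumed window."""
    enc = tiktoken.encoding_for_model(model)
    n_tokens = len(enc.encode(prompt))
    if n_tokens > CONTEXT_WINDOW:
        print(f"Prompt is {n_tokens} tokens, over the assumed {CONTEXT_WINDOW}-token window.")
    else:
        print(f"Prompt is {n_tokens} tokens, within the assumed window.")
    return n_tokens


check_prompt("What happens when a prompt exceeds the context window?")
```

Even with this client-side check, it would be good to know what the API itself does in that situation.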