Max token output in chats with API key

Cursor has been extremely helpful for me. However, due to my specific use case, I need to extract even more from it, even if that means paying $20 per day for the API.

I’ve observed that chat outputs are sometimes cut off, requiring a manual “continue”. Could this be caused by a `max_tokens` parameter sent with the API request? I understand that this limit makes sense for the subscription, but is there a way to customize or bypass it when using a user-provided API key?
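For what it’s worth, if the requests do cap output with a `max_tokens` parameter, a truncated reply is normally signalled by `finish_reason == "length"` in the OpenAI-style response (as opposed to `"stop"` for a natural finish). A minimal sketch of checking for this — the response shape follows the public OpenAI Chat Completions format; the helper name is my own:

```python
def is_truncated(response: dict) -> bool:
    """Return True if the model stopped because it hit the max_tokens cap
    (finish_reason == "length") rather than finishing naturally ("stop")."""
    return any(
        choice.get("finish_reason") == "length"
        for choice in response.get("choices", [])
    )

# Example response fragments in the OpenAI Chat Completions format:
truncated = {"choices": [{"message": {"content": "...partial code"},
                          "finish_reason": "length"}]}
complete = {"choices": [{"message": {"content": "full answer"},
                         "finish_reason": "stop"}]}

print(is_truncated(truncated))  # True
print(is_truncated(complete))   # False
```

If Cursor’s cut-off replies come back with `finish_reason == "length"`, that would confirm a `max_tokens` cap is the cause rather than, say, a network timeout.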

Moreover, I understand that applying code suggested in the chat requires a non-public model that comes with the subscription and is not available with just the API. Since I now have both the subscription and an API key, would that feature work? If so, it would save me a significant amount of time on copy/pasting and visual comparison.

It would also be nice if I didn’t have to manually turn off “using key” when entering interpreter mode, since that mode can only be used with the subscription’s GPT model anyway.