Would like custom timeouts for model inference in chat

Feature request for product/service

Chat

Describe the request

Could there be a setting somewhere in the chat settings that lets me set an inference timeout per model, or at least a global one, with an option to make it infinite? I suspect Cursor may be timing out mid-inference even when the model is still working hard on something I've asked it to do and is simply taking a long time. Instead of a result, I get errors like the following:

```
Connection failed. If the problem persists, please check your internet connection or VPN
Stream closed with error code NGHTTP2_INTERNAL_ERROR [internal]
(Request ID: 31b63dfc-43f5-4115-86e5-e8c4d63006f6)
```

Admittedly the prompt is very large, but there is no way to make it smaller. A 1M-token model should easily be able to work through it, but I suspect the client, or something else upstream, is giving up because the model hasn't finished inferring within the expected timeframe.
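To illustrate the kind of setting I mean, something like the following in a settings file would cover both cases. This is purely a sketch of the requested feature, not an existing Cursor setting; the key names and values are hypothetical:

```jsonc
{
  // Hypothetical: global fallback timeout for chat inference, in seconds
  "chat.inferenceTimeoutSeconds": 600,

  // Hypothetical: per-model overrides; 0 (or "infinite") = never time out
  "chat.modelInferenceTimeouts": {
    "long-context-model": 0,
    "fast-model": 120
  }
}
```

A per-model override matters because long-context models legitimately take far longer on large prompts than smaller models do on typical ones.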