Feature request for product/service
Chat
Describe the request
Could I please have a setting somewhere in the chat settings that lets me set an inference timeout per model, or even a global one, with a way to set it to infinite? I suspect Cursor is timing out mid-inference: the model is working hard on something I've asked it to do and is simply taking a long time, but I get errors like the following:
Connection failed. If the problem persists, please check your internet connection or VPN
Stream closed with error code NGHTTP2_INTERNAL_ERROR [internal]
(Request ID: 31b63dfc-43f5-4115-86e5-e8c4d63006f6)
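
For illustration, here is a minimal sketch (in TypeScript, with hypothetical names; I don't know Cursor's actual client code, only that it's HTTP/2 per the NGHTTP2 error above) of the failure mode I suspect: a fixed client-side deadline that aborts a healthy but slow stream.

```ts
// Sketch of a client-side inference timeout, assuming a streaming fetch.
// When the deadline fires, the stream is aborted even though the server is
// still generating tokens -- which would surface as an error like the above.
async function streamWithTimeout(
  url: string,
  timeoutMs: number | "infinite"
): Promise<string> {
  const controller = new AbortController();
  const timer =
    timeoutMs === "infinite"
      ? undefined // "infinite" means: never abort on the client side
      : setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(url, { signal: controller.signal });
    // Reading the body is where a long inference spends its time; a fixed
    // timeout kills exactly this phase on slow-but-healthy responses.
    return await response.text();
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```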
Admittedly the prompt is very large, but there is no way to make it smaller. A 1M-token model should easily be able to work through it; I suspect the client, or something else upstream, is giving up on the request because inference hasn't finished within the requisite timeframe.
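
To be concrete about the shape of setting I'm imagining, something along these lines (the key and model names below are entirely made up, just to show the idea):

```ts
// Hypothetical setting shape -- none of these keys exist in Cursor today.
interface InferenceTimeoutSettings {
  // Global default; "infinite" disables the client-side timeout entirely.
  inferenceTimeoutMs: number | "infinite";
  // Optional per-model overrides, keyed by model name.
  perModel?: Record<string, number | "infinite">;
}

// Example: a generous global timeout, none at all for a long-context model.
const settings: InferenceTimeoutSettings = {
  inferenceTimeoutMs: 600_000, // 10 minutes
  perModel: { "some-long-context-model": "infinite" },
};
```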