Where does the bug appear (feature/product)?
Cursor IDE
Describe the Bug
When using a custom model (qwen3-coder-plus) through a self-hosted LLM proxy server with an OpenAI-compatible endpoint (the proxy normalizes tool calls and works around known tool-invocation issues for this model), the agent is frequently stopped with the error “Unrecoverable agent model looping detected”, even though it is making legitimate progress on the task.
The issue appears when the model performs code analysis that requires many tool calls (searching the codebase, reading files), including calling several tools in parallel within a single turn.
Before each block of tool calls, the model produces a short explanation of what it is going to do next.
This explanatory text is often very similar or even identical across turns, while the actual tool calls in the same turn differ (different tools, different parameters, or multiple parallel tool invocations).
The loop detection mechanism seems to focus mainly on the similarity of this reasoning text and therefore incorrectly concludes that the agent is stuck in a loop, even though the sequence of tool calls is changing and the model is progressing through the codebase.
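Cursor's actual loop detector is not public, so the following is only an illustrative sketch of the suspected failure mode: a heuristic that compares the free-form preamble text between turns will score two turns as near-duplicates even when their tool calls are completely different. All names and the turn structure here are hypothetical.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a, b).ratio()


# Two consecutive agent turns: identical preamble, different tool calls.
# (Hypothetical turn format for illustration only.)
turn1 = {
    "text": "Let me search the codebase for duplicated logic.",
    "tool_calls": [("grep", {"pattern": "def parse"}),
                   ("read_file", {"path": "src/a.py"})],
}
turn2 = {
    "text": "Let me search the codebase for duplicated logic.",
    "tool_calls": [("grep", {"pattern": "def render"}),
                   ("read_file", {"path": "src/b.py"})],
}

# A text-only heuristic reports a "loop" even though the actions differ.
looks_like_loop = text_similarity(turn1["text"], turn2["text"]) > 0.9
actions_differ = turn1["tool_calls"] != turn2["tool_calls"]
```

Under this (assumed) heuristic, `looks_like_loop` is true and `actions_differ` is also true, which matches the behavior described above: the run is terminated on the strength of the repeated text alone.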
Steps to Reproduce
- Configure Cursor to use the custom model qwen3-coder-plus served by vLLM (or a similar self-hosted LLM proxy) via an OpenAI-compatible endpoint that supports Qwen models and normalizes and fixes tool invocation and tool result handling for this model.
- In Ask or Agent mode, give the model a task that requires extensive project-wide analysis, for example: “Analyze the project files and find code duplication that can be optimized.”
- Let the model run; observe that it starts invoking tools (often multiple tools in parallel in a single turn) and prints similar or identical explanatory text before each group of tool calls.
- After a few such turns, even though the tools and their parameters differ, the run is terminated with the error “Unrecoverable agent model looping detected”.
Expected Behavior
- The agent should be allowed to continue as long as the sequence of tool calls is actually progressing, including cases where several tools are invoked in parallel in a single turn.
- Loop detection should take into account which tools are being called, with what parameters, and how the set of files or queries changes over time, rather than relying mainly on the similarity of the explanatory text between tool calls.
- Repetitive reasoning text alone (for example, repeating the same description of the next high-level step) should not be enough to classify the behavior as an infinite loop if the underlying tool actions differ.
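The expected behavior above can be sketched as a detector keyed on tool-call signatures instead of reasoning text. This is a minimal illustration, not a proposal for Cursor's actual implementation; the turn format and function names are assumptions.

```python
import hashlib
import json


def turn_signature(tool_calls):
    """Stable fingerprint of one turn's tool calls (names + parameters),
    ignoring the free-form reasoning text entirely."""
    canonical = json.dumps(sorted(tool_calls), sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()


def is_looping(history, window=3):
    """Flag a loop only when the last `window` turns issued identical
    tool-call sets; repeated preamble text alone never triggers."""
    if len(history) < window:
        return False
    signatures = {turn_signature(turn) for turn in history[-window:]}
    return len(signatures) == 1


# Same explanatory text each turn, but the tool calls keep changing,
# so this history is treated as progress rather than a loop:
history = [
    [["grep", "def parse"], ["read_file", "src/a.py"]],
    [["grep", "def render"], ["read_file", "src/b.py"]],
    [["read_file", "src/c.py"]],
]
```

With this scheme, `is_looping(history)` is false for the progressing history above, while three turns with byte-identical tool-call sets would still be caught.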
Screenshots / Screen Recordings
Operating System
Windows 10/11
Current Cursor Version (Menu → About Cursor → Copy)
Version: 2.1.47 (user setup)
VSCode Version: 1.105.1
Commit: 2d3ce3499c15efd55b6b8538ea255eb7ba4266b0
Date: 2025-12-04T02:31:50.567Z
Electron: 37.7.0
Chromium: 138.0.7204.251
Node.js: 22.20.0
V8: 13.8.258.32-electron.0
OS: Windows_NT x64 10.0.17763
For AI issues: which model did you use?
Custom model qwen3-coder-plus served by vLLM (or a similar self-hosted LLM proxy) via an OpenAI-compatible endpoint that supports Qwen models and normalizes and fixes tool invocation and tool result handling for this model.
Additional Information
This issue has been reported with other custom models (Qwen family, Gemini, Grok). The loop detection logic needs to be refined for models that use structured reasoning patterns where explanatory text may be similar but underlying actions are distinct and progressive, especially when parallel tool calling is involved.
Does this stop you from using Cursor
No - Cursor works, but with this issue. It does not make using custom models impossible, but it significantly complicates and slows down the workflow for multi-step agent tasks.
