I think I've found the problem: it's called underthinking.
In complex reasoning (like math) there's a known problem called underthinking, evidenced in this paper: https://arxiv.org/pdf/2501.18585
The paper uses TIP (a thought-switching penalty) applied at decoding time:
> We conducted the experiments using QwQ-32B-Preview, as the DeepSeek-R1-671B API does not allow for the modification of logits.
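For context, here's a minimal sketch of what a TIP-style penalty could look like as a Hugging Face `LogitsProcessor`. The token ids, penalty strength, and duration below are illustrative placeholders, not the paper's exact settings; the point is only that the technique works by editing next-token logits, which is exactly what the hosted R1-671B API doesn't expose.

```python
from transformers import LogitsProcessor

class ThoughtSwitchPenalty(LogitsProcessor):
    """Sketch of a TIP-style penalty: push down the logits of tokens that
    typically open a new line of thought (e.g. "Alternatively") so the model
    keeps digging into its current approach instead of switching."""

    def __init__(self, switch_token_ids, penalty=3.0, duration=512):
        self.switch_token_ids = switch_token_ids  # ids for "Alternatively", "Wait", ...
        self.penalty = penalty                    # illustrative strength, not tuned
        self.duration = duration                  # only penalize early decoding steps

    def __call__(self, input_ids, scores):
        # scores: (batch, vocab_size) next-token logits for the current step.
        # Counting from the start of input_ids (prompt included) is a simplification.
        if input_ids.shape[-1] < self.duration:
            scores[:, self.switch_token_ids] -= self.penalty
        return scores
```

You'd pass this to `model.generate(..., logits_processor=LogitsProcessorList([...]))` on a locally runnable model, which is why the authors had to use QwQ-32B-Preview instead of the R1-671B API.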
Searching further, I found this gist that explores the paper with AI and uses prompts instead of modifying logits: AI Prompt python programming usecase (TIE Methodology) · GitHub
A first try with a clean context shows a behavior difference:
They would produce different results in cases like these (sketched below):
* `event.start < segment.start` but close to the boundary
* Floating point precision edge cases near the `min_gap` threshold
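The original functions aren't included here, but a hypothetical pair showing the kind of boundary difference the model flagged might look like this (strict `<` vs `<=` at the `min_gap` threshold, plus the usual floating point surprises):

```python
# Hypothetical functions for illustration only; the ones from the actual
# comparison aren't shown in this post.

def overlaps_v1(event_start, segment_start, min_gap):
    # strict inequality: an event exactly at the threshold is NOT counted
    return segment_start - event_start < min_gap

def overlaps_v2(event_start, segment_start, min_gap):
    # non-strict inequality: an event exactly at the threshold IS counted
    return segment_start - event_start <= min_gap

# Exactly at the min_gap threshold the two disagree:
print(overlaps_v1(0.0, 0.5, 0.5))  # False
print(overlaps_v2(0.0, 0.5, 0.5))  # True

# Floating point precision can also flip results near the threshold:
print(0.1 + 0.2 <= 0.3)  # False, because 0.1 + 0.2 == 0.30000000000000004
```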
A second try using the mega-prompt (quoted below) shows no behavior difference:
Conclusion:
Both functions exhibit identical behavior across all possible input scenarios. The structural differences in condition checking are logically equivalent.
Try using this mega-prompt from the gist for complex reasoning tasks:
> Solve this problem using a focused and persistent approach. Begin by selecting a single line of reasoning and explore it as deeply as possible. Demonstrate sustained reasoning by meticulously showing every step of your logic, even if the solution path becomes challenging or complex. Prematurely changing your strategy is strongly discouraged. Do not use phrases that suggest a shift in approach, such as "Alternatively, " unless your current strategy is definitively and demonstrably incorrect. If, and only if, you have completely exhausted all possibilities within your initial approach and can prove it will not lead to a correct solution, you may then consider a new strategy. If a strategy change becomes absolutely necessary, provide a clear and detailed justification, explaining why the first approach failed before moving on. Your goal is to demonstrate thoroughness and depth in your reasoning process, prioritizing a complete exploration of a single approach over premature exploration of multiple approaches.
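If you want to bake the mega-prompt in via the API rather than pasting it into chat, a minimal sketch using the OpenAI-compatible Python client could look like the following. The base URL and model name are assumptions about DeepSeek's endpoint; swap in whatever provider you're actually calling.

```python
from openai import OpenAI

# Assumed endpoint and model name; adjust to your actual provider.
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_API_KEY")

# Paste the full mega-prompt quoted above; shortened here for brevity.
MEGA_PROMPT = """Solve this problem using a focused and persistent approach. ..."""

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "system", "content": MEGA_PROMPT},
        {"role": "user", "content": "Compare these two Python functions for behavioral differences: ..."},
    ],
)
print(response.choices[0].message.content)
```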
It's possible that DeepSeek chat automatically applies a thought-switching penalty for complex reasoning, since the model used by Cursor is the 671B one, as tested by this user (and confirmed by the Cursor team): Is Cursor using the full power of DeepSeek R1? - #23 by Kirai