[Guide] Maximizing Coding Efficiency with MCP Sequential Thinking & OpenRouter AI

This is a really useful tutorial!
About Step 3: why do we need OpenRouter AI to write code? I actually found OpenRouter AI to be much slower than Cursor itself. Also, I only see calls to Sequential Thinking and never a call to the OpenRouter AI MCP. Is there any special setting required? Thank you!

I could not get Cursor to auto-call the OpenRouter AI MCP.
Here is my chat history link:
https://share.specstory.com/stories/0e90476e-132d-49a9-bf65-d7186bf107c1
I only see one call to Sequential Thinking and none to OpenRouter AI.

My configuration appears to be error-free, and the MCP service is healthy and callable.




3 Likes

@atalas I’ll be replying to the two messages right below the one I’m replying to. I just got home and am catching up on all the replies.

Those parameters are defined in the smithery-ai/server-sequential-thinking package. You don’t need to extend them yourself, but you can adjust their values to optimize the thinking process.

I’m currently working on revision and refinement, and on implementing proper workflow automation:

  • Error Analysis: Sequential thinking to understand error causes
  • Query Refinement: Sequential thinking to formulate effective searches

As for fetching from other databases, I’m still working on the items above.

Here are the parameters I’ve used and what they do (a configuration sketch follows the list):

  • maxDepth: Sets the maximum thinking depth; higher values suit more complex problems.
  • parallelTasks: Enables simultaneous processing of thoughts
  • enableSummarization: Automatically summarizes lengthy thought chains
  • thoughtCategorization: Clusters similar thoughts
  • progressTracking: Tracks progress of thought chains
  • dynamicAdaptation: Adjusts thinking strategies based on outcomes
  • contextWindow: Maximum processing context size (32768 tokens)
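
Here’s a minimal sketch of how I declare these in my mcp.json. The launch command is one common way to run the Smithery package, and the env-variable names are my own convention for this configuration layer rather than an official server schema, so treat it as illustrative:

```json
{
  "mcpServers": {
    "sequential-thinking": {
      "command": "npx",
      "args": ["-y", "@smithery/cli", "run", "@smithery-ai/server-sequential-thinking"],
      "env": {
        "MAX_DEPTH": "10",
        "PARALLEL_TASKS": "true",
        "ENABLE_SUMMARIZATION": "true",
        "THOUGHT_CATEGORIZATION": "true",
        "PROGRESS_TRACKING": "true",
        "DYNAMIC_ADAPTATION": "true",
        "CONTEXT_WINDOW": "32768"
      }
    }
  }
}
```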

Note: .specstory/ is totally optional.

I am continuing to work on a guide to retain as much as we can and to automate even further, but in a simple way.

Below is a demonstration of using OpenRouter with Sequential Thinking.

Note: I’ve been experimenting with a couple more MCP servers that automate even browsing via Playwright. I’ve added six new MCP servers to my current mcp.json file and am stress-testing it.

Current MCP Server Setup (a trimmed config sketch follows the list):

  • Sequential Thinking + OpenRouter form the cognitive core for reasoning
  • All-in-One Dev provides system access for direct automation
  • Exa Search + Documentation Server provide knowledge access
  • GitHub Integration + Fetch Content manage external resources
  • Playwright offers web automation capabilities
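
For reference, here is a trimmed sketch of the corresponding mcp.json. Apart from sequential-thinking, the package names and paths are placeholders for whichever server implementations you run:

```json
{
  "mcpServers": {
    "sequential-thinking": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-sequential-thinking"]
    },
    "openrouter": {
      "command": "node",
      "args": ["./servers/openrouter-mcp/index.js"],
      "env": { "OPENROUTER_API_KEY": "<your-key>" }
    },
    "exa-search": {
      "command": "npx",
      "args": ["-y", "exa-mcp-server"],
      "env": { "EXA_API_KEY": "<your-key>" }
    },
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp"]
    }
  }
}
```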

Based on your latest reply: I experienced the same issue yesterday when calling the OpenRouter API. It is fixed as of today. I will push the changes to the repository soon with a proper explanation, and I hope that fixes your latest issue with the implementation.

  • Within the .cursorrules context, there’s been a whole rewrite of the system and rule enforcement, which I’ve been testing and stress-testing with multiple queries to maintain consistency.
2 Likes

Thanks for the kind words, Big Nick.

Most services like OpenRouter, Anthropic, etc. have free tiers that are perfect for learning.

API costs are opaque when you’re starting out. Here’s the reality:

  1. Free tiers (OpenRouter) cover most learning/experimentation.
  2. Paid usage typically starts at ~$0.001–$0.01 per 1,000 tokens. For context, a small project might use 10K–50K tokens/month, i.e. roughly $0.01–$0.50 at those rates.
  3. Cursor Pro already includes Claude access—use that for now.

Typical projects can run $30 to $50 per month, depending on your requests and the features you use with Cursor, plus any other token-based usage.

If you’re just vibe-coding and learning, costs will likely stay under $5 per month. As projects grow, you’ll intuitively learn where to spend.


@sneo To address your question and clear up some mental fog there.

We use OpenRouter AI to strengthen the output of each call. By applying a logical, sequential approach, we get a more direct and focused result instead of loosely matched suggestions.

Specialized reasoning: the OpenRouter model used within the MCP server handles complex, multi-step logic, and combined with sequential thinking it is extremely powerful.

The guide uses it for edge cases where its reasoning depth matters and aims to get as close as possible to the user’s idea instead of some randomly generated code.

Can you take a screenshot of where those parameters are in the MCP source code? I can’t find where they are defined.

I looked at the sequentialthinking repository and at the package installed on my computer. Looking inside the code, I found only the definitions of these parameters:

```typescript
// Excerpt from the package source: the only thought fields it defines.
thought: data.thought,
thoughtNumber: data.thoughtNumber,
totalThoughts: data.totalThoughts,
nextThoughtNeeded: data.nextThoughtNeeded,
isRevision: data.isRevision,
revisesThought: data.revisesThought,
branchFromThought: data.branchFromThought,
branchId: data.branchId,
needsMoreThoughts: data.needsMoreThoughts,
```

When I searched for the `enableSummarization` parameter, I didn’t see it.


Many thanks!

2 Likes

@atalas

I have rewritten and optimized multiple things within the repository; once it’s fully implemented, I will push the repository.

To answer your question:

The parameters you’re referencing (thought, thoughtNumber, etc.) are part of the core sequential logic in the MCP server’s runtime. The enableSummarization flag and the other configuration flags you’re missing are likely defined in a separate configuration layer (e.g., mcp.config.js, environment variables, or the settings section of your mcp.json).

Here’s why they aren’t visible in the source code you inspected:

  • Dynamic Configuration: Parameters like enableSummarization are often injected at runtime via environment variables or config files, not hardcoded.

  • Modular Architecture: The MCP server separates logic (what you found in the code) from configuration (where these flags reside).

Knowing this: the parameters you listed (thought, thoughtNumber, etc.) are correct for tracking sequential state; your observation aligns with the system’s design. Flags like enableSummarization or thoughtCategorization simply live in a different layer: they aren’t hardcoded in that file because they belong to the MCP server’s configuration layer, which is user-made rather than server-made.


The MCP works in two layers:

Logic (this code):

  • Handles thought validation (validateThoughtData), history tracking, and output formatting.
  • Defines how sequential steps are processed (e.g., branching, revisions).

Configuration (your mcp.json):

  • Parameters like enableSummarization or contextWindow are not hardcoded there because they control server behavior, not individual workflows. Since these are user-made configuration parameters, the MCP server does not necessarily act on them, as far as I know (see the sketch after this list).
  • When adding parameters to our customized layer, our mcp.json, they must be defined in the correct layer: they belong to configuration, not runtime logic. This is the main reason we use the mcp.json file.
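
As an illustration of the dynamic-configuration idea, here is a hypothetical sketch of how a forked server could read such flags from the environment at startup. To be clear, none of this exists in the upstream sequential-thinking package; it only shows where a user-defined configuration layer would plug in:

```typescript
// Hypothetical sketch only: the upstream sequential-thinking package
// does not read these variables.
interface ServerConfig {
  maxDepth: number;
  parallelTasks: boolean;
  enableSummarization: boolean;
  contextWindow: number;
}

function loadConfig(): ServerConfig {
  // Values arrive via the "env" block of mcp.json; defaults apply otherwise.
  return {
    maxDepth: Number(process.env.MAX_DEPTH ?? 5),
    parallelTasks: process.env.PARALLEL_TASKS === "true",
    enableSummarization: process.env.ENABLE_SUMMARIZATION === "true",
    contextWindow: Number(process.env.CONTEXT_WINDOW ?? 32768),
  };
}

const config = loadConfig();
// MCP stdio servers log to stderr, since stdout carries the protocol.
console.error(`sequential-thinking config: ${JSON.stringify(config)}`);
```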

[Screenshots: parallelTasks, progressTracking, dynamicAdaptation, maxDepth, enableSummarization]

2 Likes

This new MCP for reasoning looks very promising.
It’s based on the one I was using before, but adds transformer and hybrid reasoning.

Probably overly complicated to incorporate into this, but it could be an idea for a similar rules setup (e.g. offloading reasoning to OpenRouter, etc.)

MCTS Reasoning

Use the /reason-mcts command followed by your query to start an MCTS-based reasoning chain

Beam Search Reasoning

Use the /reason-beam command for beam search-based reasoning

R1 Transformer Reasoning

Use the /reason-r1 command for single-step Transformer-based reasoning

Hybrid Reasoning

Use the /reason-hybrid command to combine Transformer and MCTS reasoning
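
For example, you would trigger one of these in chat like so (the query itself is just an illustration):

```
/reason-mcts How should I structure the caching layer for a read-heavy REST API?
```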

1 Like

I tried the MCP server mentioned at the beginning. It reasons, but I couldn’t make it execute the plan and actually write the code.

I think we need some video tutorials on this; it’s too complicated, and we need real-world proof of how it works when building an application. But many thanks for all your efforts. I will try it again later with your new findings, but for now this MCP stuff hasn’t really improved any of my code.

- `maxDepth`
- `parallelTasks`
- `enableSummarization`
- `thoughtCategorization`
- `progressTracking`
- `dynamicAdaptation`
- `contextWindow`

According to your explanation, I understand that these parameters can indeed be passed to the MCP service through the config.
However, the functionality corresponding to these parameters has not yet been implemented. Are you planning to implement them later in the sequential thinking MCP service?
Is that the correct understanding?
Thank you! :+1:

@AbleArcher

Currently testing it. Thanks for the approach; it’s working flawlessly alongside the other two MCP servers, without issues.


I have been planning to host it on AWS so it works in the cloud instead of locally, and to implement it later into the MCP, as I will be forking and improving the original thinking server into a more structured approach.

3 Likes

Great, I think you could use these open MCP proxies to fulfill it.

1 Like

Does that mean your MCP server will cost money?
Do you plan to publish the server side for local running?

Thanks for the share, I am going to take a deep look into this and see how we can approach the MCP properly.

Since it’s a small project that I’ve been running with different MCPs, something is being worked out in the background. Meanwhile, this guide will continue to evolve over time, including the rules for proper automation: it’s all about testing and failing over and over until the desired outcome is achieved, especially when it’s aimed at a community where it benefits everyone, not just myself.

Hello, kleosr.

I am a beginner programmer who loves to tinker, and I want to develop some small projects through cursor.

I think I have already invested at least 10 hours browsing this article and doing the follow-up configuration, and some MCP services are now embedded in my Cursor usage, but it is currently quite chaotic.

I hope you can sort it out from your perspective, from top to bottom; I am very much looking forward to that!

If possible, I hope to be a guinea pig and quickly join this game!

2 Likes

Hey there, shawn.

Here are some tips to avoid these chaotic issues.

  1. Establish a Baseline with One-Shot Prompts
  • Begin with a one-shot prompt to define clear context and expectations upfront. This mimics human collaboration and prevents Claude from making unwarranted assumptions by anchoring the interaction in a shared understanding. One-shot works perfectly with Claude (see the example after this list).
  2. Implement Step-by-Step Validation
  • Break tasks into smaller steps and validate outputs iteratively. This avoids overly complex solutions and ensures alignment with your goals at each stage, reducing the risk of miscommunication.
  3. Combine Zero-Shot + Token Mitigation
  • Use zero-shot prompting for simplicity where possible, paired with strict token limits to minimize hallucinations. This balances creativity with precision.
Check the Prompting Guide to learn more about prompting and how to keep the AI from raging through your project.
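
As a purely illustrative example of tip 1, a one-shot prompt might look like this (the project details are made up):

```
Context: Node/TypeScript REST API using Express and PostgreSQL.
Task: Add an endpoint GET /users/:id/orders returning that user's orders.
Constraints: reuse the existing db helper, validate :id as a positive
integer, and return 404 if the user does not exist.
Output: only the new route file plus any changed imports, no refactors.
```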

In addition:

@atalas V3.1 is out. Thanks for your help; your contributions have been tremendously helpful. (I added your username to the new push, by the way.)

1 Like

Great to know!
I think it could have a lot of potential.

Flash 2.0 Experimental and Flash 2.0 Thinking Experimental are both free, have 1M context, and ~200t/s throughput so could be great choices for rapid intensive reasoning (MCTS etc.)

The non-experimental Flash 2.0 is also very cheap and #2 behind Sonnet 3.7 for programming on OpenRouter (so it must be good.)

Btw, now that 0.48 has added ‘Custom Modes’ it could be possible to replicate a variation of this system using those. :thinking:

2 Likes

1. Pushed a PR :rocket:

  • According to the description, the tools mentioned in 008.mdc* are integrated in this PR.
  • Additionally, two tools have been added: jina_reader.py and jina_search.py.
  • Based on your previous MCP file, an mcp.json.example has been added.

2. Possibility of Integrating and Adding a Git Server :link:

One of the challenges I’m currently facing with coding is how to commit in real time. Here’s a link related to the topic:

3. Understanding the KleoSr Code Workflow with the Kleo Matrix Protocol :hammer_and_wrench:

As mentioned in your GitHub code, the Kleo Matrix Protocol is a structured workflow specifically designed for AI-assisted development. It aims to improve code quality, enhance development efficiency, and ensure project integrity. The protocol divides the software development process into five distinct phases, each with clear objectives and outputs to ensure transparency and traceability throughout.

Core Concept :sun:

The core idea of the Kleo Matrix Protocol is to break down complex development processes into ordered phases, each with specific goals and activities. This structured approach helps prevent common development pitfalls such as unclear requirements, improper architectural design, or inconsistent code implementation.

```mermaid
graph TD
    A[ANALYZE 📊] -->|Understand Requirements| B[CONCEPTUALIZE 💡]
    B -->|Design Architecture| C[BLUEPRINT 📘]
    C -->|Detail Design| D[CONSTRUCT 🏗️]
    D -->|Implement Code| E[VALIDATE ✔️]
    E -->|Verify Results| A

    style A fill:#f9d5e5,stroke:#333,stroke-width:2px
    style B fill:#eeac99,stroke:#333,stroke-width:2px
    style C fill:#e06377,stroke:#333,stroke-width:2px
    style D fill:#c83349,stroke:#333,stroke-width:2px
    style E fill:#5b9aa0,stroke:#333,stroke-width:2px
```

I think the process is fascinating! :thinking: Whether this whole process could itself form an MCP is an interesting question, as it requires users to think sequentially by entering specific commands. Moreover, these instructions could potentially be simplified, as I agree they seem a bit lengthy. Alternatively, you might consider formatting them in .mdc or integrating them into the Cursor notepad, which makes them easy to execute via @notepad (see the sketch after the command list below). :page_facing_up:

- "INITIATE ANALYZE PHASE"
- "INITIATE CONCEPTUALIZE PHASE"
- "INITIATE BLUEPRINT PHASE" 
- "INITIATE CONSTRUCT PHASE"
- "INITIATE VALIDATE PHASE"
1 Like

Thanks for the detailed insights; really appreciate it. I’ve been implementing the latest changelog and testing the new Cursor release, 0.48, which has optimized the rules significantly. @atalas is actively collaborating on GitHub with his latest pull, and we’re working towards refining the system further.

I’ve also considered the non-experimental Flash 2.0 for programming-heavy workflows, and the output quality on Claude in general. I’ve been trying various prompting techniques, and so far one-shot prompting works fascinatingly well without breaking the system or hallucinating. There’s a lot being done in the background that will end up being worth it.

Besides that, the addition of ‘Custom Modes’ in 0.48 opens up some exciting possibilities. I’m drafting a variation of the system to leverage those capabilities; it could streamline the entire workflow even further.

Will keep this thread posted as we progress.


@atalas Regarding Python in the environment: for Jina Reader and Jina Search, I’ve been experimenting with running them in a local setup rather than server-based (I’ve been doing some implementation on AWS as well). This approach helps maintain chat-memory consistency and reduces bias drift while keeping the planning phase more structured and clean. It’s still early, but the results look promising.

The sequential thinking framework could itself evolve into an MCP, especially as we refine the command structure and make it more intuitive for the community. I’ll experiment with formatting the commands in .mdc or integrating them directly into Cursor’s notepad via @notepad, and see how that impacts usability.

I’ll start drafting a prototype for this Git-MCP integration and share updates.
Let me know if you have specific use cases.

Note: I’ve been training my own local model to give me proper prompt usage with Cursor. It integrates nicely with Claude 3.5 and 3.7 Sonnet, producing output from one-shot prompting. I’ve crawled 75+ pages to index every detail and the advantages/disadvantages of the right prompting techniques; please refer to the following documentation for more about how I’ve trained my agent.

kleosr_prompting004.pdf (218.4 KB)

3 Likes

Great thread; I will follow updates, as this can become a great framework, and some suggestions are already useful. On prompting, you’ll really like these papers:
(SC-DIR) [2402.03667] Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated Reasoning
(TR + weighted voting > CoT, ToT) [2412.19707] Toward Adaptive Reasoning in Large Language Models with Thought Rollback
(5 personas > CoT) [2502.15725] Town Hall Debate Prompting: Enhancing Logical Reasoning in LLMs through Multi-Persona Interaction
I’m particularly interested in Personas, which shows improvements of around +10%. That makes sense, as specific expertise narrows the context and helps models better match user requirements; unfortunately, frameworks like crewai or npcsh don’t currently offer an MCP.

2 Likes

Amazing work! Thank you :folded_hands:

1 Like