This is a really useful tutorial!
Step 3: why do we need OpenRouter AI to write code? I actually found that OpenRouter AI is much slower than Cursor itself. In addition, I only see calls to Sequential Thinking; I could not see any call to the OpenRouter AI MCP. Is there any special setting required? Thank you!
I could not see Cursor automatically call the OpenRouter AI MCP.
Here is my chat history link:
https://share.specstory.com/stories/0e90476e-132d-49a9-bf65-d7186bf107c1
I only see one call to Sequential Thinking, and none to OpenRouter AI.
My configuration appears to be error-free, and the MCP service is running normally and can be called.
@atlas I'll be replying to the two messages right below the one I'm replying to. I'm just getting home and catching up on all the replies.
Those parameters are defined in the smithery-ai/server-sequential-thinking package. You don't need to extend them yourself, but you can adjust their values to optimize the thinking process.
I'm currently working on revision and refinement, and on implementing proper workflow automation.
- Error Analysis: Sequential thinking to understand error causes
- Query Refinement: Sequential thinking to formulate effective search queries
As for fetching from other databases, I'm still working on the items above.
Here are the parameters I’ve used and what they do:
- `maxDepth`: Expands thinking depth; the higher the value, the more complex the reasoning.
- `parallelTasks`: Enables simultaneous processing of thoughts.
- `enableSummarization`: Automatically summarizes lengthy thought chains.
- `thoughtCategorization`: Clusters similar thoughts.
- `progressTracking`: Tracks progress of thought chains.
- `dynamicAdaptation`: Adjusts thinking strategies based on outcomes.
- `contextWindow`: Maximum processing context size (32768 tokens).
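For illustration, values like these could live in the server's entry in your mcp.json. Note that this layout, the environment-variable names, and whether the smithery-ai package actually reads them are assumptions, not confirmed behavior:

```json
{
  "mcpServers": {
    "sequential-thinking": {
      "command": "npx",
      "args": ["-y", "@smithery/cli", "run", "@smithery-ai/server-sequential-thinking"],
      "env": {
        "MAX_DEPTH": "12",
        "PARALLEL_TASKS": "true",
        "ENABLE_SUMMARIZATION": "true",
        "THOUGHT_CATEGORIZATION": "true",
        "PROGRESS_TRACKING": "true",
        "DYNAMIC_ADAPTATION": "true",
        "CONTEXT_WINDOW": "32768"
      }
    }
  }
}
```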
Note: .specstory/ is totally optional.
I am continuing to work on a guide to retain as much as we can and to automate even further, but in a simple way.
Below is a demonstration of using OpenRouter with Sequential Thinking.
Note: I’ve been experimenting with a couple more MCP servers that automate even the browsing with Playwright, and I have added six new MCP servers to my current MCP.json file, and I’m stress testing it.
Current MCP Server Setup:
- Sequential Thinking + OpenRouter form the cognitive core for reasoning
- All-in-One Dev provides system access for direct automation
- Exa Search + Documentation Server provide knowledge access
- GitHub Integration + Fetch Content manage external resources
- Playwright offers web automation capabilities
Based on your latest reply.
I experienced the same issue yesterday when calling the OpenRouter API. It is fixed as of today. I will be pushing the changes to the repository soon with a proper explanation, and I hope that will fix your latest issue with the implementation.
- Within the .cursorrules context, there's been a complete rewrite of the system and its rule enforcement, which I've been testing and stress testing with multiple queries to maintain consistency.
Big Nick!
Thanks for the kind words, Big Nick.
Most services like OpenRouter, Anthropic, etc. have free tiers that are perfect for learning.
API costs are opaque when you're starting out. Here's the reality:
- Free tiers (OpenRouter) cover most learning/experimentation.
- Paid usage typically starts at ~$0.001–$0.01 per 1,000 tokens. For context, a small project might use 10K–50K tokens/month.
Cursor Pro already includes Claude access, so use that for now.
Typical projects can run from $30 to $50 per month, depending on your requests and the features you use with Cursor, plus any other token usage.
If you’re just vibe-coding and learning, costs will likely stay under $5 per month. As projects grow, you’ll intuitively learn where to spend.
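To make the arithmetic above concrete, here is a tiny sketch; the rates are the assumed ranges quoted above, not any provider's actual pricing:

```typescript
// Rough monthly cost estimate from token volume and per-1K-token price.
// The rates used below are illustrative, taken from the ranges discussed above.
function monthlyCost(tokensPerMonth: number, pricePer1kTokens: number): number {
  return (tokensPerMonth / 1000) * pricePer1kTokens;
}

// A small project at 50K tokens/month:
console.log(monthlyCost(50_000, 0.001)); // roughly $0.05 at the low end
console.log(monthlyCost(50_000, 0.01));  // roughly $0.50 at the high end
```

Numbers like these are why free tiers plus Cursor's included models usually cover early experimentation.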
@sneo To address your question and clear up some mental fog there.
We are using OpenRouter AI to amplify what each output call produces. By using a logical, sequential approach, we obtain a more direct and clear result instead of just random matches.
Specialized reasoning: the OpenRouter model used within the MCP server handles complex, multi-step logic, and combined with sequential thinking it is extremely powerful.
The guide uses it for edge cases where its reasoning depth matters, aiming to get as close as possible to the user's idea instead of randomly generated code.
Can you take a screenshot of where the parameters are in the MCP source code? I can't find where they are defined.
I looked at the sequential-thinking repository and inspected the code. I searched for the package installed on my computer, then looked inside the code, and found only these parameter definitions:
```typescript
thought: data.thought,
thoughtNumber: data.thoughtNumber,
totalThoughts: data.totalThoughts,
nextThoughtNeeded: data.nextThoughtNeeded,
isRevision: data.isRevision,
revisesThought: data.revisesThought,
branchFromThought: data.branchFromThought,
branchId: data.branchId,
needsMoreThoughts: data.needsMoreThoughts,
```
When I searched for the `enableSummarization` parameter, I didn't find it.
Many thanks!
I have rewritten and optimized multiple things within the repository; once it is fully implemented, I will push the repository.
To answer your question:
The parameters you're referencing (`thought`, `thoughtNumber`, etc.) are part of the core sequential logic in the MCP server's operational runtime. The `enableSummarization` and other configuration flags you're missing are likely defined in a separate configuration layer (e.g., `mcp.config.js`, environment variables, or the `settings` section of your `mcp.json`).
Here’s why they aren’t visible in the source code you inspected:
- Dynamic Configuration: Parameters like `enableSummarization` are often injected at runtime via environment variables or config files, not hardcoded.
- Modular Architecture: The MCP server separates logic (what you found in the code) from configuration (where these flags reside).
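As a sketch of that dynamic-configuration pattern, here is one way flags can be injected at startup rather than hardcoded. The variable names and defaults are assumptions for illustration, not the actual server's code:

```typescript
// Hypothetical configuration layer: flags arrive from the environment at
// startup, with defaults when unset. Nothing here is hardcoded into the
// sequential-step logic itself.
interface ServerConfig {
  enableSummarization: boolean;
  maxDepth: number;
  contextWindow: number;
}

function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  return {
    enableSummarization: env.ENABLE_SUMMARIZATION === "true",
    maxDepth: Number(env.MAX_DEPTH ?? 12),
    contextWindow: Number(env.CONTEXT_WINDOW ?? 32768),
  };
}

// A real server would call loadConfig(process.env); a literal keeps the
// sketch self-contained:
const config = loadConfig({ ENABLE_SUMMARIZATION: "true", MAX_DEPTH: "8" });
console.log(config); // { enableSummarization: true, maxDepth: 8, contextWindow: 32768 }
```

This separation is why grepping the runtime logic never turns up `enableSummarization`: it only exists in the layer that feeds the config object.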
Knowing this: the parameters you listed (`thought`, `thoughtNumber`, etc.) are correct for tracking sequential state; your observation aligns with the system's design. The missing flags simply exist in a different layer of the architecture.
The parameters you see (`thought`, `thoughtNumber`, etc.) are part of the runtime logic for managing sequential workflows. Other parameters, like `enableSummarization` or `thoughtCategorization`, are not hardcoded in this file because they belong to the MCP server's configuration layer, which is user-defined rather than server-defined.
The MCP works in two layers:
Logic (this code):
- Handles thought validation (`validateThoughtData`), history tracking, and output formatting.
- Defines how sequential steps are processed (e.g., branching, revisions).
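Here is a minimal sketch of what such a validation step can look like, assuming the field names quoted from the package earlier; this is illustrative, not the package's actual implementation:

```typescript
// Hypothetical sketch of thought validation using the field names quoted
// above. The real server's code differs; this only shows the shape of the
// runtime-logic layer.
interface ThoughtData {
  thought: string;
  thoughtNumber: number;
  totalThoughts: number;
  nextThoughtNeeded: boolean;
  isRevision?: boolean;
  revisesThought?: number;
  branchFromThought?: number;
  branchId?: string;
  needsMoreThoughts?: boolean;
}

function validateThoughtData(data: Record<string, unknown>): ThoughtData {
  if (typeof data.thought !== "string" || data.thought.length === 0) {
    throw new Error("Invalid thought: must be a non-empty string");
  }
  if (typeof data.thoughtNumber !== "number" || typeof data.totalThoughts !== "number") {
    throw new Error("thoughtNumber and totalThoughts must be numbers");
  }
  return {
    thought: data.thought,
    thoughtNumber: data.thoughtNumber,
    totalThoughts: data.totalThoughts,
    nextThoughtNeeded: Boolean(data.nextThoughtNeeded),
    isRevision: data.isRevision as boolean | undefined,
    revisesThought: data.revisesThought as number | undefined,
    branchFromThought: data.branchFromThought as number | undefined,
    branchId: data.branchId as string | undefined,
    needsMoreThoughts: data.needsMoreThoughts as boolean | undefined,
  };
}

const step = validateThoughtData({
  thought: "analyze the bug",
  thoughtNumber: 1,
  totalThoughts: 5,
  nextThoughtNeeded: true,
});
console.log(step.thought); // "analyze the bug"
```

Note that nothing in this layer mentions `enableSummarization` or similar flags, which matches what was found when grepping the installed package.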
Parameters like `enableSummarization` or `contextWindow` are not hardcoded here because they control server behavior, not individual workflows. Since these come from a user-made custom configuration, the MCP server does not necessarily have to honor them, as far as I know.
When adding parameters to our customized layer (our mcp.json), they must be defined in the correct layer: those parameters belong to configuration, not runtime logic.
Configuration Parameters
- Parameters like `enableSummarization` or `contextWindow` are not hardcoded here because they control server behavior, not individual workflows. This is the main reason we use the mcp.json file.
- `parallelTasks`
- `progressTracking`
- `dynamicAdaptation`
- `maxDepth`
- `enableSummarization`
This new MCP for reasoning looks very promising.
It's based on the one I was using before, but adds transformer and hybrid reasoning.
Probably overly complicated to incorporate into this, but it could be an idea for a similar rules setup (e.g., offloading reasoning to OpenRouter).
MCTS Reasoning
- Use the `/reason-mcts` command followed by your query to start an MCTS-based reasoning chain.

Beam Search Reasoning
- Use the `/reason-beam` command for beam-search-based reasoning.

R1 Transformer Reasoning
- Use the `/reason-r1` command for single-step Transformer-based reasoning.

Hybrid Reasoning
- Use the `/reason-hybrid` command to combine Transformer and MCTS reasoning.
I tried the MCP server mentioned at the beginning. It reasons, but when it comes to executing the plan and actually writing the code, I couldn't make it do it.
I think we need some video tutorials on this; it's too complicated, and we need to see real-world proof of how it works when building an application. But many thanks for all your efforts. I will try it again later with your new findings, but for now this MCP stuff hasn't really improved any of my code.
- `maxDepth`
- `parallelTasks`
- `enableSummarization`
- `thoughtCategorization`
- `progressTracking`
- `dynamicAdaptation`
- `contextWindow`
According to your explanation, I understand that these parameters can indeed be passed to the MCP service through the config.
However, the functionality corresponding to these parameters has not yet been implemented. Are you planning to implement them later in the sequential thinking MCP service?
Is that the correct understanding?
thank you!
Currently testing it; thanks for the approach. It's working flawlessly alongside the other two MCP servers.
I have been planning to host it on AWS so it runs in the cloud instead of locally, and to implement that later into the MCP, as I will be forking and improving the original thinking server into a more structured approach.
Does that mean your MCP server will cost money?
Do you plan to publish the server side for local running?
Thanks for the share, I am going to take a deep look into this and see how we can approach the MCP properly.
Since it's a small project that I've been running with different MCPs, something is being worked out in the background. Meanwhile, this guide will continue to evolve over time, including the rules for proper automation. It's all about testing and failing over and over until the desired outcome is achieved, especially when it's aimed at a community where everyone benefits, not just myself.
Hello, klesor.
I am a beginner programmer who loves to tinker, and I want to develop some small projects through cursor.
I think I have already invested at least 10 hours in reading this article and doing the follow-up configuration, and some MCP services are now embedded in my Cursor workflow, but it is currently quite chaotic.
I hope you can sort it out from your perspective, from top to bottom; I am very much looking forward to that!
If possible, I'd love to be a guinea pig and quickly join this game!
Hey there, shawn.
Here are some tips to avoid these chaotic issues.
- Establish a Baseline with One-Shot Prompts
- Begin with a one-shot prompt to define clear context and expectations upfront. This mimics human collaboration and prevents Claude from making unwarranted assumptions by anchoring the interaction in a shared understanding.
ONE-SHOT WORKS PERFECTLY WITH CLAUDE
- Implement Step-by-Step Validation
- Break tasks into smaller steps and validate outputs iteratively. This avoids overly complex solutions and ensures alignment with your goals at each stage, reducing the risk of miscommunication.
- Combine Zero-Shot + Token Mitigation
- Use zero-shot prompting for simplicity where possible, paired with strict token limits to minimize hallucinations. This balances creativity with precision.
Prompting Guide: for learning more about prompting and keeping the AI from running wild on your project.
In addition:
@atalas V3.1 is out. Thanks for your help; your contributions have been tremendously helpful. (I added your username to the new push, by the way.)
Great to know!
I think it could have a lot of potential.
Flash 2.0 Experimental and Flash 2.0 Thinking Experimental are both free, have 1M context, and ~200t/s throughput so could be great choices for rapid intensive reasoning (MCTS etc.)
The non-experimental Flash 2.0 is also very cheap and #2 behind Sonnet 3.7 for programming on OpenRouter (so must be good.)
Btw, now that 0.48 has added ‘Custom Modes’ it could be possible to replicate a variation of this system using those.
1. Pushed a PR
- According to the description, the tools mentioned in 008.mdc* are integrated in this PR.
- Additionally, two tools have been added: jina_reader.py and jina_search.py.
- Based on your previous MCP file, an mcp.json.example has been added.
2. Possibility of Integrating and Adding a Git Server
One of the challenges I'm currently facing with coding is how to commit in real time. Here's a link related to the topic:
3. Understanding the KleoSr Code Workflow with the Kleo Matrix Protocol
As mentioned in your GitHub code, the Kleo Matrix Protocol is a structured workflow specifically designed for AI-assisted development. It aims to improve code quality, enhance development efficiency, and ensure project integrity. The protocol divides the software development process into five distinct phases, each with clear objectives and outputs to ensure transparency and traceability throughout.
Core Concept 
The core idea of the Kleo Matrix Protocol is to break down complex development processes into ordered phases, each with specific goals and activities. This structured approach helps prevent common development pitfalls such as unclear requirements, improper architectural design, or inconsistent code implementation.
```mermaid
graph TD
    A[ANALYZE 📊] -->|Understand Requirements| B[CONCEPTUALIZE 💡]
    B -->|Design Architecture| C[BLUEPRINT 📘]
    C -->|Detail Design| D[CONSTRUCT 🏗️]
    D -->|Implement Code| E[VALIDATE ✔️]
    E -->|Verify Results| A
    style A fill:#f9d5e5,stroke:#333,stroke-width:2px
    style B fill:#eeac99,stroke:#333,stroke-width:2px
    style C fill:#e06377,stroke:#333,stroke-width:2px
    style D fill:#c83349,stroke:#333,stroke-width:2px
    style E fill:#5b9aa0,stroke:#333,stroke-width:2px
```
I think the process is fascinating! Whether this whole process could also form an MCP is an interesting question, as it requires users to think sequentially by entering specific commands. Moreover, these instructions could potentially be simplified, as I agree they seem a bit lengthy. Alternatively, you might consider formatting them in .mdc or integrating them into the Cursor notepad, which makes it easy to use @notepad for execution.
- "INITIATE ANALYZE PHASE"
- "INITIATE CONCEPTUALIZE PHASE"
- "INITIATE BLUEPRINT PHASE"
- "INITIATE CONSTRUCT PHASE"
- "INITIATE VALIDATE PHASE"
Thanks for the detailed insights—really appreciate it. I’ve been implementing the latest changelog and testing the new version released from Cursor sitting on 0.48, which has optimized the rules significantly. @atalas is actively collaborating on GitHub with his latest pull and we’re working towards refining the system further.
I've also considered the non-experimental Flash 2.0 for programming-heavy workflows, along with the quality of Claude's output in general. I've been trying various prompting techniques, and so far one-shot prompting works remarkably well without breaking the system or hallucinating. There's a lot being done in the background that will end up being worth it.
Besides that the addition of ‘Custom Modes’ in 0.48 opens up some exciting possibilities. I’m drafting a variation of the system to leverage those capabilities—it could streamline the entire workflow even further.
Will keep this thread posted as we progress.
@atalas Regarding Python in the environment, Jina Reader, and Jina Search: I've been experimenting with running them in a local setup rather than server-based, as I've been doing some implementation on AWS as well. This approach helps maintain chat memory consistency and reduces bias drift while keeping the planning phase more structured and clean. It's still early, but the results look promising.
The sequential thinking framework could itself evolve into an MCP, especially as we refine the command structure and make it more intuitive for users in the community. I'll experiment with formatting the commands in .mdc or integrating them directly into Cursor's notepad via @notepad, and see how that impacts usability.
I'll start drafting a prototype for this Git-MCP integration and share updates.
Let me know if you have specific use cases.
Note: I've been training my own local model to give me proper prompt usage with Cursor that integrates nicely with Claude 3.5 and 3.7 Sonnet, producing output from one-shot prompting. I've crawled 75+ pages to index every detail and every advantage/disadvantage of using the right prompting techniques. Please refer to the following documentation for more information about how I've trained my agent.
kleosr_prompting004.pdf (218.4 KB)
Great thread; I'll follow updates, as this could be a great framework, and some suggestions are already useful. On prompting, you'll really like these papers:
(SC-DIR) [2402.03667] Large Language Models as an Indirect Reasoner: Contrapositive and Contradiction for Automated Reasoning
(TR + weighted voting > CoT, ToT) [2412.19707] Toward Adaptive Reasoning in Large Language Models with Thought Rollback
(5 personas > CoT) [2502.15725] Town Hall Debate Prompting: Enhancing Logical Reasoning in LLMs through Multi-Persona Interaction
I'm particularly interested in Personas, which shows improvements of around +10%. That makes sense, as specific expertise narrows the context and helps models better match user requirements. Unfortunately, frameworks like CrewAI or npcsh don't currently offer an MCP.
Amazing work! thank you