How Much Does Custom Tree-sitter Grammar Impact Cursor Indexing?

SemperFidelis0510 · April 17, 2025, 8:58am

Hey all,

I’m working with a custom programming language inside Cursor, and I’ve got a setup that includes:

- A working Tree-sitter grammar, integrated as an extension (syntax highlighting works)
- A language server (LSP) using glspc, also based on the same Tree-sitter grammar

From what I’ve seen in the "Codebase indexing VS chat with codebase" thread, Cursor uses Tree-sitter to split code into syntactically meaningful chunks and computes embeddings from those chunks for its indexing system (used in @codebase, completions, explanations, etc.).
However, I’m not sure whether this uses the Tree-sitter grammar provided by a custom extension or an internal Tree-sitter parser that is general-purpose across languages.
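
As a rough illustration of what "syntactically meaningful chunks" could mean, the sketch below walks a syntax tree and keeps a node as one chunk when it fits a size budget, descending into its children otherwise. This is a toy model, not Cursor's actual indexer: the node shape (`type`, `start`, `end`, `children`), the `chunkBySize` helper, and the `maxChars` budget are all assumptions made up for the example.

```javascript
// Toy model of syntax-aware chunking. Hypothetical: not Cursor's real code.
// A node is { type, start, end, children }, with offsets into the source text.
function chunkBySize(node, source, maxChars, out = []) {
  const length = node.end - node.start;
  // Small enough (or nothing to descend into): keep the node as one chunk.
  if (length <= maxChars || node.children.length === 0) {
    out.push({ type: node.type, text: source.slice(node.start, node.end) });
    return out;
  }
  // Too large: descend so each chunk still ends on a node boundary.
  for (const child of node.children) {
    chunkBySize(child, source, maxChars, out);
  }
  return out;
}

// Hand-built tree for a two-function file (a real parser would produce this).
const source = "fn a() {}\nfn b() {}";
const tree = {
  type: "source_file", start: 0, end: source.length,
  children: [
    { type: "function_definition", start: 0, end: 9, children: [] },
    { type: "function_definition", start: 10, end: 19, children: [] },
  ],
};

const chunks = chunkBySize(tree, source, 12);
console.log(chunks.map((c) => c.text));
```

In a scheme like this, a grammar that produces deeper trees gives the chunker more boundaries to choose from, which is part of what the questions below are trying to pin down.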

So I’d love some clarification on how much the quality and structure of my Tree-sitter grammar actually affects Cursor’s indexing.

Here’s what I’m trying to figure out:

1. How important is the level of structural detail in the grammar?

- If my grammar produces deeper and more specific trees (vs. shallow or generic rules), does that give Cursor more semantic precision?
- Do finer-grained distinctions between constructs help the indexer better understand the codebase?

2. How important are the actual node names?

- Are there specific node names Cursor expects or prioritizes (e.g., function_definition)?
- Or is it mostly pattern-based or positional?
For example, will Cursor index better if I define nodes like:
- function_definition instead of unit
- doc_comment or comment instead of comm
- import_statement instead of macro
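
To make the naming question concrete, here is a hypothetical sketch of an indexer that recognizes nodes by type name. Nothing here is confirmed Cursor behavior; `BOUNDARY_TYPES`, `collectBoundaries`, and the node shape are invented for illustration. If an indexer did work this way, a grammar that calls its function rule `unit` would yield no function-level matches, while `function_definition` would.

```javascript
// Hypothetical name-driven indexer. Assumption only: not Cursor's real logic.
const BOUNDARY_TYPES = new Set([
  "function_definition",
  "doc_comment",
  "comment",
  "import_statement",
]);

// Collect the types of all nodes whose name the (imaginary) indexer recognizes.
function collectBoundaries(node, found = []) {
  if (BOUNDARY_TYPES.has(node.type)) found.push(node.type);
  for (const child of node.children ?? []) collectBoundaries(child, found);
  return found;
}

// The same program, as two different grammars might label it.
const descriptiveTree = {
  type: "source_file",
  children: [
    { type: "import_statement" },
    { type: "doc_comment" },
    { type: "function_definition" },
  ],
};
const terseTree = {
  type: "source_file",
  children: [{ type: "macro" }, { type: "comm" }, { type: "unit" }],
};

console.log(collectBoundaries(descriptiveTree)); // three recognized nodes
console.log(collectBoundaries(terseTree)); // none recognized
```

A pattern-based or positional indexer would not care about these names at all, which is why the answer matters for how grammar.js should be written.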

I’m also assuming — and would like to confirm — that Cursor uses the internal VSCode LSP framework under the hood, which (as far as I know) may rely on Tree-sitter for tokenization and syntax parsing. Is that correct?

Would really appreciate any insight from the team or anyone who’s worked with custom languages in Cursor. Just trying to understand how much control I have by refining the grammar.js file.

Thanks a lot!


