Text formats are best understood by LLMs

AndonMitev · February 15, 2025, 9:41am

Which text formats are best understood by LLMs—Markdown, XML, or others? Also, are there any tools that can scrape content from a website or documentation and convert it into the optimal format for LLM processing?

danperks · February 25, 2025, 11:34pm

For scraping docs in Cursor, you’ll want to check out the @Docs command (Cursor – @Docs). It’s built to crawl and index documentation sites automatically.

As for text formats: while Cursor works with various formats, we don’t make specific recommendations about which formats are “best” for LLMs, since that’s more of a general ML question than something specific to Cursor!
