Can Cursor be taught/made aware of an API manual (docs)?

So let's say I scrape an API's manual documentation and put it into a .txt file. Will Cursor know about this, and can it utilize it?

When I say manual, I mean a written-out guidebook for a library I use. The .txt comes from scraping a well-documented web page that provides samples and explanations of the library.

Thanks in advance!

Use the grep search tool and ask it to search for 'inserttermhere' in the directory your .txt documentation is in, but be sure the files are non-binary, plain-encoded .txt files. If you need any more assistance with it, let me know; API documentation is exceedingly helpful for building a specialized assistant, so I understand the struggle of figuring this out.
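
For example, a project rule along these lines (the docs_txt/ folder name is just a placeholder for wherever your scraped files live) points the agent at the right place:

When asked about the library, first run a grep search for the relevant class or function name inside docs_txt/ and read the matching files before answering.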

Future recommendation: web scraper → links for all the docs → Python script to download each page and save it as a PDF → PyPDF2 to convert the PDFs to pure .txt output = a full library of the documentation, and a workflow to keep all future documentation available locally.
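
As a sketch of the final step of that pipeline (the folder names are placeholders, and it assumes the pages have already been saved as PDFs), PyPDF2 can do the text extraction:

from pathlib import Path
from PyPDF2 import PdfReader  # pip install PyPDF2

PDF_DIR = Path("docs_pdf")  # where the downloaded pages were saved (placeholder)
TXT_DIR = Path("docs_txt")  # plain .txt output for the assistant to grep
TXT_DIR.mkdir(exist_ok=True)

for pdf_path in PDF_DIR.glob("*.pdf"):
    reader = PdfReader(pdf_path)
    # extract_text() can return None on image-only pages, hence the "or ''"
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    (TXT_DIR / (pdf_path.stem + ".txt")).write_text(text, encoding="utf-8")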

Alternatively, if the API is a library (numpy, for example), you can pull help straight out of its built-in documentation:
python -c "import numpy as np; help(np.unravel_index)" > unravel_help.txt
(Replace np.unravel_index with whatever function, attribute, or method you're looking for, and change the output .txt filename so you can locate and read it afterwards.) You then read the .txt file to view the help results; giving this as a helper command or rule to your agentic model will help it locate docs and debug.
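
To build a whole local reference in one pass instead of one symbol at a time, a sketch like this (the output folder name is just a placeholder) writes the same help text for every public numpy name:

import pydoc
from pathlib import Path

import numpy as np

OUT = Path("numpy_help")  # placeholder output folder
OUT.mkdir(exist_ok=True)

for name in dir(np):
    if name.startswith("_"):
        continue  # skip private names
    try:
        text = pydoc.render_doc(getattr(np, name))  # same text help() prints
    except Exception:
        continue  # some attributes refuse to resolve or render; skip them
    (OUT / (name + ".txt")).write_text(text, encoding="utf-8")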

So, the manual on the website I'm scraping has content that is repeated on every web page. Since every class and namespace has its own page, about 2/3 of each page is website navigation clutter that needs to be ignored.

This is why I had to write a Python script that targets only the valuable information on each page. In other words, a one-size-fits-all Python scraper for library documentation would not be a good idea.
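
As an illustration of that per-site targeting (the <main> selector and the clutter list are assumptions that have to be adjusted for each site), a requests + BeautifulSoup extractor might look like:

import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def extract_doc_text(url: str) -> str:
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    # Drop the navigation clutter that repeats on every page.
    for tag in soup.select("nav, header, footer, aside, script, style"):
        tag.decompose()
    # Assumption: the real content lives in <main>; adjust per site.
    main = soup.find("main") or soup.body or soup
    return main.get_text("\n", strip=True)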

Now, for a more network-based solution, perhaps you could write a Python script that simply crawls your library's online documentation and builds out a mapping of URLs to pages and of the links within those pages, etc. Then your AI can use that mapping to guess where to look online for the information it needs.
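
A sketch of such a crawler (the start URL and page limit are placeholders): it walks the docs domain breadth-first and records a URL → page-title map the AI can consult:

from collections import deque
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def crawl_doc_map(start_url: str, limit: int = 200) -> dict[str, str]:
    # Breadth-first walk of the docs site, staying on its domain.
    root = urlparse(start_url).netloc
    seen, queue, mapping = {start_url}, deque([start_url]), {}
    while queue and len(mapping) < limit:
        url = queue.popleft()
        try:
            soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
        except requests.RequestException:
            continue  # skip pages that fail to load
        title = soup.title.string if soup.title and soup.title.string else url
        mapping[url] = title.strip()
        for a in soup.find_all("a", href=True):
            nxt = urljoin(url, a["href"]).split("#")[0]
            if urlparse(nxt).netloc == root and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return mapping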

Thanks for the response. I'm currently investigating Cursor as an additional tool for my workflow. The Cursor IDE itself doesn't have all the tools I need as a game programmer, so it certainly won't be replacing Rider for me, but I'm hoping it can replace ChatGPT and its Projects feature. Currently, whenever there are enough changes, I have to rerun all of my scraping, regenerate the project data, and upload those text files to my GPT project. It works fine, but Cursor feels closer to having a personalized, local assistant.

I am busy with a similar task as well: how to properly guide an LLM to improve correctness and use existing patterns for solutions instead of reinventing the wheel. It looks like a common topic in AI assistance 🙂
Out of the box (OOTB), Cursor should be able to parse additional documentation from websites. It worked perfectly and visibly until the latest updates; now it does not, at least for me.
You can store all the text from the documentation in .txt files in a separate folder and write a global or project-specific rule explaining to Cursor how to read those files. This also works, or at least it tries its best, but still not with 100% correctness.
The state of the art in improving the correctness of LLMs is MCP. But you need structured data as a backbone. If you store your data in the form of a knowledge graph and provide functions to run queries on it, you can wrap it in your own MCP server and have Cursor connect to it during requests.
Theoretically.
Practically, nothing works OOTB, and one needs to create everything from 'Hello, world!' 🙁
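
To make that 'Hello, world!' concrete, here is a minimal sketch of such a server using the official mcp Python SDK's FastMCP helper; the docs folder and the search tool are my assumptions (a flat-file lookup standing in for a real knowledge-graph query), not an existing API:

from pathlib import Path

from mcp.server.fastmcp import FastMCP  # pip install mcp

DOCS = Path("docs_txt")  # placeholder folder of scraped .txt documentation
mcp = FastMCP("library-docs")

@mcp.tool()
def search_docs(term: str) -> str:
    """Return documentation lines that mention `term`."""
    hits = []
    for path in DOCS.glob("*.txt"):
        for line in path.read_text(encoding="utf-8", errors="ignore").splitlines():
            if term.lower() in line.lower():
                hits.append(path.name + ": " + line.strip())
    return "\n".join(hits[:50]) or "no matches"

if __name__ == "__main__":
    mcp.run()  # stdio transport; register the script in Cursor's MCP settings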
