How do I index documentation that doesn't want to index?

Hi all,

I regularly run into this problem when trying to add documentation such as:

It fails to index any pages (bar one). What is the workaround?

Thanks :folded_hands:

Report it as a bug.

Hey, try adding the URL like this:

https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/

Hey there. A simple fix I use is to run one of the many site-crawler-to-markdown projects to crawl the site into a clean directory of markdown files. Then I use repomix to pack the entire directory into a single file. Finally, I upload that file somewhere (e.g. a GitHub gist or project) and point the indexer at that single file's URL.
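If it helps to see the "pack a directory into one file" step, here is a rough Python sketch of the idea. It is a plain stand-in, not repomix itself and not its output format; the function name `combine_markdown` and the `<!-- source: ... -->` separator are my own assumptions for illustration.

```python
from pathlib import Path


def combine_markdown(src_dir: str, out_file: str) -> int:
    """Concatenate every .md file under src_dir into out_file.

    Hypothetical helper for illustration only (not part of repomix).
    Returns the number of files that were combined.
    """
    root = Path(src_dir)
    out_path = Path(out_file).resolve()
    # Sort for a stable order between runs, and skip the output file
    # itself in case it lives inside src_dir from an earlier run.
    files = sorted(p for p in root.rglob("*.md") if p.resolve() != out_path)
    parts = []
    for path in files:
        rel = path.relative_to(root).as_posix()
        body = path.read_text(encoding="utf-8").strip()
        # Assumed separator: an HTML comment naming the source file.
        parts.append(f"<!-- source: {rel} -->\n\n{body}\n")
    out_path.write_text("\n".join(parts), encoding="utf-8")
    return len(files)
```

Calling something like `combine_markdown("crawled_docs", "docs_combined.md")` (both paths are placeholders) would leave one file to upload and hand to the indexer.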

Another approach, which a colleague of mine uses, is to run his own indexer locally that crawls/scrapes a site and hosts the resulting markdown locally. He can call it from an MCP, and it returns a local URL to pass to the indexer.
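I don't know what his setup actually looks like, but the "host the markdown locally and get a URL back" part can be sketched with Python's standard library alone. The helper name `serve_directory` is made up for this example, and the crawling and MCP wiring are left out entirely. Also note that a 127.0.0.1 URL is only reachable from your own machine, so this only helps if the indexer runs locally too.

```python
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory: str, port: int = 0):
    """Serve `directory` over HTTP on localhost in a background thread.

    Hypothetical helper for illustration. port=0 asks the OS for any
    free port. Returns (server, base_url); call server.shutdown() and
    server.server_close() when finished.
    """
    # Bind the handler to the chosen directory instead of the current one.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Daemon thread so the server does not keep the process alive.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, bound_port = server.server_address[:2]
    return server, f"http://{host}:{bound_port}/"
```

After `server, url = serve_directory("crawled_docs")` (placeholder path), a file such as `docs_combined.md` in that directory would be available at `url + "docs_combined.md"`.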