Jack1
October 3, 2024, 1:01pm
Can't seem to index a single doc site
I tried just adding the docs url
Tried the html sitemap method
for example, in case of example.com/xyz/abc
use only example.com as prefix
the HTML-XML-sitemap of example.com as the Entrypoint
is that what you did?
by setting https://langfuse.com as prefix and https://www.xml-sitemaps.com/download/langfuse.com-3db9835b9/sitemap.html?view=1 as entrypoint
Tried all the variations I could think of, with appending slashes and whatnot.
Also tried on the following docs as a test:
It always just processes for 1 sec then says 0 pages indexed.
Nothing in any of the Output tabs points to a problem, probably because the indexing is done server side and no trace comes back to us.
Is anyone able to index anything right now?
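For anyone double-checking their prefix and entrypoint values, here is a minimal sketch of how a prefix filter might treat the links found on an entrypoint page. It is not Cursor's actual crawler: the assumption is simply that only links starting with the prefix are kept. The `sample_html` snippet and the `links_under_prefix` helper are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href targets from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links are resolved against the entrypoint URL.
                self.links.append(urljoin(self.base_url, value))


def links_under_prefix(html, entrypoint, prefix):
    """Hypothetical helper: keep only links that start with the prefix."""
    collector = LinkCollector(entrypoint)
    collector.feed(html)
    return [link for link in collector.links if link.startswith(prefix)]


# Made-up sitemap content, only to show the filtering behaviour.
sample_html = """
<a href="https://langfuse.com/docs">Docs</a>
<a href="https://www.langfuse.com/docs/sdk">SDK</a>
<a href="/download/other.html">Other</a>
"""
entrypoint = "https://www.xml-sitemaps.com/download/langfuse.com-3db9835b9/sitemap.html?view=1"
matches = links_under_prefix(sample_html, entrypoint, "https://langfuse.com")
print(matches)
```

Under this assumption, a `www.` link or a relative link on the sitemap host would not match a `https://langfuse.com` prefix, so it is worth checking which exact form the links in the sitemap use.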
I have also experienced document crawling not working on sites that it previously worked for and documented it here:
Hi @Blorf299 ,
Here are the values I used at:
Cursor Settings > Features > Docs > + Add new doc
when using 0.41.3:
// initial input field
https://getuikit.com/docs/introduction
// prefix
https://getuikit.com/docs
// entrypoint
https://getuikit.com/docs/introduction
I am now also getting:
Indexed 0 pages
There are no logs present in Output > Window.
So seeing how these settings worked when I last tried it, I would say there is possibly a bug.
Below are screenshots of what I attempte…
Currently at 0.41.3.
Same issue here with 0.41.3. I was previously able to index https://raw.githubusercontent.com/knowsuchagency/promptic/refs/heads/main/README.md but no longer. The indexing simply doesn’t work at all. I copied and pasted the docs to pastebin in case the issue was accessing GitHub, but it also fails to fetch the page and index it.
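To rule out the page itself being unreachable, a quick check from your own machine can help. This is only a rough sketch using the Python standard library. It says nothing about what Cursor's server-side scraper sees, and the `fetch_summary` name and the User-Agent string are made up for illustration.

```python
from urllib.request import Request, urlopen


def fetch_summary(url, timeout=10):
    """Fetch a URL and report whether it loaded, its content type, and its size."""
    try:
        # Hypothetical User-Agent; some hosts reject requests without one.
        request = Request(url, headers={"User-Agent": "docs-check-sketch/0.1"})
        with urlopen(request, timeout=timeout) as response:
            body = response.read()
            return {
                "ok": True,
                "content_type": response.headers.get("Content-Type"),
                "bytes": len(body),
            }
    except (OSError, ValueError) as exc:
        # URLError, HTTPError and timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return {"ok": False, "error": str(exc)}
```

Calling `fetch_summary` with the README or pastebin URL shows whether the page loads locally. If it does and indexing still reports 0 pages, the problem is more likely on the indexing side than with the page.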
still broken, can’t index pygad.readthedocs.io …
@Butanium - While it does seem like Cursor is having issues getting an up-to-date version of those docs, it looks like there is a cached version already saved, so you should be able to use that for now! I’ve reported this one so it can be looked into - thanks for letting us know.
@shahsarick Hey, what documents are you trying to index?
hey thx, where can I find this saved version?
Hey, just adding the URL you sent should work to at least load the cache, but I was seeing "Update failed" in the UI when the scraper attempted to get a more up-to-date copy.