Bugbot has quietly become one of the most important “teammates” inside Cursor. This post is an inside look at how it went from a scrappy experiment to a system that now reviews over two million PRs every month.
What Bugbot actually does
- Bugbot is a code review agent that scans your PRs for logic bugs, performance issues, and security problems before they hit production.
- It runs on both customer repos and all internal Cursor code, so every change ships through the same guardrails users get.
From rough prototype to real product
- Early on, models weren’t strong enough to produce reviews engineers trusted, so the team ran a lot of internal tests and kept tweaking models, prompts, and pipelines until the comments earned that trust.
- One surprisingly effective trick was running multiple bug-finding passes in parallel with shuffled diffs, then using “majority voting” to keep only issues that showed up across several passes.
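
To make the majority-voting idea concrete, here is a minimal sketch. It is not Bugbot's actual implementation: `find_bugs` is a hypothetical stand-in for one model pass, findings are assumed to be comparable strings, and the pass count and vote threshold are made-up defaults.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Set


def shuffled(hunks: List[str], seed: int) -> List[str]:
    """Return a copy of the diff hunks in a seed-dependent order."""
    out = list(hunks)
    random.Random(seed).shuffle(out)
    return out


def majority_vote(findings_per_pass: Iterable[Iterable[str]], min_votes: int) -> Set[str]:
    """Keep only findings reported by at least `min_votes` passes."""
    votes: Counter = Counter()
    for findings in findings_per_pass:
        votes.update(set(findings))  # each pass votes at most once per finding
    return {finding for finding, count in votes.items() if count >= min_votes}


def review(
    hunks: List[str],
    find_bugs: Callable[[List[str]], List[str]],  # hypothetical: one bug-finding pass
    passes: int = 5,
    min_votes: int = 3,
) -> Set[str]:
    """Run several passes in parallel, each on a differently shuffled diff."""
    with ThreadPoolExecutor(max_workers=passes) as pool:
        results = list(pool.map(lambda seed: find_bugs(shuffled(hunks, seed)), range(passes)))
    return majority_vote(results, min_votes)
```

The point of the threshold is that a finding which appears in only one ordering of the diff is more likely to be noise, while one that survives several orderings is more likely to be real.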
Making it fast, reliable, and customizable
- To work at PR scale, the team rebuilt the Git integration in Rust, cut down how much repo data is fetched, and added rate-limit monitoring and request batching to stay within GitHub’s limits.
- Teams wanted checks that understood their own invariants, so Bugbot rules were added to let them flag things like unsafe migrations or forbidden internal API usage without hardcoding those checks into the system.
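
For the rate-limit and batching part, a rough sketch of the idea might look like this. It only assumes GitHub's documented `x-ratelimit-remaining` and `x-ratelimit-reset` response headers; the helper names, the lowercase-key dict, and the `floor` threshold are illustrative assumptions, not how Bugbot does it.

```python
from typing import Iterator, Mapping, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def seconds_to_wait(headers: Mapping[str, str], now: float, floor: int = 50) -> float:
    """Return how long to pause before the next request.

    Expects lowercase header names. `floor` is a hypothetical safety margin:
    once the remaining budget drops to it, wait until the reset time
    (given by GitHub as UTC epoch seconds).
    """
    remaining = int(headers.get("x-ratelimit-remaining", floor + 1))
    reset_at = float(headers.get("x-ratelimit-reset", now))
    if remaining > floor:
        return 0.0
    return max(0.0, reset_at - now)
```

A caller would send one request per batch, pass the response headers and `time.time()` to `seconds_to_wait`, and sleep for the returned duration before sending the next batch.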
Measuring real impact, not vibes
- The team introduced a resolution rate metric, computed by an AI-powered judge that checks, at merge time, which Bugbot-reported issues actually got fixed in the final diff.
- Using this, plus an internal benchmark (BugBench), they ran around 40 major experiments and pushed resolution rates from ~52% to over 70%, while more than doubling resolved bugs per PR.
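
The arithmetic behind those two numbers can be sketched as follows, assuming the judge's output is reduced to one resolved/unresolved verdict per reported issue. The `Verdict` record and both function names are hypothetical, and the real metric may be defined differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Verdict:
    """One judge decision for one reported issue (illustrative shape)."""
    pr_id: str
    issue_id: str
    resolved: bool  # True if the final merged diff fixed the issue


def resolution_rate(verdicts: Sequence[Verdict]) -> float:
    """Fraction of reported issues that were fixed by merge time."""
    if not verdicts:
        return 0.0
    return sum(v.resolved for v in verdicts) / len(verdicts)


def resolved_per_pr(verdicts: Sequence[Verdict], total_prs: int) -> float:
    """Average number of fixed issues per reviewed PR, including PRs with no reports."""
    if total_prs < 1:
        raise ValueError("total_prs must be at least 1")
    return sum(v.resolved for v in verdicts) / total_prs
```

Tracking both matters: a reviewer that comments less can raise its resolution rate while finding fewer bugs, so the per-PR count guards against improving one number at the expense of the other.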
Why the agentic redesign mattered
- The biggest leap came when Bugbot shifted from a fixed pipeline to a fully agentic loop where it can reason over the diff, call tools, and decide where to dig deeper.
- That flipped the prompting strategy: instead of restraining the model to limit false positives, the team now aggressively encourages it to investigate anything suspicious, then relies on tools, validation, and dynamic context to keep quality high.
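
The difference from a fixed pipeline is easier to see in code. Below is a toy sketch of an agentic loop, with heavy assumptions: `decide` stands in for the model, the action format and tool names are invented for illustration, and none of this reflects Bugbot's real interfaces.

```python
from typing import Callable, Dict, List, Tuple

# An action is (kind, payload): a tool name with its argument,
# ("report", finding), or ("done", "").
Action = Tuple[str, str]


def agent_loop(
    diff: str,
    decide: Callable[[List[str]], Action],   # hypothetical model call
    tools: Dict[str, Callable[[str], str]],  # hypothetical tool registry
    max_steps: int = 8,
) -> List[str]:
    """Let the model choose each next step until it finishes or runs out of steps."""
    context: List[str] = [diff]
    findings: List[str] = []
    for _ in range(max_steps):
        kind, payload = decide(context)
        if kind == "done":
            break
        if kind == "report":
            findings.append(payload)
            context.append(f"reported: {payload}")
        elif kind in tools:
            # Tool output becomes new context the model can reason over.
            context.append(tools[kind](payload))
        else:
            context.append(f"unknown tool: {kind}")
    return findings
```

In a fixed pipeline the sequence of steps is decided ahead of time; here the model picks the next action based on everything it has seen so far, which is what lets it dig deeper where something looks suspicious. The step cap is one simple guard against runaway investigation.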
Where this is heading next
- Bugbot is now evolving beyond “find issues” toward “find and fix”: Bugbot Autofix (in beta) spins up a Cloud Agent to automatically patch issues it finds.
- Upcoming work includes letting Bugbot run code to verify its own reports, do deeper research on complex issues, and even run in an always-on mode that continually scans your codebase instead of waiting for PRs.
Full blog post at https://cursor.com/blog/building-bugbot