Bugbot doesn't catch all issues on first pass - multiple review cycles needed

Where does the bug appear (feature/product)?

BugBot

Describe the Bug

When BugBot analyzes a PR, it sometimes doesn’t catch all issues in the first pass. This results in multiple review cycles where new comments appear on the same files or other files for bugs that were present in the original code, not just in recent changes.

This creates a frustrating workflow where I have to push changes multiple times to address issues that should have been caught in the initial analysis.

Steps to Reproduce

  1. Create a PR with a file that has multiple bugs/issues
  2. Wait for BugBot to analyze the PR
  3. BugBot comments on one issue
  4. Fix and push the changes
  5. BugBot then comments on another issue in the same file or a different file that was present in the original code
  6. Repeat steps 4-5 multiple times

Expected Behavior

BugBot should analyze all files in the PR comprehensively on the first pass and provide all comments for existing issues at once, rather than discovering them incrementally across multiple pushes.

Possible configuration: If multiple bugs are found in one file, send all comments in the first analysis rather than spacing them out across subsequent reviews.

Operating System

macOS

Version Information

N/A - This is a BugBot feature request/issue

Does this stop you from using Cursor

No - Cursor works, but with this issue

Hey, thanks for the report. This is a known issue. We’ve already seen cases where BugBot only finds some issues in a single run instead of flagging everything at once.

A couple things that can help for now:

  • You can manually trigger a re-check by leaving a cursor review comment on the PR. This can surface more findings without new commits.
  • You can turn off the “Run only once per PR” setting in your personal BugBot dashboard settings, so future pushes keep triggering checks.
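If you want to script the first workaround instead of typing the comment by hand, here is a minimal sketch. It is not an official Cursor integration: it assumes the trigger comment body is literally `cursor review`, and it uses GitHub's REST endpoint for issue comments (pull requests share that endpoint). The function name, owner, repo, PR number, and token are all placeholders for illustration.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_review_comment_request(owner, repo, pr_number, token, body="cursor review"):
    """Build (but do not send) a request that posts a comment on a PR.

    Assumption: a plain PR comment with the text "cursor review" is what
    triggers a re-check. Adjust `body` if your setup uses a different phrase.
    """
    # PR conversation comments go through the issue comments endpoint.
    url = f"{GITHUB_API}/repos/{owner}/{repo}/issues/{pr_number}/comments"
    payload = json.dumps({"body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage (placeholder values, requires a token that can comment):
# req = build_review_comment_request("your-org", "your-repo", 123, "YOUR_TOKEN")
# urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before anything is posted to the PR.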

To help the team investigate, could you share a link to the specific PR where this happened? Seeing the actual check history will make it much easier to understand what’s going on.

We have “Run only once per PR” configured! I don’t have a PR at hand with these issues, but I will keep an eye out and share one when I do! My colleagues have also noticed these issues!

@deanrie here is a PR where the Node and SNS incompatibility was only caught after a few commits: https://github.com/heylibby/heyLibbyGrowth/pull/207

Here is the Bugbot comment: https://github.com/heylibby/heyLibbyGrowth/pull/207/changes/a41e0542004bd51700435153a456a43059fc2388#r2814849966

Thanks for the extra info. I passed it on to the team.

Let me know if you catch any more cases like this. The more examples we have, the easier it’ll be to debug.

Perfect! Will do!

My team also occasionally suffers from this. Any improvements here would be very useful. We’re trying to speed up our review cycles, so it would be very beneficial if we could rely on Bugbot to handle all these simple cases for us so we can worry about harder issues. Here is an example of something caught by Claude using our bugbot.md file but not by Bugbot itself. The repository is private, but the PR is here: https://github.com/trabapro/traba-server-node/pull/12004.

@deanrie another example: this error wasn’t caught on the first try! https://github.com/heylibby/HeyLibbyAmplify/pull/2583#discussion_r2833011748

It would be great if there were a way for us to flag these missed issues or false positives so that BugBot can learn!

@deanrie we still see this problem quite often. Here are some notes from a colleague of mine:

I have an example of Cursor BugBot not leaving all its comments in the first iteration. This is the PR: https://github.com/heylibby/HeyLibbyAmplify/pull/2610; the relevant comments are the last two.
What I did:

  1. I merged develop into the branch and pushed the changes; nothing changed in the original branch.
  2. I got two new suggestions (one refactor and one suggestion).

When I first published the PR, the isValidImageType function was already in the code, in both the new component and the existing component. It’s minor, but it’s something Cursor could have found in the first iteration.

Thanks for the extra examples, I shared them with the team, including the latest one from your colleague with isValidImageType.

The more specific PR links we get, the easier it is to spot a pattern, so please send more if you notice any.

About a way to flag false positives or missed issues, there’s a separate feature request for that: Feature Request: Allow Team Members to Flag False Positives in Bug Reports. Feel free to vote so it can be prioritized.

For now, the cursor review comment workaround is still the best way to trigger a re-check without empty commits.

Let me know how it goes.

amazing! Thank you @deanrie

I am seeing the same behavior when trying Bugbot with our team. Each review surfaces different bugs. The preferred behavior would be to check for all of the bugs, and report all of them at the same time on the first pass so they can be addressed with fewer round-trip passes. Essentially, “find all issues”, not “find first issue then stop”.

The goal is to increase development efficiency by reducing feedback loops. My initial response to the first issue was very positive. I could see how this could be beneficial to our team. But after three rounds of fixing different issues that all existed in the original commit, the positive feelings are waning. 45 minutes later, it now feels like this is getting very tedious and actually slowing down our workflow. Perhaps we are better off taking other approaches to review our code changes.

The core issue is that the Bugbot agent should identify all of the issues on the first pass. The only issues that should come up in the second pass are issues that were introduced by fixing the first set of issues.

I hope this can be improved soon, as I think it would be a valuable feature to teams like ours if it saved work instead of slowing us down.

@deanrie We are considering switching to Coderabbit, because this is slowing us down! We see it more and more often, and we completely agree: @Joyfullservice described perfectly how we feel! We really love Bugbot and we don’t want to go away, but if there is a better alternative out there, we are going to give it a try!

Hey @Luisa, I totally get it. If a tool slows you down instead of speeding things up, that’s really frustrating.

I want to be honest: the team knows about this issue, and all the PR examples you and your teammates and @Joyfullservice shared have been super helpful for diagnosing it. But we don’t have a specific timeline for when it’ll be fixed yet.

All I can say is your feedback helps us prioritize, and we’re definitely factoring in that this is impacting the workflow for multiple teams.

If you decide to try Coderabbit in parallel, no hard feelings. We’ll keep working on improving BugBot, and I’ll update this thread as soon as we have news.

This is becoming a huge pain point for my team as well. Our developers often spend a full day going back and forth with Bugbot on every PR before it is satisfied. It takes SO LONG for each review, and it only ever finds one or two things, so it requires iterating multiple times.

Every sprint retro, people complain about how much this slows down their workflow. It’s too bad because we liked Bugbot SO much better than Coderabbit, but we might have to try Coderabbit again, unfortunately.

@deanrie here is another PR where it only caught the role issue on the second run: https://github.com/heylibby/replifyPhone/pull/223

Had a pull request this week where a developer on my team iterated with bugbot 28 times before it was satisfied!! 28 times!! And it takes like 15 minutes each run… This might be the worst one I’ve ever seen.

Hey @Luisa, @mikej96, thanks for the new examples, they really help.

@mikej96, 28 iterations on a single PR is definitely not how this should work. Can you share a link to that PR? An extreme case like that will be especially useful for the team to investigate.

Status update: the team is aware of this issue, and we’ve passed along all the PR examples from this thread. There’s no specific timeline yet, but the amount of feedback here is definitely helping us prioritize. I’ll update the thread as soon as we have news.

@deanrie does the Cursor team have any workarounds or temporary fixes for the issue while we wait for a more permanent fix?

For now, the only workaround is to manually leave a cursor review comment to trigger a re-check without making empty commits. It doesn’t fix the root issue, but it shortens the loop.

The team is aware, and all PR examples from this thread have been shared. There’s no specific timeline yet, but the amount of feedback here does affect prioritization. I’ll update the thread as soon as there is any news.