Cloud agent gives up on live testing relatively quickly because computerUse exceeds 100 images/documents

Where does the bug appear (feature/product)?

Background Agent (GitHub, Slack, Web, Linear)

Describe the Bug

Cloud agents are really amazing because they can test the environment and debug until the bugs are fixed. However, numerous times now I’ve run into the issue where the cloud agent refuses to continue testing because, for example, “:warning: I attempted GUI verification, but the cloud computerUse environment is still blocked by the same provider cap (maximum of 100 images/documents), so I couldn’t capture fresh UI evidence in this run.”

I don’t really understand what the 100 images and documents we keep bumping into are. Is it counting all the screenshots and videos the agent took while testing to show in the Cursor agent web UI? Why would they be part of the context?

Ideally, the agent would automatically understand how to fix this and drop some of the images and documents, whatever they are, from the context. Or, at the very least, it would let us do so without having to start a brand new chat.

Steps to Reproduce

In the most recent example, I was developing a Chrome extension and sent 5-6 replies to the Cloud agent. I told it to test things and fix certain bugs. After about 3-4 replies, it started bumping into this limit and refusing to test in its environment.

Screenshots / Screen Recordings

Operating System

Windows 10/11

Version Information

I’m using a cloud agent.

For AI issues: which model did you use?

Codex 5.3 High.

For AI issues: add Request ID with privacy disabled

bc-fdd1b82f-1fb2-4128-a63d-82c9ed5c1df4

Does this stop you from using Cursor

Sometimes - I can sometimes use Cursor

Hi @archon810, thanks for reporting! This is a known class of issue. The “computerUse exceeding 100 images/documents” message comes from the underlying GUI/computer-use provider limit, not from files in your repo.

In practice, screenshots or other visual inputs captured during live GUI testing can accumulate during a long Cloud Agent thread. Once that provider-side image/document cap is reached, the agent may no longer be able to take fresh screenshots for verification in that same run.
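To make the accumulation concrete, here is a minimal Python sketch. It is not Cursor's implementation: the message layout, the `count_visual_blocks` and `first_reply_over_cap` helpers, and the figure of 30 screenshots per reply are all assumptions for illustration. Only the cap of 100 comes from the error message quoted in this thread.

```python
from dataclasses import dataclass, field

# Cap quoted in the error message; everything else here is hypothetical.
VISUAL_CAP = 100
VISUAL_TYPES = ("image", "document")


@dataclass
class Message:
    role: str
    blocks: list = field(default_factory=list)  # each block is a dict with a "type" key


def count_visual_blocks(history):
    """Count every image/document block still present in the thread history."""
    return sum(
        1
        for message in history
        for block in message.blocks
        if block["type"] in VISUAL_TYPES
    )


def first_reply_over_cap(screenshots_per_reply, replies):
    """Return the 1-based reply at which the history first exceeds the cap, or None."""
    history = []
    for reply in range(1, replies + 1):
        history.append(Message("user", [{"type": "text", "text": f"reply {reply}"}]))
        # Assumption: each GUI step leaves one screenshot in the history,
        # whether or not that screenshot is ever shown to the user.
        for _ in range(screenshots_per_reply):
            history.append(Message("tool", [{"type": "image", "data": "<screenshot>"}]))
        if count_visual_blocks(history) > VISUAL_CAP:
            return reply
    return None


print(first_reply_over_cap(screenshots_per_reply=30, replies=6))
```

Under these assumed numbers the thread crosses the cap on the fourth reply, which is roughly in line with the "3-4 replies" reported above, even though only a handful of images are ever surfaced in the UI.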

We’re tracking this behavior as something to improve, or at least to surface more transparently. For now, the best workaround is to start a fresh follow-up Cloud Agent run with a short summary of the current state, or to ask the agent to prefer non-visual checks where possible, such as CLI tests, logs, or DOM inspection, and to take targeted screenshots only when needed.

I took another look, and I actually have great news for you sooner than I expected. Yesterday, a PR was merged that should allow this image context to be summarized and compacted. It will take some time for this to be available to the general public, but expect an update soon!
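For readers wondering what compaction of image context could look like, here is a rough Python sketch. It is an assumption about the general technique, not the contents of the merged PR: the `compact_visual_blocks` helper, the `summary` field, and the `keep_latest` value are hypothetical.

```python
VISUAL_TYPES = ("image", "document")


def compact_visual_blocks(history, keep_latest=20):
    """Keep the newest `keep_latest` image/document blocks and replace older
    ones with short text placeholders. Returns a new list; the input is not mutated.

    `history` is a list of dicts shaped like {"role": str, "blocks": [dict, ...]}.
    """
    seen = 0
    compacted = []
    # Walk newest-to-oldest so the most recent visuals are the ones preserved.
    for message in reversed(history):
        new_blocks = []
        for block in reversed(message["blocks"]):
            if block["type"] in VISUAL_TYPES:
                seen += 1
                if seen > keep_latest:
                    # Hypothetical: use a stored summary if one exists.
                    text = block.get("summary", "[older screenshot omitted]")
                    block = {"type": "text", "text": text}
            new_blocks.append(block)
        compacted.append({**message, "blocks": list(reversed(new_blocks))})
    return list(reversed(compacted))


def count_visual_blocks(history):
    """Count image/document blocks remaining in the history."""
    return sum(
        1 for m in history for b in m["blocks"] if b["type"] in VISUAL_TYPES
    )


# Example: 120 screenshots accumulated over a long thread.
thread = [
    {"role": "tool", "blocks": [{"type": "image", "data": f"<screenshot {i}>"}]}
    for i in range(120)
]
trimmed = compact_visual_blocks(thread, keep_latest=20)
print(count_visual_blocks(thread), count_visual_blocks(trimmed))
```

The design choice in this sketch is to preserve message order and count, so the agent still sees that a screenshot was taken at each step, while only the most recent visuals count against the provider cap.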

Amazing news. Thank you for such a quick update.

I tried starting a new chat session. It responded once (apparently, the Gmail user had been logged out, so it couldn’t proceed), and this was the result of my second reply, in which I said “I just logged you in, carry on”:

I can’t run the live Gmail validation yet because the computerUse provider is still hard-failing with:

Request contained too many images or documents … maximum of 100

I tried 3 separate remediations after your “signed in” update:

1. Resume existing GUI agent with new test prompt.
2. Start a fresh computerUse invocation with minimal prompt + explicit model.
3. Start a readonly computerUse probe.

All three failed with the same provider-limit error before any browser actions could run.

I don’t even understand how these limits are being hit so quickly when it only generated 7 images and 1 video in the first reply.

bc-ef7063d5-ae22-4e21-8b24-a360780e07b9

Hi @archon810, sorry about that. To be clear, the image maximum counts not just the images that are surfaced and visible to the user, but also the images (screenshots) that the agent takes while working on your behalf. I checked our merged PRs, and the PR that I referenced earlier has been merged into main. Do you want to try again now?