MCP + image support

:white_check_mark: Check the forum to ensure the issue hasn’t been reported already

None found

:lady_beetle: Provide a clear description of the bug

I was testing out the Puppeteer MCP server written by Anthropic. When I try to take a screenshot, I get a warning saying that images aren’t supported.

:arrows_counterclockwise: Explain how to reproduce the bug (if known)

Just ask it to take a screenshot.

:camera: Attach screenshots or recordings (e.g., .jpg, .png, .mp4).

:computer: Tell us your operating system and your Cursor version (e.g., Windows, 0.x.x).
Sequoia 15.2
Cursor 0.45.10

:no_entry_sign: Tell us if the issue stops you from using Cursor.
nope

The command is npx -y @modelcontextprotocol/server-puppeteer

I just tried and got the same.

I tried adding GitHub as well, but apparently it’s not exporting the tools in a way that Cursor can understand…

+1 Would love this to work too :slight_smile:

The issue here is that the Puppeteer MCP server outputs screenshots as base64 data directly to the terminal, rather than saving them as image files that Cursor can reference. This causes the “conversation too long” error, since it’s trying to include all that raw image data.

You’ll need to modify the server to save screenshots to files instead of outputting them directly. For now, you can work around this by having the server save screenshots to a local path and return that path instead of the base64 data.
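A minimal sketch of that workaround, in Python and using only the standard library: decode the base64 screenshot, write it to a directory, and hand back the absolute path. The function name `save_screenshot` and the file-naming scheme are hypothetical, not taken from the Puppeteer server.

```python
import base64
import tempfile
import time
from pathlib import Path


def save_screenshot(b64_png: str, out_dir: str) -> str:
    """Decode a base64-encoded image, write it under out_dir, return the absolute path."""
    directory = Path(out_dir).resolve()
    directory.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: a nanosecond timestamp is unique enough for a sketch.
    target = directory / f"screenshot-{time.time_ns()}.png"
    target.write_bytes(base64.b64decode(b64_png))
    return str(target)


if __name__ == "__main__":
    # Stand-in payload (just the PNG signature), not a real screenshot.
    fake_png = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
    print(save_screenshot(fake_png, tempfile.mkdtemp()))
```

The tool would then return the path string as ordinary text, which keeps the response small regardless of image size.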

@danperks is there a way to read the images from disk in the same way we paste the image into the chat window?

I have an MCP server that saves screenshots of my browser to disk, but I can’t seem to parse the image in the right way for the LLM to see it.

There’s no way to use an image produced by an MCP server in the Agent right now, except by manually adding it yourself from disk!

This is a good idea though, will pass to the team.

@danperks thank you thank you!

imo this is the last missing piece of the agent loop.

Cursor agent is amazing at writing server code and checking itself, but if you guys add this one feature, you’ve basically built a Devin capable of fully autonomous full-stack web development right into the IDE.

Really looking forward to this

capabilities: {
  resources: {
    templates: ["file://${path}"]
  }, // Enable resources capability with templates
  tools: {}, // Keep tools capability
}

When coding an MCP server: if you supported “resource” templates, we could send screenshots as resources. That would still honour MCP and give us image-reading abilities.

I’m having the same problem with a tool I created to manage Android devices and do UI automation and other tasks. Everything works amazing, except for screenshots. Any time a screenshot is requested, Cursor says the conversation is too long and fails. I’m using the MCP Python SDK and the Image type. I tried converting from PNG to JPG to reduce the size, but the conversation too long error persists. The size of the image is <300K so it should be totally fine.
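A rough size check, assuming (as the earlier reply suggests) that the base64 data ends up in the conversation as text. Base64 encodes every 3 bytes as 4 characters, so the payload is about a third larger than the file on disk. The 300,000-byte figure below is only the “<300K” upper bound from this post; how Cursor counts that text against its limit is not known here.

```python
import base64
import math


def base64_length(num_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


raw_size = 300 * 1000  # bytes; upper bound taken from the "<300K" figure in the post
print(base64_length(raw_size))  # 400000 characters of base64 text

# Cross-check the formula against the real encoder on a small buffer.
assert len(base64.b64encode(bytes(1000))) == base64_length(1000)
```

So converting PNG to JPG shrinks the payload, but a few hundred kilobytes of image is still hundreds of thousands of characters if it is inlined as text.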

This is definitely a missing piece in the agentic workflow. I’d love it if this got fixed!

Thanks for sharing. How did you modify the MCP puppeteer server to do this?

A workaround for this is just using an API key for Claude models. It’s unfortunate there’s an additional cost, but it does make the server usable. Tried it with and without an API key and confirmed.

Edit: actually, I realize it only worked the first time because the window it took the screenshot with was very small.

To add here, this means the Puppeteer MCP and also the Playwright MCP can’t put their screenshot abilities to use.

Actually the model just hallucinates answers when you use the screenshot tool.

Def would be great to allow the returned images to be properly handled.

BTW, support for images being returned from MCP servers would also allow file system MCPs that can read images…

If that’s of any help, I have a workaround for now:
I forked the playwright-mcp repo from Microsoft and added an option to save the screenshots in a directory, returning the absolute path of the saved file.

Here it is: GitHub - Nazruden/playwright-mcp: Playwright MCP server
I created a PR on the original repo, but I’m not sure if they’ll accept it.
Meanwhile, here is how to use it:

  • Clone the fork
  • npm install
  • npm run build

The configuration then looks like this:

{
  "mcpServers": {
    "playwright": {
      "command": "node",
      "args": [
        "path/to/playwright-mcp/lib/program.js", 
        "--save-screenshots", 
        "path/to/screenshots/dir"
      ]
    }
  }
}

To go along with this, I published an MCP server for image analysis, available here:

Usage instructions are available in the README.

Hope this helps!

Thanks! How do you read the image then? Is that possible in Cursor or Windsurf?

Thanks, Dan. I can update the code manually to start saving them to disk.

How can Cursor reference them, however? Are there any docs on this? I’d be happy for now if the screenshots just showed up in the Cursor chat (even though I understand right now you cannot make use of them) in future chats.

This will make it so the user using the MCP server can at least see that screenshots are happening in the moment, and click into it to see what screenshot was taken.

Hey, just added this section to our docs detailing how you can send images into the chat via MCP (on v0.49 and up!):

Thank you @danperks .. I’ve tried it out and it works well.
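For readers landing here without the docs link: the MCP specification lets a tool result carry an image as a content item with `type`, base64 `data`, and `mimeType` fields. Below is a minimal Python sketch of building that payload with the standard library only. The helper name `image_tool_result` is hypothetical, and this is assumed (not confirmed in this thread) to be the shape the Cursor docs describe.

```python
import base64
import json


def image_tool_result(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as an MCP-style tool result containing one image item."""
    return {
        "content": [
            {
                "type": "image",
                # MCP image content carries the bytes as a base64 string.
                "data": base64.b64encode(image_bytes).decode("ascii"),
                "mimeType": mime_type,
            }
        ]
    }


if __name__ == "__main__":
    # Stand-in bytes, not a real screenshot.
    print(json.dumps(image_tool_result(b"abc"), indent=2))
```

In practice an SDK normally builds this structure for you (the MCP Python SDK’s Image type was mentioned earlier in the thread); the sketch only shows what goes over the wire.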

Made a server to do just that :slight_smile: