How do I use the new Image Injection feature in MCP?

I have already read the documentation carefully: Cursor – Model Context Protocol

I believe my personal MCP server is returning the data in the correct format:

{
  "content": [
    {
      "type": "text",
      "data": "纯文字图片\n=====\n\n"
    },
    {
      "type": "image",
      "data": "77+9UoT77+977+9Yy.....Pvv73vv716FRv70=",
      "mimeType": "image/jpeg"
    },
    {
      "type": "text",
      "data": "  \n\n"
    }
  ]
}

However, the image still isn’t showing up in the chat context.


Version: 0.49.4
VSCode Version: 1.96.2
Commit: ec408037b24566b11e6132c58bbe6ad27046eb90
Date: 2025-04-22T00:13:20.211Z
Electron: 34.3.4
Chromium: 132.0.6834.210
Node.js: 20.18.3
V8: 13.2.152.41-electron.0
OS: Darwin arm64 23.5.0

Could someone please help me figure this out? I’d really appreciate it!

Your format looks spot on. I have the same format.

Cursor shows the image fine for me. My MCP return has:
content:

             {
                type: 'image',
                mimeType: firstItem.mimeType,
                data: firstItem.blob,
             }

where data is the base64-encoded image.

However, even though Cursor renders the image fine, Google Gemini is totally blind to it, while Claude can see it. So I think Cursor’s support isn’t fully functional yet and this new feature is buggy.

In your case, triple-check the MIME type. Perhaps it’s a PNG, not a JPEG?

Thank you for the reply. I’m sure the image MIME type is correct. I tried the official JS example and replaced the image with my own base64url, and it worked fine in my Cursor too!

I’m thinking the issue might be related to my SSE server. I’m using Spring AI to implement my MCP server, so maybe the data that eventually reaches Cursor differs from what a standard STDIO MCP server sends.

Could be. If you have a full copy of the base64 string, you could try to reassemble it into a .jpg and see whether it’s a valid image, to rule that out.
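For anyone who wants to script that check, here is a minimal sketch in Java. It assumes the payload is standard (not URL-safe) base64, and the class name, method names, and command-line arguments are all made up for illustration. It decodes the string, writes the bytes to a file, and guesses the type from the leading magic bytes, which also helps with the JPEG-vs-PNG question.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

public class ImagePayloadCheck {

    /** Guesses the MIME type from the leading magic bytes; returns "unknown" if neither signature matches. */
    static String sniffMimeType(byte[] bytes) {
        // JPEG files start with the bytes FF D8 FF
        if (bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xFF
                && (bytes[1] & 0xFF) == 0xD8
                && (bytes[2] & 0xFF) == 0xFF) {
            return "image/jpeg";
        }
        // PNG files start with the bytes 89 'P' 'N' 'G'
        if (bytes.length >= 4
                && (bytes[0] & 0xFF) == 0x89
                && bytes[1] == 'P' && bytes[2] == 'N' && bytes[3] == 'G') {
            return "image/png";
        }
        return "unknown";
    }

    /** Decodes standard base64 (assumption: the payload is not URL-safe base64). */
    static byte[] decode(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: ImagePayloadCheck <base64-file> <output-image>");
            return;
        }
        // args[0]: text file holding the full base64 string; args[1]: where to write the image
        String base64 = Files.readString(Path.of(args[0])).trim();
        byte[] bytes = decode(base64);
        Files.write(Path.of(args[1]), bytes);
        System.out.println("Detected type: " + sniffMimeType(bytes));
    }
}
```

As a quick eyeball test without running anything: standard base64 of a JPEG begins with /9j/ and standard base64 of a PNG begins with iVBORw0KGgo.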

Yes, I’ve already tried it, and it can indeed be converted into a jpg file.

I figured out why my Spring AI MCP server failed to support image injection.

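By default, the tool call result is of type text, not image. The SDK’s TextContent record forces the type to "text", and the default toSyncToolRegistration wraps every tool result in a TextContent: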

@JsonInclude(JsonInclude.Include.NON_ABSENT)
public record TextContent( // @formatter:off
    @JsonProperty("audience") List<Role> audience,
    @JsonProperty("priority") Double priority,
    @JsonProperty("type") String type,
    @JsonProperty("text") String text) implements Content { // @formatter:on

    public TextContent {
        type = "text";
    }

    public String type() {
        return type;
    }

    public TextContent(String content) {
        this(null, null, "text", content);
    }
}

public static McpServerFeatures.SyncToolRegistration toSyncToolRegistration(ToolCallback toolCallback) {
    var tool = new McpSchema.Tool(toolCallback.getToolDefinition().name(),
            toolCallback.getToolDefinition().description(),
            toolCallback.getToolDefinition().inputSchema());

    return new McpServerFeatures.SyncToolRegistration(tool, request -> {
        try {
            String callResult = toolCallback.call(ModelOptionsUtils.toJsonString(request));
            return new McpSchema.CallToolResult(List.of(new McpSchema.TextContent(callResult)), false);
        } catch (Exception e) {
            return new McpSchema.CallToolResult(List.of(new McpSchema.TextContent(e.getMessage())), true);
        }
    });
}

To solve this, I created a custom toSyncToolRegistration method that uses the returnDirect field to decide whether to return the tool result as raw JSON.
As far as I can tell, the Spring AI MCP SDK doesn’t otherwise use this field of the @Tool annotation.

public static McpServerFeatures.SyncToolRegistration toSyncToolRegistration(ToolCallback toolCallback) {
    var tool = new McpSchema.Tool(toolCallback.getToolDefinition().name(),
            toolCallback.getToolDefinition().description(),
            toolCallback.getToolDefinition().inputSchema());

    return new McpServerFeatures.SyncToolRegistration(tool, request -> {
        try {
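            String callResult = toolCallback.call(ModelOptionsUtils.toJsonString(request));
            if (toolCallback.getToolMetadata().returnDirect()) {
                // returnDirect = true: the tool already returned a serialized CallToolResult,
                // so parse it back instead of wrapping it as text
                return ModelOptionsUtils.jsonToObject(callResult, McpSchema.CallToolResult.class);
            } else {
                // Default behaviour: wrap the raw result string as text content
                return new McpSchema.CallToolResult(List.of(new McpSchema.TextContent(callResult)), false);
            }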
        } catch (Exception e) {
            return new McpSchema.CallToolResult(List.of(new McpSchema.TextContent(e.getMessage())), true);
        }
    });
}

Here is my image tool implementation:

@Tool(name = "image_tool", description = IMAGE_TOOL_DESC, returnDirect = true)
public McpSchema.CallToolResult imageTool(@ToolParam(description = TOOL_PARAM_DESC) String... urls) {
    if (urls == null || urls.length == 0) {
        return new McpSchema.CallToolResult(Collections.emptyList(), true);
    }

    List<McpSchema.Content> list = new ArrayList<>();
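    for (String url : urls) {
        // urlToBASE64 (helper not shown) returns the base64-encoded image bytes.
        // The MIME type is hard-coded here, so it must match the actual image format.
        McpSchema.ImageContent image = new McpSchema.ImageContent(
            List.of(), 1.0D, "image", urlToBASE64(url), "image/jpeg"
        );
        list.add(image);
    }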

    return new McpSchema.CallToolResult(list, false);
}

With this setup, I’m now able to return images as part of tool responses by setting returnDirect = true on the tool definition.
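The urlToBASE64 helper isn’t shown above. As a rough sketch of what such a helper could look like (this is an assumption, not the actual implementation from this thread, and the class name ImageEncoding is made up), it can read the bytes behind the URL and encode them as standard base64:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Base64;

public class ImageEncoding {

    /** Encodes raw bytes as standard base64 without line breaks. */
    static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    /** Hypothetical stand-in for urlToBASE64: reads all bytes from the URL and base64-encodes them. */
    static String urlToBASE64(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            // Keep the payload as byte[] the whole way; converting binary data
            // to a String before encoding can corrupt it
            return toBase64(in.readAllBytes());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The JPEG magic bytes FF D8 FF encode to "/9j/"
        System.out.println(toBase64(new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF}));
    }
}
```

A real server would probably also want timeouts and a size limit on the download, which this sketch leaves out.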