[Experience Report] Composer 1 handled an actual production EC2 incident end-to-end

Tomoki_Shibahara · November 14, 2025, 6:06pm

Hi Cursor Team and Community,

I’d like to share a real production incident where Composer 1 effectively acted as an SRE/AIOps engineer and restored a downed EC2 instance with almost no human intervention.

This isn’t a demo or a fun experiment. It was a live production outage on AWS.

(Instance IDs and IPs are obfuscated. All steps are factual.)

Here is a screenshot from the incident, showing Composer 1 autonomously executing the recovery workflow in real time:

⸻

■ Incident Overview
• Root EBS volume hit 100% usage
• OS froze → SSH unreachable
• SSM unstable
• CloudWatch alarms firing
• Application fully stopped

Normally, this type of failure requires:

creating a rescue EC2 → detaching the root volume → attaching it to the rescue instance → repairing/extending it → reattaching it to the original instance → booting

This time, Composer 1 performed almost the entire workflow by itself.

⸻

■ What I asked Composer (only this):

“Root EBS is full and the EC2 can’t boot.
Please restore it safely.”

From this one instruction, Composer:
• Diagnosed the situation
• Designed the full rescue plan
• Generated every command
• Guided the step-by-step workflow
• Recovered the system

This was the most “agent-like” behavior I’ve ever seen in an AI model.

⸻

■ What Composer 1 actually did

1. Diagnosed the failure
• Determined the cause of the SSH timeout by checking the network path (SG, NACL, route tables)
• Selected the safest recovery method (rescue instance + EBS attach)
• Generated AWS CLI commands with guards and human confirmation steps
• Switched to SSM automatically when SSH became unreliable
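
To illustrate the kind of check behind this diagnosis step: a connection that times out usually points at the network path (SG, NACL, routing) or a frozen host, while an immediate refusal means the host answered but nothing is listening on the port. Below is a minimal Python sketch under that assumption. The helper name `classify_tcp_probe` is hypothetical, and this is not the code Composer generated during the incident.

```python
import socket


def classify_tcp_probe(host: str, port: int = 22, timeout: float = 3.0) -> str:
    """Try one TCP connection and describe how it ended (illustrative only)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets dropped (SG/NACL/route) or the OS is frozen.
        return "timeout"
    except ConnectionRefusedError:
        # The host replied with a reset: the path works, nothing listens on the port.
        return "refused"
    except OSError as exc:
        # DNS failure, unreachable network, and similar errors.
        return f"error: {exc.__class__.__name__}"
```

A "timeout" result alone cannot distinguish a blocked path from a hung OS, which is why a second channel such as SSM is still useful.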

⸻

2. Prepared the rescue instance
• Launched rescue EC2
• Created/adjusted Security Groups
• Tested connection paths
• Correctly detected when fallback was needed
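
As a rough illustration of the Security Group adjustment in this step, the Python sketch below builds (but does not run) an AWS CLI call that opens SSH to a single operator address. The function name, the group ID, and the IP address are placeholders, not values from the incident.

```python
import ipaddress


def ssh_ingress_command(group_id: str, operator_ip: str) -> list[str]:
    """Build an AWS CLI call that opens port 22 to exactly one IPv4 address."""
    # ip_address() raises ValueError on malformed input, so typos fail early.
    addr = ipaddress.ip_address(operator_ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 (--cidr)")
    return [
        "aws", "ec2", "authorize-security-group-ingress",
        "--group-id", group_id,
        "--protocol", "tcp",
        "--port", "22",
        "--cidr", f"{addr}/32",  # /32 limits the rule to the operator's address
    ]


# Placeholder values; 203.0.113.0/24 is a documentation-only address range.
print(" ".join(ssh_ingress_command("sg-0123456789abcdef0", "203.0.113.10")))
```

Returning the command as a list keeps it reviewable before anything is executed, in the same spirit as the human confirmation steps described in this report.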

⸻

3. Detached / attached the root EBS
• Identified the correct root volume
• Prevented dangerous operations by pausing and asking for approval
• Automatically found the correct device mapping (attached as /dev/xvdf, visible in the OS as nvme1n1p4)
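
The "pause and ask for approval" behavior can be sketched as a small gate in front of risky commands. This is a minimal Python illustration, assuming a hand-picked list of risky `aws ec2` subcommands. The names `RISKY`, `needs_approval`, and `approved_commands`, and the IDs in the usage example, are hypothetical and not taken from Composer's output.

```python
from typing import Callable, Iterable, Iterator

# Subcommands this sketch treats as risky enough to need a human "yes".
RISKY = {"stop-instances", "detach-volume", "attach-volume", "terminate-instances"}


def needs_approval(cmd: list[str]) -> bool:
    """True when an `aws ec2 <subcommand>` call is on the risky list."""
    return len(cmd) >= 3 and cmd[:2] == ["aws", "ec2"] and cmd[2] in RISKY


def approved_commands(plan: Iterable[list[str]],
                      ask: Callable[[str], bool]) -> Iterator[list[str]]:
    """Yield commands in order, stopping at the first one the operator rejects."""
    for cmd in plan:
        if needs_approval(cmd) and not ask(" ".join(cmd)):
            break  # nothing after a rejected step is released
        yield cmd


if __name__ == "__main__":
    plan = [
        # Read-only lookup of the volumes attached to the broken instance.
        ["aws", "ec2", "describe-volumes", "--filters",
         "Name=attachment.instance-id,Values=i-0123456789abcdef0"],
        ["aws", "ec2", "stop-instances", "--instance-ids", "i-0123456789abcdef0"],
        ["aws", "ec2", "detach-volume", "--volume-id", "vol-0123456789abcdef0"],
    ]
    for cmd in approved_commands(plan, lambda text: input(f"Run `{text}`? [yes/no] ") == "yes"):
        print("approved:", " ".join(cmd))
```

Stopping at the first rejection, rather than skipping it, matters here because later steps (such as attach) only make sense if the earlier ones actually ran.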

⸻

4. Repaired and expanded the filesystem

Composer handled:
• growpart to extend the root partition
• xfs_repair when corruption was detected
• Remounting with correct XFS options
• xfs_growfs to extend the root filesystem
• Validating the result using df -hT
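
The sequence above can be sketched as an ordered command list, following the order in which the steps are listed here. This Python sketch only builds the commands and makes several assumptions: the mount point is a placeholder, `nouuid` stands in for the "correct XFS options" (it is commonly needed when the rescue instance's own root filesystem has the same UUID, but the exact options used in the incident are not stated), and the underlying EBS volume is assumed to have been enlarged already. The function names are hypothetical.

```python
import re


def split_partition(device: str) -> tuple[str, int]:
    """Split '/dev/nvme1n1p4' into ('/dev/nvme1n1', 4); also handles '/dev/xvdf1'."""
    m = re.fullmatch(r"(/dev/nvme\d+n\d+)p(\d+)", device) or re.fullmatch(
        r"(/dev/[a-z]+)(\d+)", device
    )
    if m is None:
        raise ValueError(f"not a partition device: {device}")
    return m.group(1), int(m.group(2))


def xfs_rescue_commands(partition: str, mount_point: str) -> list[list[str]]:
    """Build (not run) the repair-and-grow sequence for an XFS root partition."""
    disk, number = split_partition(partition)
    return [
        ["growpart", disk, str(number)],  # grow the partition to fill the disk
        ["xfs_repair", partition],        # needs the filesystem to be unmounted
        ["mount", "-o", "nouuid", partition, mount_point],  # assumed option
        ["xfs_growfs", mount_point],      # grow the filesystem to fill the partition
        ["df", "-hT", mount_point],       # confirm the new size and usage
    ]


for cmd in xfs_rescue_commands("/dev/nvme1n1p4", "/mnt/rescue"):
    print(" ".join(cmd))
```

Note that `growpart` takes the disk and the partition number as separate arguments, while `xfs_growfs` operates on the mounted filesystem, which is why the mount comes before it.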

It behaved exactly like a senior SRE.

⸻

5. Reattached and rebooted the original EC2
• Noticed that the rescue EC2 must be stopped before detaching the volume (important!)
• Generated correct attach commands (/dev/sda1)
• Booted the instance
• Verified that CloudWatch metrics would recover shortly
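
The ordering in this step (stop the rescue instance first, then detach, then attach as /dev/sda1, then boot) can be written down as a plan of AWS CLI calls. The Python sketch below only builds the commands. The function name and the IDs are placeholders, and the `wait` calls are an assumption about how one might sequence the steps, not a record of what was run.

```python
def reattach_plan(rescue_id: str, original_id: str, volume_id: str,
                  device: str = "/dev/sda1") -> list[list[str]]:
    """Ordered AWS CLI calls (built, not executed) to hand the volume back."""
    ec2 = ["aws", "ec2"]
    return [
        # Stop the rescue instance so nothing has the volume mounted.
        ec2 + ["stop-instances", "--instance-ids", rescue_id],
        ec2 + ["wait", "instance-stopped", "--instance-ids", rescue_id],
        ec2 + ["detach-volume", "--volume-id", volume_id],
        ec2 + ["wait", "volume-available", "--volume-ids", volume_id],
        # Reattach under the root device name of the original instance.
        ec2 + ["attach-volume", "--volume-id", volume_id,
               "--instance-id", original_id, "--device", device],
        ec2 + ["start-instances", "--instance-ids", original_id],
        ec2 + ["wait", "instance-running", "--instance-ids", original_id],
    ]


# Placeholder IDs for illustration only.
for cmd in reattach_plan("i-0aaaaaaaaaaaaaaaa", "i-0bbbbbbbbbbbbbbbb", "vol-0ccccccccccccccccc"):
    print(" ".join(cmd))
```

Each waiter sits between a state-changing call and the next step that depends on it, so the attach is never issued while the volume is still in use.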

⸻

■ What the human operator (me) did

Only:
• Final approval for risky operations (stop/start)
• SG updates for my IP
• Occasional AWS CLI execution when needed
• Final application-level tests

Everything else was Composer.

⸻

■ Result
• Root EBS expanded from 28GB → 99GB
• XFS repaired successfully
• Instance booted normally
• CloudWatch ALARMs cleared
• Application restored
• No data loss

Composer 1 effectively performed the entire production recovery workflow.

⸻

■ What surprised me most

Composer wasn’t just generating commands — it was:
• Tracking the environment state over a long session
• Making safe choices
• Correcting its own flow when failures happened
• Switching strategies (SSH→SSM)
• Avoiding destructive operations
• Reasoning like an actual SRE/DevOps engineer

This wasn’t “code generation.”
It was autonomous operational reasoning.

⸻

■ Conclusion

Composer 1 showed capabilities far beyond a traditional LLM:
• multi-step operational planning
• safe execution
• long-context environment awareness
• corrective reasoning
• end-to-end cloud recovery support

If you’re curious how far Composer can go,
I highly recommend trying it on real operational workflows.

It may not be perfect yet —
but this incident convinced me that AIOps with Composer is no longer theoretical.

⸻

Bonus: Natural-Language Ops is already real

During the incident, Composer 1 and I also had this very “casual” moment —
and yes, even this level of instruction worked perfectly:

(And no, I’m not joking. This was part of the actual recovery workflow.)

⸻

■ Buddy’s Comment (ChatGPT GPT-5.1):
This case shows how Composer 1 can function as a real AIOps agent,
not just a coding model.
The human–AI collaboration demonstrated here is a strong example of
where next-generation DevOps is heading.

⸻

■ Transparency Note
This report was reconstructed from real execution logs and screenshots,
summarized by ChatGPT GPT-5.1,
and independently validated using Gemini 2.5 (strict-mode).

— Tomoki Shibahara
AI Systems Architect / Japan @ Independent Research

Focused on AI-driven web service development and human-governed orchestration systems integrating LLMs, FastAPI, and AWS automation.

