[Experience Report] Composer 1 handled an actual production EC2 incident end-to-end

Hi Cursor Team and Community,

I’d like to share a real production incident where Composer 1 effectively acted as an SRE/AIOps engineer and restored a downed EC2 instance with almost no human intervention.

This isn’t a demo or a fun experiment —
this was a live production outage on AWS.

(Instance IDs and IPs are obfuscated. All steps are factual.)

[Screenshot from the incident: Composer 1 autonomously executing the recovery workflow in real time]

■ Incident Overview
• Root EBS volume hit 100% usage
• OS froze → SSH unreachable
• SSM unstable
• CloudWatch alarms firing
• Application fully stopped

Normally, this type of failure requires:

creating a rescue EC2 → detaching the root volume → repairing/extending it → reattaching → boot

This time, Composer 1 performed almost the entire workflow by itself.

■ What I asked Composer (only this):

“Root EBS is full and the EC2 can’t boot.
Please restore it safely.”

From this one instruction, Composer:
• Diagnosed the situation
• Designed the full rescue plan
• Generated every command
• Guided the step-by-step workflow
• Recovered the system

This was the most “agent-like” behavior I’ve ever seen in an AI model.

■ What Composer 1 actually did

1. Diagnosed the failure
• Investigated the SSH timeout by checking the network path: Security Group, NACL, and route table
• Selected the safest recovery method (rescue instance + EBS attach)
• Generated AWS CLI commands with guards and human confirmation steps
• Switched to SSM automatically when SSH became unreliable
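
For readers wondering what "guards and human confirmation steps" can look like in practice, here is a minimal sketch of the idea. It is not Composer's actual output. The names `RISKY_SUBCOMMANDS`, `is_risky`, and `run_guarded` are hypothetical, and which subcommands count as risky is my own assumption.

```python
import shlex
from typing import Callable, List, Sequence

# Hypothetical set of EC2 subcommands that change instance or volume state.
# Which calls count as "risky" is an assumption; adjust for your environment.
RISKY_SUBCOMMANDS = {
    "stop-instances",
    "start-instances",
    "detach-volume",
    "attach-volume",
    "terminate-instances",
    "delete-volume",
}


def is_risky(argv: Sequence[str]) -> bool:
    """Return True if argv looks like `aws <service> <risky-subcommand> ...`."""
    return len(argv) >= 3 and argv[0] == "aws" and argv[2] in RISKY_SUBCOMMANDS


def run_guarded(
    argv: Sequence[str],
    confirm: Callable[[str], bool],
    runner: Callable[[List[str]], object],
) -> bool:
    """Run argv through runner, asking confirm first when the call is risky.

    Returns True if the command was handed to runner, False if the
    operator declined.
    """
    if is_risky(argv) and not confirm(f"About to run: {shlex.join(argv)}. Proceed?"):
        return False
    runner(list(argv))
    return True
```

In a real session, `confirm` could wrap `input()` and `runner` could wrap `subprocess.run(..., check=True)`. Passing both in as parameters keeps the guard testable without touching AWS.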

2. Prepared the rescue instance
• Launched rescue EC2
• Created/adjusted Security Groups
• Tested connection paths
• Correctly detected when fallback was needed

3. Detached / attached the root EBS
• Identified the correct root volume
• Prevented dangerous operations by pausing and asking for approval
• Automatically found the correct device mapping (/dev/xvdf, nvme1n1p4)
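
As a rough illustration of this step, the sketch below picks the root volume out of one instance record as returned by `aws ec2 describe-instances` and builds the detach/attach commands for the rescue instance. The function names and the default `/dev/xvdf` device are assumptions for illustration, and the code assumes the broken instance has already been stopped.

```python
from typing import Any, Dict, List


def find_root_volume(instance: Dict[str, Any]) -> str:
    """Return the EBS volume ID mapped to the instance's root device name.

    `instance` is one element of Reservations[].Instances[] from
    `aws ec2 describe-instances` (parsed JSON).
    """
    root_name = instance["RootDeviceName"]
    for mapping in instance.get("BlockDeviceMappings", []):
        if mapping["DeviceName"] == root_name and "Ebs" in mapping:
            return mapping["Ebs"]["VolumeId"]
    raise LookupError(f"no EBS volume mapped at {root_name}")


def move_volume_commands(
    volume_id: str, rescue_instance_id: str, device: str = "/dev/xvdf"
) -> List[List[str]]:
    """Build AWS CLI calls that move a volume onto the rescue instance.

    Assumes the original instance is already stopped. Nothing is executed
    here; the caller decides how and when to run each command.
    """
    return [
        ["aws", "ec2", "detach-volume", "--volume-id", volume_id],
        ["aws", "ec2", "wait", "volume-available", "--volume-ids", volume_id],
        [
            "aws", "ec2", "attach-volume",
            "--volume-id", volume_id,
            "--instance-id", rescue_instance_id,
            "--device", device,
        ],
    ]
```

On Nitro-based instance types, a volume attached as `/dev/xvdf` typically appears inside the OS as an NVMe device such as `nvme1n1`, which is why both names appear above.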

4. Repaired and expanded the filesystem

Composer handled:
• growpart to extend the root partition
• xfs_repair when corruption was detected
• Remounting with correct XFS options
• xfs_growfs to grow the XFS filesystem into the enlarged partition
• Validating the result using df -hT
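
To make the order of operations concrete, here is a hedged sketch that builds the command sequence for the rescue instance. It is a reconstruction, not the exact commands from my logs. The `/mnt/rescue` mount point and the function names are hypothetical, and `-o nouuid` is my assumption about what "correct XFS options" means (it is commonly needed when the rescue instance's own root filesystem has the same XFS UUID).

```python
import re
from typing import List, Tuple


def split_partition(partition: str) -> Tuple[str, str]:
    """Split a partition name into (disk path, partition number).

    growpart takes the disk and the partition number as separate arguments,
    and NVMe devices use a 'p' separator (nvme1n1p4) while xvd/sd devices
    do not (xvdf1).
    """
    name = partition[5:] if partition.startswith("/dev/") else partition
    match = re.fullmatch(r"(nvme\d+n\d+)p(\d+)", name) or re.fullmatch(
        r"([a-z]+)(\d+)", name
    )
    if match is None:
        raise ValueError(f"not a partition name: {partition!r}")
    return "/dev/" + match.group(1), match.group(2)


def repair_commands(partition: str, mount_point: str = "/mnt/rescue") -> List[List[str]]:
    """Build the repair/expand sequence for an XFS root partition.

    Assumes the EBS volume itself has already been enlarged and that the
    partition is not mounted when the sequence starts.
    """
    disk, number = split_partition(partition)
    part_path = f"{disk}p{number}" if "nvme" in disk else f"{disk}{number}"
    return [
        ["sudo", "growpart", disk, number],          # extend the partition
        ["sudo", "xfs_repair", part_path],           # only if corruption is reported
        ["sudo", "mkdir", "-p", mount_point],
        ["sudo", "mount", "-o", "nouuid", part_path, mount_point],
        ["sudo", "xfs_growfs", mount_point],         # grow the filesystem
        ["df", "-hT", mount_point],                  # validate the new size
    ]
```

Note that `xfs_growfs` works on a mounted filesystem, while `xfs_repair` needs it unmounted, which is why the repair comes before the mount.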

It behaved exactly like a senior SRE.

5. Reattached and rebooted the original EC2
• Noticed that the rescue EC2 must be stopped before detaching the volume (important!)
• Generated correct attach commands (/dev/sda1)
• Booted the instance
• Confirmed that CloudWatch metrics should recover shortly after boot
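
The return trip can be sketched the same way. This is a minimal illustration of the ordering (stop the rescue instance, detach, reattach as root, boot), not the commands Composer generated. The function name is hypothetical, and `/dev/sda1` is the root device name from this incident; other AMIs may use a different one such as `/dev/xvda`.

```python
from typing import List


def return_trip_commands(
    volume_id: str,
    rescue_instance_id: str,
    original_instance_id: str,
    root_device: str = "/dev/sda1",
) -> List[List[str]]:
    """Build AWS CLI calls that move the repaired volume back and boot.

    The rescue instance is stopped first so the volume is not in use when
    it is detached. Nothing is executed here.
    """
    return [
        ["aws", "ec2", "stop-instances", "--instance-ids", rescue_instance_id],
        ["aws", "ec2", "wait", "instance-stopped", "--instance-ids", rescue_instance_id],
        ["aws", "ec2", "detach-volume", "--volume-id", volume_id],
        ["aws", "ec2", "wait", "volume-available", "--volume-ids", volume_id],
        [
            "aws", "ec2", "attach-volume",
            "--volume-id", volume_id,
            "--instance-id", original_instance_id,
            "--device", root_device,
        ],
        ["aws", "ec2", "wait", "volume-in-use", "--volume-ids", volume_id],
        ["aws", "ec2", "start-instances", "--instance-ids", original_instance_id],
        ["aws", "ec2", "wait", "instance-running", "--instance-ids", original_instance_id],
    ]
```

The `wait` calls matter here: attaching before the volume reports `available`, or starting before it reports `in-use`, is an easy way to end up with an instance that has no root device.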

■ What the human operator (me) did

Only:
• Final approval for risky operations (stop/start)
• SG updates for my IP
• Occasional AWS CLI execution when needed
• Final application-level tests

Everything else was Composer.

■ Result
• Root EBS expanded from 28 GB to 99 GB
• XFS repaired successfully
• Instance booted normally
• CloudWatch ALARMs cleared
• Application restored
• No data loss

Composer 1 effectively performed the entire production recovery workflow.

■ What surprised me most

Composer wasn’t just generating commands — it was:
• Tracking the environment state over a long session
• Making safe choices
• Correcting its own flow when failures happened
• Switching strategies (SSH→SSM)
• Avoiding destructive operations
• Reasoning like an actual SRE/DevOps engineer

This wasn’t “code generation.”
It was autonomous operational reasoning.

■ Conclusion

Composer 1 showed capabilities far beyond a traditional LLM:
• multi-step operational planning
• safe execution
• long-context environment awareness
• corrective reasoning
• end-to-end cloud recovery support

If you’re curious how far Composer can go,
I highly recommend trying it on real operational workflows.

It may not be perfect yet —
but this incident convinced me that AIOps with Composer is no longer theoretical.

Bonus: Natural-Language Ops is already real 😄

During the incident, Composer 1 and I also had this very “casual” moment,
and yes, even this level of instruction worked perfectly:

[Screenshot: the casual exchange with Composer 1]

(And no, I’m not joking. This was part of the actual recovery workflow.)

■ Buddy’s Comment (ChatGPT GPT-5.1):
This case shows how Composer 1 can function as a real AIOps agent,
not just a coding model.
The human–AI collaboration demonstrated here is a strong example of
where next-generation DevOps is heading.

■ Transparency Note
This report was reconstructed from real execution logs and screenshots,
summarized by ChatGPT GPT-5.1,
and independently validated using Gemini 2.5 (strict-mode).

— Tomoki Shibahara
AI Systems Architect / Japan @ Independent Research

Focused on AI-driven web service development and human-governed orchestration systems integrating LLMs, FastAPI, and AWS automation.
