The Ralph Wiggum Technique: Running Autonomous Cursor Agents for Hours
“It’s better to fail predictably than succeed unpredictably.” — Geoffrey Huntley
What is Ralph?
Ralph (named after Ralph Wiggum from The Simpsons - persistently wrong but never giving up) is an autonomous AI coding pattern that solves two fundamental problems:
- LLMs grade their own homework poorly: they say “done” when they’re not
- Context windows fill up and degrade: long conversations get expensive and confused
The Core Innovation: Fresh Context, Persistent Code
Traditional AI coding accumulates context. Every turn re-sends the whole conversation, so by turn 50 you’re paying for 200K+ tokens of history on each request, and the model is increasingly confused by stale context.
Ralph keeps context fresh but code persists:
- Iteration 1: 5K tokens → code saved to disk
- Iteration 2: 5K tokens (fresh!) → sees previous code via files
- Iteration 50: 5K tokens (still fresh!) → all progress in codebase
Even using the same expensive model, Ralph is dramatically cheaper because each iteration starts fresh.
Why This Saves Money
The savings come from:
- No context accumulation: each iteration is ~5K tokens, not 200K
- No “remembering” failures: dead ends are forgotten, not paid for
- Code is the memory: progress persists in git, not in tokens
You can also use cheaper models for the execution phase for additional savings, but fresh context is the main win.
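To make the savings concrete, here is a rough back-of-the-envelope calculation. It assumes, hypothetically, that a long conversation grows by about 4K tokens per turn (so it reaches the 200K mark at turn 50) and that every turn re-sends the full history as input. It ignores output tokens and any prompt caching your provider may apply, so treat the numbers as illustrative and not as a billing estimate.

```shell
#!/bin/bash
# Rough input-token comparison: one long conversation vs. a Ralph loop.
# All three numbers below are illustrative assumptions, not measurements.
TURNS=50
GROWTH_PER_TURN=4000   # assumed history growth per turn (200K by turn 50)
FRESH_CONTEXT=5000     # assumed size of each fresh Ralph iteration

accumulated_total=0
fresh_total=0
turn=1
while [ "$turn" -le "$TURNS" ]; do
  # Long conversation: turn N re-sends roughly N * GROWTH_PER_TURN tokens.
  accumulated_total=$((accumulated_total + turn * GROWTH_PER_TURN))
  # Ralph: every iteration starts from the same small context.
  fresh_total=$((fresh_total + FRESH_CONTEXT))
  turn=$((turn + 1))
done

echo "Long conversation: $accumulated_total input tokens"
echo "Ralph loop:        $fresh_total input tokens"
echo "Ratio:             $((accumulated_total / fresh_total))x"
```

Under these assumptions the long conversation processes about 5.1M input tokens against 250K for the loop, roughly a 20x difference.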
The Two-Phase Approach
Phase 1: Smart Planning (Plan Mode)
Use Plan Mode with your smartest model to break down work into atomic tasks. The goal is to create tasks so mechanical that a cheaper model can execute them without judgment.
Example prompt to start planning:
/mode plan
I need to migrate all files using OldHook to use NewHook instead.
Please:
1. Analyze the codebase to find all affected files
2. Define the EXACT transformation pattern (show before/after code)
3. Create a grep command that returns 0 when migration is complete
4. Write a prompt template for migrating one file at a time
5. Create a bash script that loops until done, spawning fresh Cursor CLI agents
What you get from Plan Mode:
- A discovery command to find all files
- Exact code patterns (before → after)
- A verification command (e.g., grep -r "OLD_PATTERN" src/ | wc -l)
- A prompt template for each iteration
- A bash script that orchestrates the loop
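As a sketch of what the discovery and verification commands might look like, the helpers below wrap grep. The function names discover_files and remaining_count are hypothetical, and the src/ default and the "import OldHook from" pattern are borrowed from the migration example; substitute whatever your plan produces.

```shell
# Pattern that marks a file as not yet migrated (assumed; adjust to your codebase).
OLD_PATTERN='import OldHook from'

# Discovery: list every file under $1 (default: src/) that still contains the old pattern.
discover_files() {
  grep -rl "$OLD_PATTERN" "${1:-src/}" 2>/dev/null
}

# Verification: print how many files still contain the old pattern.
# Prints 0 once the migration is complete.
remaining_count() {
  discover_files "${1:-src/}" | wc -l | tr -d ' '
}
```

Counting files with -l, not matching lines, keeps the number aligned with the "one file per iteration" unit of work.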
Phase 2: Cheap Execution (CLI Loop)
The key insight: you ask Cursor to write a script that calls the Cursor CLI in a loop.
Example prompt after planning:
Now create a bash script called ralph_migrate.sh that:
1. Checks how many files still need migration using grep
2. If zero remain AND type checking passes, exit successfully
3. Otherwise, spawn a fresh agent using the Cursor CLI with the prompt we created
4. Run type checking after each iteration
5. Loop until done (with a max iteration safety limit)
6. Use --model grok for cost savings
Cursor will generate something like this:
#!/bin/bash
# Load the API key. `set -a` exports every variable defined in .env
# so the agent subprocess can see it.
set -a
source .env
set +a
export PATH="$HOME/.local/bin:$PATH"

MAX_ITERATIONS=50
ITERATION=0

# The prompt for each fresh agent
PROMPT='Migrate ONE file from OldHook to NewHook.
Find a file: grep -r "import OldHook from" src/ -l | head -1
Transformation:
- OLD: import OldHook from "old-path"
- NEW: import { newHook } from "new-path"
Instructions:
1. Find ONE file still using OldHook
2. Read it to understand context
3. Apply the transformation
4. Stop after ONE file'

while [ "$ITERATION" -lt "$MAX_ITERATIONS" ]; do
  # Count files (-l), not matching lines, so the progress message is accurate
  REMAINING=$(grep -rl "import OldHook from" src/ 2>/dev/null | wc -l)
  if [ "$REMAINING" -eq 0 ]; then
    echo "Zero imports remaining, checking types..."
    if npx tsc --noEmit; then
      echo "SUCCESS: Migration complete!"
      exit 0
    fi
  fi
  echo "=== Iteration $ITERATION: $REMAINING files remaining ==="
  # Spawn fresh agent with minimal context
  agent -p --model grok "$PROMPT"
  # Type-check after every iteration so regressions surface early
  npx tsc --noEmit || echo "WARNING: type errors after iteration $ITERATION"
  ITERATION=$((ITERATION + 1))
done

echo "Reached max iterations"
exit 1
Then just run it:
chmod +x ralph_migrate.sh
./ralph_migrate.sh
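Before spending tokens, it can help to rehearse the loop's control flow with a stub in place of the real agent. The sketch below is not part of the generated script: ralph_loop, stub_count, and stub_agent are hypothetical names, and the "agent" here just deletes one line from a to-do file per iteration.

```shell
# Generic Ralph loop: keep calling the agent until the external check reports 0.
# $1 = max iterations, $2 = function that prints the remaining count, $3 = agent function.
ralph_loop() {
  max=$1; count_remaining=$2; run_agent=$3
  i=0
  while [ "$i" -lt "$max" ]; do
    if [ "$("$count_remaining")" -eq 0 ]; then
      echo "done after $i iterations"
      return 0
    fi
    "$run_agent"
    i=$((i + 1))
  done
  echo "reached max iterations"
  return 1
}

# Stub standing in for the real agent: "migrates" one file by removing it from a to-do list.
TODO_FILE=$(mktemp)
printf 'a.ts\nb.ts\nc.ts\n' > "$TODO_FILE"
stub_count() { wc -l < "$TODO_FILE" | tr -d ' '; }
stub_agent() { sed '1d' "$TODO_FILE" > "$TODO_FILE.tmp" && mv "$TODO_FILE.tmp" "$TODO_FILE"; }

ralph_loop 10 stub_count stub_agent   # prints: done after 3 iterations
```

Once the loop behaves as expected with the stub, swap in the real count and agent commands.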
The Key Insight: External Verification
This is what makes Ralph work. The loop doesn’t stop when the LLM says “done” - it stops when external tools confirm success.
# BAD: Trust the LLM
if agent_says_done; then exit; fi

# GOOD: Trust external verification
REMAINING=$(grep -r "OLD_PATTERN" src/ | wc -l)
if [ "$REMAINING" -eq 0 ] && npx tsc --noEmit; then
  exit 0  # Only grep + compiler decide when done
fi
Good verification commands:
- grep -r "pattern" src/ | wc -l → Zero matches remaining
- npx tsc --noEmit → TypeScript compiles
- npm test → Tests pass
- npm run build → Build succeeds
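Several of these checks can be chained so that the loop only exits when all of them pass. The helper below is a minimal sketch; verify_all is a hypothetical name, and each argument is assumed to be a single command line that signals success with exit status 0.

```shell
# Run every verification command in order; stop at the first one that fails.
# Each argument is one command line, executed with `sh -c`.
verify_all() {
  for check in "$@"; do
    if ! sh -c "$check" >/dev/null 2>&1; then
      echo "FAILED: $check"
      return 1
    fi
  done
  echo "All checks passed"
}

# Example (assumes a TypeScript project that defines these npm scripts):
# verify_all 'test -z "$(grep -rl "OLD_PATTERN" src/)"' 'npx tsc --noEmit' 'npm test' 'npm run build'
```

Ordering the checks from cheapest to most expensive means a failing grep short-circuits the slower build and test runs.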
Plan Mode: Creating Atomic Tasks
The planning phase should produce tasks that are mechanical and verifiable:
Bad task (requires judgment):
“Refactor the authentication system”
Good task (mechanical):
“Replace import OldAuth from 'old-auth' with import { newAuth } from 'new-auth'”
Verification:
grep -r "import OldAuth" src/ | wc -l prints 0
Prompt for Plan Mode:
Break this work into atomic tasks where each task:
- Has an exact before/after code pattern
- Has a grep command to verify completion
- Requires zero judgment to execute
- Can be done one file at a time
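Once a task has an exact before/after pattern, the per-iteration prompt can be generated instead of written by hand. The following is one possible sketch; build_prompt is a hypothetical helper, and the prompt wording is an assumption modeled on the migration example.

```shell
# Build the per-iteration prompt from the plan's patterns.
# $1 = grep pattern that finds unmigrated files, $2 = old code, $3 = new code.
build_prompt() {
  cat <<EOF
Migrate ONE file, then stop.
Find a file: grep -rl "$1" src/ | head -1
Transformation:
- OLD: $2
- NEW: $3
Do not touch any other file.
EOF
}

PROMPT=$(build_prompt "import OldAuth" \
  "import OldAuth from 'old-auth'" \
  "import { newAuth } from 'new-auth'")
echo "$PROMPT"
```

Keeping the grep pattern separate from the old code avoids quoting problems when the old code itself contains quotes.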
Setup (All Via Cursor)
Step 1: Install CLI
Ask Cursor:
Run the Cursor CLI installer: curl -fsSL https://www.cursor.com/install.sh | sh
Step 2: Authenticate
Ask Cursor:
Run: agent login
Or add CURSOR_API_KEY to your .env file.
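A small preflight check can catch a missing CLI or an unexported key before the loop starts. This is a sketch under assumptions: load_env and check_prereqs are hypothetical helper names, and the check only confirms that the command exists on your PATH, not that authentication actually succeeds.

```shell
# Load a .env-style file ($1, default ./.env) and export everything it defines.
# Without `set -a`, plain KEY=value lines would not reach child processes.
load_env() {
  env_file=${1:-./.env}
  if [ -f "$env_file" ]; then
    set -a
    . "$env_file"
    set +a
  fi
}

# Confirm the CLI ($1, default: agent) is installed before looping.
check_prereqs() {
  cli=${1:-agent}
  if ! command -v "$cli" >/dev/null 2>&1; then
    echo "Missing CLI: $cli (is ~/.local/bin on your PATH?)"
    return 1
  fi
  if [ -z "${CURSOR_API_KEY:-}" ]; then
    # Not fatal: you may have authenticated interactively instead.
    echo "Note: CURSOR_API_KEY is not set; assuming you ran '$cli login'"
  fi
  return 0
}
```

Calling load_env and then check_prereqs at the top of the loop script fails fast instead of burning iterations on an agent that cannot start.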
Step 3: Create Your Plan
Switch to Plan Mode and describe your task. Let Cursor analyze the codebase and create the transformation patterns, verification commands, and loop script.
Step 4: Run the Loop
Ask Cursor to create the ralph script from your plan, then run it in the background.
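Running the loop in the background could look like the sketch below, which uses nohup so the script survives a closed terminal. The helper name run_in_background and the log file name are assumptions.

```shell
# Start a long-running script in the background, detached from the terminal.
# $1 = script to run, $2 = log file. Prints the background process ID.
run_in_background() {
  nohup "$1" > "$2" 2>&1 &
  echo $!
}

# Example (file names are assumptions):
# pid=$(run_in_background ./ralph_migrate.sh ralph.log)
# tail -f ralph.log      # watch progress
# kill "$pid"            # stop the loop early
```

Sending all output to a log file gives you a record of each iteration's remaining count to review afterwards.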
When to Use Ralph
Good for:
- Large refactors with clear patterns (50+ files)
- Library/framework migrations
- Adding types across codebase
- Renaming symbols project-wide
- Any task with objective pass/fail criteria
Not for:
- Creative/design decisions
- Architectural choices
- Security-sensitive code
- Tasks without clear “done” criteria
Common Mistakes
Letting LLM decide when done → Use external verification (grep, tsc, tests)
One long conversation → Use fresh context per iteration (Ralph loop)
Vague task descriptions → Plan phase creates explicit before/after patterns
Writing scripts manually → Ask Cursor to write scripts that use the Cursor CLI
Forgetting type checks → Run tsc after each iteration
The Philosophy
"Ralph keeps what matters (code changes in git) and forgets what doesn’t (conversation context, failed attempts).
Each iteration starts fresh but builds on persistent progress."
The context window becomes a working memory, not a permanent record. Code is the memory.
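If code is the memory, committing after each iteration makes that memory durable and easy to inspect or roll back. The helper below is a sketch, not part of the generated script: checkpoint is a hypothetical name, and it assumes the loop runs inside a git repository with a configured user.

```shell
# Commit whatever the last iteration changed, so progress survives in git.
# $1 = iteration number. Does nothing when the working tree is clean.
checkpoint() {
  if [ -n "$(git status --porcelain)" ]; then
    git add -A
    git commit -q -m "ralph: iteration $1"
  fi
}

# Example: call it right after the agent finishes inside the loop.
# checkpoint "$ITERATION"
```

One commit per iteration also means git log --oneline doubles as a progress report, and a bad iteration can be reverted on its own.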
Questions or success stories? Share in the comments!