Ralph Cursor Guide

The Ralph Wiggum Technique: Running Autonomous Cursor Agents for Hours

“It’s better to fail predictably than succeed unpredictably.” — Geoffrey Huntley

What is Ralph?

Ralph (named after Ralph Wiggum from The Simpsons - persistently wrong but never giving up) is an autonomous AI coding pattern that solves two fundamental problems:

  1. LLMs grade their own homework poorly - They say “done” when they’re not

  2. Context windows fill up and degrade - Long conversations get expensive and confused

The Core Innovation: Fresh Context, Persistent Code

Traditional AI coding accumulates context: every turn resends the conversation so far. By turn 50, you may be paying for 200K+ tokens of history on each request, and the model is increasingly distracted by stale context.

Ralph keeps the context fresh while the code persists:

  • Iteration 1: 5K tokens → code saved to disk

  • Iteration 2: 5K tokens (fresh!) → sees previous code via files

  • Iteration 50: 5K tokens (still fresh!) → all progress in codebase

Even with the same expensive model, Ralph is far cheaper: 50 fresh iterations at ~5K tokens each come to about 250K input tokens in total, roughly what a single late turn of one long conversation costs.
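To make the comparison concrete, here is a rough back-of-the-envelope calculation in bash. The 4,000 tokens of history added per turn is an assumption, chosen so that the history reaches 200K tokens by turn 50; the function names are made up for this sketch. It ignores prompt caching and the tokens a fresh agent spends re-reading files, so treat the result as an order of magnitude, not a bill.

```bash
#!/bin/bash
# Illustrative token arithmetic; all numbers are assumptions, not measurements.

# Total input tokens when every turn resends the full history
# and the history grows by a fixed amount each turn.
accumulated_tokens() {
  local turns="$1" per_turn="$2" total=0 turn
  for ((turn = 1; turn <= turns; turn++)); do
    total=$((total + turn * per_turn))
  done
  echo "$total"
}

# Total input tokens when every iteration starts from a fresh, fixed-size context.
fresh_tokens() {
  local iterations="$1" per_iteration="$2"
  echo $((iterations * per_iteration))
}

long_chat=$(accumulated_tokens 50 4000)   # history reaches 200K by turn 50
ralph=$(fresh_tokens 50 5000)             # ~5K tokens per iteration
echo "one long conversation: $long_chat tokens"
echo "ralph loop:            $ralph tokens"
echo "ratio:                 $((long_chat / ralph))x"
```

Under these assumptions the long conversation processes 5,100,000 input tokens against 250,000 for the loop, about 20x more.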


Why This Saves Money

The savings come from:

  • No context accumulation - Each iteration is ~5K tokens, not 200K

  • No “remembering” failures - Dead ends are forgotten, not paid for

  • Code is the memory - Progress persists in git, not in tokens

You can also use cheaper models for the execution phase for additional savings, but fresh context is the main win.


The Two-Phase Approach

Phase 1: Smart Planning (Plan Mode)

Use Plan Mode with your smartest model to break down work into atomic tasks. The goal is to create tasks so mechanical that a cheaper model can execute them without judgment.

Example prompt to start planning:


/mode plan

I need to migrate all files using OldHook to use NewHook instead.

Please:

1. Analyze the codebase to find all affected files

2. Define the EXACT transformation pattern (show before/after code)

3. Create a grep command whose match count is 0 when migration is complete

4. Write a prompt template for migrating one file at a time

5. Create a bash script that loops until done, spawning fresh Cursor CLI agents

What you get from Plan Mode:

  • A discovery command to find all files

  • Exact code patterns (before → after)

  • A verification command (e.g., grep -r "OLD_PATTERN" src/ | wc -l)

  • A prompt template for each iteration

  • A bash script that orchestrates the loop
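As a rough idea of what the discovery and verification commands might look like once wrapped for reuse, here is a minimal sketch. The helper names list_remaining and count_remaining are hypothetical, not something Plan Mode is guaranteed to produce. Note that grep -l lists each matching file once, so the count is a file count, not a line count.

```bash
#!/bin/bash
# Hypothetical helpers: discovery and verification built on the same grep.

# Discovery: print every file under DIR that still contains PATTERN.
list_remaining() {
  local pattern="$1" dir="$2"
  grep -rl -- "$pattern" "$dir" 2>/dev/null
}

# Verification: print how many files still contain PATTERN (0 means done).
count_remaining() {
  list_remaining "$1" "$2" | wc -l | tr -d '[:space:]'
}

# Example usage (pattern and path taken from the migration example):
#   list_remaining "import OldHook from" src/ | head -1
#   count_remaining "import OldHook from" src/
```

The tr call strips the padding some wc implementations add, so the result can be compared as a plain number.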

Phase 2: Cheap Execution (CLI Loop)

The key insight: you ask Cursor to write a script that calls the Cursor CLI in a loop.

Example prompt after planning:


Now create a bash script called ralph_migrate.sh that:

1. Checks how many files still need migration using grep

2. If zero remain AND type checking passes, exit successfully

3. Otherwise, spawn a fresh agent using the Cursor CLI with the prompt we created

4. Run type checking after each iteration

5. Loop until done (with a max iteration safety limit)

6. Use --model grok for cost savings

Cursor will generate something like this:


#!/bin/bash

# Load API key; set -a exports every variable defined in .env
set -a
source .env
set +a

export PATH="$HOME/.local/bin:$PATH"

MAX_ITERATIONS=50
ITERATION=0

# The prompt for each fresh agent
PROMPT='Migrate ONE file from OldHook to NewHook.

Find a file: grep -rl "import OldHook from" src/ | head -1

Transformation:
- OLD: import OldHook from "old-path"
- NEW: import { newHook } from "new-path"

Instructions:
1. Find ONE file still using OldHook
2. Read it to understand context
3. Apply the transformation
4. Stop after ONE file'

while [ "$ITERATION" -lt "$MAX_ITERATIONS" ]; do
  # Count files (not matching lines) that still import OldHook
  REMAINING=$(grep -rl "import OldHook from" src/ 2>/dev/null | wc -l)

  if [ "$REMAINING" -eq 0 ]; then
    echo "Zero imports remaining, checking types..."
    if npx tsc --noEmit; then
      echo "SUCCESS: Migration complete!"
      exit 0
    fi
  fi

  echo "=== Iteration $ITERATION: $REMAINING files remaining ==="

  # Spawn fresh agent with minimal context
  agent -p --model grok "$PROMPT"

  # Type-check after each iteration; report failures but keep looping
  npx tsc --noEmit || echo "Type errors remain after iteration $ITERATION"

  ITERATION=$((ITERATION + 1))
done

echo "Reached max iterations"
exit 1

Then just run it:


chmod +x ralph_migrate.sh

./ralph_migrate.sh
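The shape of the loop does not depend on the agent or the migration. A minimal generic sketch, with the verify and work steps passed in as command strings, might look like the following. The function name ralph_loop and its argument order are invented for illustration; in a real run the work command would be the Cursor CLI call from the script above.

```bash
#!/bin/bash
# Generic Ralph loop sketch: ralph_loop MAX_ITERATIONS VERIFY_CMD WORK_CMD
# VERIFY_CMD must exit 0 only when the job is objectively done.
# WORK_CMD performs one small unit of work (e.g. one fresh agent run).
ralph_loop() {
  local max="$1" verify="$2" work="$3" i=0
  while [ "$i" -lt "$max" ]; do
    # External verification decides when to stop, never the worker itself.
    if bash -c "$verify"; then
      echo "done after $i iterations"
      return 0
    fi
    # A failed work step is not fatal; the next iteration starts fresh.
    bash -c "$work" || echo "work step failed; continuing"
    i=$((i + 1))
  done
  # One last check so the final iteration's work is not ignored.
  if bash -c "$verify"; then
    echo "done after $i iterations"
    return 0
  fi
  echo "reached max iterations ($max)"
  return 1
}
```

Returning a nonzero status when the limit is hit lets a wrapper script or CI job tell "finished" apart from "gave up".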


The Key Insight: External Verification

This is what makes Ralph work. The loop doesn’t stop when the LLM says “done” - it stops when external tools confirm success.


# BAD: Trust the LLM (agent_says_done is a placeholder for the agent's own claim)
if agent_says_done; then exit 0; fi

# GOOD: Trust external verification
REMAINING=$(grep -r "OLD_PATTERN" src/ | wc -l)
if [ "$REMAINING" -eq 0 ] && npx tsc --noEmit; then
  exit 0  # Only grep + compiler decide when done
fi

Good verification commands:

  • grep -r "pattern" src/ | wc -l → Zero matches remaining

  • npx tsc --noEmit → TypeScript compiles

  • npm test → Tests pass

  • npm run build → Build succeeds
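Several of these checks can be chained so that the loop only stops when all of them pass. One possible sketch follows; the helper name run_checks is hypothetical, and the commands in the usage comment should be swapped for whatever the project actually uses.

```bash
#!/bin/bash
# Run each check command in order; succeed only if every one succeeds.
run_checks() {
  local check
  for check in "$@"; do
    echo "check: $check"
    if ! bash -c "$check" >/dev/null 2>&1; then
      echo "FAILED: $check"
      return 1
    fi
  done
  echo "all checks passed"
  return 0
}

# Example usage with the commands listed above:
#   run_checks \
#     'test "$(grep -r "OLD_PATTERN" src/ | wc -l)" -eq 0' \
#     'npx tsc --noEmit' \
#     'npm test'
```

Ordering the checks from cheapest to slowest keeps early iterations fast, since the first failure short-circuits the rest.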


Plan Mode: Creating Atomic Tasks

The planning phase should produce tasks that are mechanical and verifiable:

Bad task (requires judgment):

“Refactor the authentication system”

Good task (mechanical):

“Replace import OldAuth from 'old-auth' with import { newAuth } from 'new-auth'”

Verification: grep -r "import OldAuth" src/ | wc -l prints 0
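The same verification can also be expressed through grep's exit status instead of a count, which is easier to use in an if statement. A minimal sketch, assuming a hypothetical helper named task_complete: grep exits 0 when it finds a match, 1 when it finds none, and 2 on errors such as a missing directory, so the helper keeps "no matches" and "grep could not run" apart.

```bash
#!/bin/bash
# Returns 0 when PATTERN no longer appears under DIR,
# 1 when matches remain, and 2 when grep itself failed.
task_complete() {
  local pattern="$1" dir="$2" status=0
  grep -rq -- "$pattern" "$dir" 2>/dev/null || status=$?
  case "$status" in
    0) return 1 ;;  # matches remain: task not complete
    1) return 0 ;;  # no matches: task complete
    *) return 2 ;;  # grep error (e.g. directory does not exist)
  esac
}

# Example usage:
#   if task_complete "import OldAuth" src/; then echo "task done"; fi
```

Treating a grep error as "not done" avoids a false success when the path is mistyped.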

Prompt for Plan Mode:


Break this work into atomic tasks where each task:

- Has an exact before/after code pattern

- Has a grep command to verify completion

- Requires zero judgment to execute

- Can be done one file at a time


Setup (All Via Cursor)

Step 1: Install CLI

Ask Cursor:


Run the Cursor CLI installer: curl -fsSL https://www.cursor.com/install.sh | sh

Step 2: Authenticate

Ask Cursor:


Run: agent login

Or add CURSOR_API_KEY to your .env file.
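If you go the .env route, the loop script has to export the key so that child processes such as the CLI can see it; a plain source only sets shell variables. A minimal sketch, assuming a hypothetical helper named load_env and a .env file made of simple KEY=value lines:

```bash
#!/bin/bash
# Load a .env file and export its variables to child processes.
# Fails early if the file is missing or CURSOR_API_KEY is empty.
load_env() {
  local env_file="${1:-.env}"
  [ -f "$env_file" ] || { echo "missing $env_file" >&2; return 1; }
  set -a              # export every variable assigned from here on
  . "$env_file"
  set +a              # stop auto-exporting
  [ -n "${CURSOR_API_KEY:-}" ] || { echo "CURSOR_API_KEY is not set" >&2; return 1; }
}

# Example usage at the top of a loop script:
#   load_env .env || exit 1
```

Failing before the first iteration is cheaper than discovering an authentication problem fifty agent calls later.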

Step 3: Create Your Plan

Switch to Plan Mode and describe your task. Let Cursor analyze the codebase and create the transformation patterns, verification commands, and loop script.

Step 4: Run the Loop

Ask Cursor to create the ralph script from your plan, then run it in the background.
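One common way to run a long loop in the background is nohup with the output sent to a log file. The sketch below wraps that in a hypothetical helper named run_in_background; the log file name is an arbitrary choice.

```bash
#!/bin/bash
# Start SCRIPT detached from the terminal, log everything to LOG,
# and print the process ID so the run can be stopped later.
run_in_background() {
  local script="$1" log="$2"
  nohup "$script" >"$log" 2>&1 &
  echo $!
}

# Example usage:
#   pid=$(run_in_background ./ralph_migrate.sh ralph.log)
#   tail -f ralph.log     # watch progress
#   kill "$pid"           # stop the loop early
```

Because each iteration's progress lives in the working tree, stopping and restarting the loop loses nothing beyond the iteration in flight.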


When to Use Ralph

Good for:

  • Large refactors with clear patterns (50+ files)

  • Library/framework migrations

  • Adding types across codebase

  • Renaming symbols project-wide

  • Any task with objective pass/fail criteria

Not for:

  • Creative/design decisions

  • Architectural choices

  • Security-sensitive code

  • Tasks without clear “done” criteria


Common Mistakes

Letting LLM decide when done → Use external verification (grep, tsc, tests)

One long conversation → Use fresh context per iteration (Ralph loop)

Vague task descriptions → Plan phase creates explicit before/after patterns

Writing scripts manually → Ask Cursor to write scripts that use the Cursor CLI

Forgetting type checks → Run tsc after each iteration


The Philosophy

“Ralph keeps what matters (code changes in git) and forgets what doesn’t (conversation context, failed attempts). Each iteration starts fresh but builds on persistent progress.”

The context window becomes a working memory, not a permanent record. Code is the memory.


Questions or success stories? Share in the comments!
