TLDR: create a checkpoint for every file change during a response, to allow finer-tuned and better-timed guidance during vibe coding.
I use Cursor in YOLO mode (all chat actions are auto-approved), and I follow and guide the agent by reading not only its output but also its thoughts.
This is really powerful because, even though I haven’t programmed in a while, my software engineering background and long-time technical product management experience let me give the agent critical guidance. Examples:
- I can see when it is chasing a red-herring symptom and reframe the approach toward the root cause.
- I can spot opportunities to apply the right amount of DRY (sometimes more, sometimes less).
- I can see when it gets confused about the project structure and re-creates existing files for not having found them where it expected.
The problem is that the CoT + output speed is considerably faster than my reading speed, so by the time I catch an opportunity to give feedback and correct course, the model has usually gone through 2-3 more cycles of modifications. Even if I stop it immediately, there is no precise way for me to tell the model “here’s where things went wrong; ignore everything you did after that point in the chat”.
So what I end up doing is rolling back to the previous checkpoint and telling the agent precisely what to avoid, hoping it will steer clear of the bad path.
In other words, because human processing speed is much lower than agent output speed, today’s AI replies form a context made of discrete pieces, when vibe coding actually needs a more “continuous” context that allows bad paths to be pruned precisely.
I don’t want to prescribe a particular implementation in a feature request :-), but if I did :-), it would look like this:
- Create a checkpoint after each file change (not only when an agent reply ends or is stopped).
- Give the user the ability to click anywhere in the response (or ideally even in the CoT) to start their reply exactly at that point.
The context for the next user message would then contain the last checkpoint before that point, plus a compressed version of the AI’s response exactly up to that point.
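To make the idea concrete, here is a minimal sketch of what that bookkeeping could look like. All names (`Checkpoint`, `AgentTurn`, `rewind_to`, etc.) are hypothetical and purely illustrative; real checkpointing would snapshot the workspace and compress the truncated reply, which is omitted here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical data model -- not Cursor's actual implementation.
@dataclass
class Checkpoint:
    files_snapshot: Dict[str, str]  # file path -> content at this point
    response_offset: int            # character offset into the agent's reply so far

@dataclass
class AgentTurn:
    response_text: str
    checkpoints: List[Checkpoint] = field(default_factory=list)

def record_checkpoint(turn: AgentTurn, files: Dict[str, str], offset: int) -> None:
    """Called after each file change, not only when the reply ends."""
    turn.checkpoints.append(Checkpoint(dict(files), offset))

def rewind_to(turn: AgentTurn, click_offset: int) -> Tuple[Optional[Checkpoint], str]:
    """The user clicked at click_offset in the response: restore the last
    checkpoint at or before that point, and keep only the reply up to it
    as context for the next user message."""
    candidates = [c for c in turn.checkpoints if c.response_offset <= click_offset]
    checkpoint = candidates[-1] if candidates else None
    truncated_context = turn.response_text[:click_offset]
    return checkpoint, truncated_context
```

For example, if checkpoints were recorded at offsets 6 and 15 and the user clicks at offset 10, `rewind_to` restores the offset-6 snapshot and drops everything the agent produced after character 10.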