Story
Had a stubborn macOS tray menu bug in my pet project.
Tried the usual Cursor workflow: describe the bug in chat, attach screenshots, let the LLM patch the code. I tested Claude-4 Sonnet, Gemini 2.5 Pro, and GPT-4.1. Each one touched many files and said “fixed”, but the bug lived on (or the model broke something else).
New approach that finally worked:
- Tell the AI: “You can NOT change code, only add temporary debug logs.”
- Ask it to give me a ready-to-copy grep command for the app logs after every change.
- Run the app, reproduce the bug, copy the filtered logs back to the chat.
- Repeat.
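A minimal sketch of what one turn of that loop looks like. Everything here is made up for illustration: the app name, the log file, and the `TRAYDBG` tag (the point of a unique tag is that a single grep catches every temporary log line and nothing else):

```shell
# Simulate a few app log lines (in the real loop these come from running
# the app and reproducing the bug):
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
12:00:01 MenuController TRAYDBG rebuild menu, items=5
12:00:01 CacheLayer     TRAYDBG cache hit, key=trayState (stale!)
12:00:02 NetworkClient  fetch /status 200
EOF

# The ready-to-copy filter the AI hands back after each logging change;
# only the tagged debug lines survive, so the paste back into chat stays small:
grep TRAYDBG "$LOG"
```

The unique tag is the whole trick: unrelated log noise never reaches the chat, so the model reasons only over the runtime data you chose to show it.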
Four or five iterations later we traced the issue to incorrect cache invalidation in the data layer, not the UI. Once I knew that, the actual fix was tiny.
Takeaways
• Treat the LLM as a smart lab assistant, not an auto-coder.
• Real runtime data > code guesses.
• Short, structured feedback loops keep you in charge and stop wild refactors.
Hope this pattern is useful to someone here. I’m definitely adding it to my toolbox!
—
Happy debugging!