Notes from Day 1 with Cursor: 1568 lines of code, 71 chats, and $120+ in OpenAI credits

Things I did:

  • Configured a .devcontainer, Dockerfile, settings.json, launch.json, and GitHub Actions for a personal project.
  • Pasted in the directory tree to explore high-level organization options.
  • Converted 20 pages of handwritten research notes into a summarized README.md.
  • Asked for libraries that might be helpful and compared their tradeoffs.
  • Created TikZ/PGF illustrations, used Cmd+K to add details, then Cmd+K again to simplify.
  • Fed in 80 paragraphs of notes, asked for suggested tests that would take them into consideration, and wrote the tests out. (The model doesn't seem very good at pytest fixtures; see the sketch after this list.)
  • Some test failures could not be auto-fixed, so I pasted in the stack trace and drafted a request for help to a collaborator.
  • Lift-and-shift is much easier, so I started a migration from torch to pytorch-lightning (sketch after this list).
  • Replaced sections of manually written code with calls to library functions (example after this list).
  • Asked for quality issues to address, then worked through the checklist it gave me.
  • Deduplicated code across scripts, organizing shared logic into modules.
  • Went outside my codebase and used terminal command suggestions to categorize and clean up files in my downloads folder, freeing a third of my used disk space (script sketch after this list).
  • Queried the chat to check my current progress against what still remained in the plan.
  • Checked for inconsistencies when making changes across multiple files.
  • Made changes in FileA, then Cmd+K in FileB: “Change to be compatible with @FileA”.
  • Asked whether FileC would need modifications, and if so, what they would involve.
  • Pasted in sample external code and used Cmd+K to integrate it with my project.
  • At work, took a stale PR, set up a new branch, addressed the old comments with Cmd+K, and checked it in.
  • Asked a principal engineer for a quick call, shared my screen, recorded the conversation for 4.5 minutes, pasted the transcript into chat, had it list the changes discussed, and implemented them.
  • Wanted to learn memory management, so I had it generate puzzles for me to solve, and told it to calibrate the difficulty of the next round based on the scores my answers received.
  • A lot of prompt engineering was needed to convince the model to ask decent questions; I found napkin math helps more than adding quotes from high-quality textbooks/blog posts.
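
On the pytest fixtures point above, here is a minimal sketch of the pattern the model kept fumbling, assuming an invented notes file; the names and contents are just for illustration:

```python
import json

import pytest


@pytest.fixture
def notes_file(tmp_path):
    """Write a small notes file once per test and hand back its path."""
    path = tmp_path / "notes.json"
    path.write_text(json.dumps({"paragraphs": ["intro", "methods"]}))
    return path


def test_notes_have_paragraphs(notes_file):
    data = json.loads(notes_file.read_text())
    assert len(data["paragraphs"]) == 2
```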
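
For the torch-to-pytorch-lightning migration, the lift-and-shift looks roughly like this; the model and shapes are invented, but the point is that the existing nn.Module gets wrapped in a LightningModule and the hand-written training loop disappears:

```python
import torch
from torch import nn
import pytorch_lightning as pl


class LitRegressor(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(16, 1)  # the original torch model, unchanged

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.net(x), y)
        self.log("train_loss", loss)
        return loss  # Lightning handles backward, optimizer step, zero_grad

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# Hypothetical usage, given some existing DataLoader `train_loader`:
# pl.Trainer(max_epochs=1).fit(LitRegressor(), train_dataloaders=train_loader)
```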
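
As for replacing manually written code with library calls, this is the flavor of it (an illustrative stand-in, not my actual code): verify the hand-rolled helper against the library function, then delete it.

```python
import torch


def softmax_manual(x: torch.Tensor) -> torch.Tensor:
    """Hand-written softmax, the kind of helper that gets replaced."""
    e = torch.exp(x - x.max(dim=-1, keepdim=True).values)
    return e / e.sum(dim=-1, keepdim=True)


# Check equivalence before swapping in the library call.
x = torch.randn(4, 8)
assert torch.allclose(softmax_manual(x), torch.softmax(x, dim=-1))
```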
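
The downloads cleanup amounted to something like this sketch (the path and the top-10 cutoff are my assumptions): bucket files by extension so the biggest categories are easy to review before deleting.

```python
from collections import defaultdict
from pathlib import Path

downloads = Path.home() / "Downloads"
sizes = defaultdict(int)  # extension -> total bytes

for f in downloads.rglob("*"):
    if f.is_file():
        sizes[f.suffix.lower() or "(none)"] += f.stat().st_size

# Print the ten heaviest categories, largest first.
for ext, total in sorted(sizes.items(), key=lambda kv: -kv[1])[:10]:
    print(f"{ext:10s} {total / 2**20:10.1f} MiB")
```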

Feedback:

  • Being given all the steps at once is overwhelming. Consider an option for an inner monologue that shows only the next step, while letting me expand the entire plan at any time.
  • Consider borrowing elements from software designed for people with mental disabilities, e.g. https://www.mycoughdrop.com/. This reduces cognitive burden but keeps a human in the loop instead of delegating the task to an agent.
  • One example: when the chat interface creates a bullet point, proactively generate the code, add an icon, and place it as an action on a board.
  • Software in the real world often grows organically instead of following a strict design. Through the lens of modern source control, dependency analysis, trace debugging, etc., this is a form of genetic engineering: you splice in new traits, generate mutants, and select against a fitness function. Consider a “Smart Paste” feature for merging, or a “Disentangle” feature to do the opposite; this could be implemented with a concept slider to select the relative strength of each parent.
  • I would like to see more emphasis on understanding and interpretability. Anysphere looks set to become commercially successful, which shortens AGI timelines a bit compared to the counterfactual, but not by much given Copilot X… I want to understand Sualeh and Aman’s views on integrating alignment research into their products and mitigating x-risks.