Things I did:
- Configured a .devcontainer, Dockerfile, settings.json, launch.json, and GitHub Actions for a personal project.
- Pasted in tree output of the directory structure to explore high-level organization options.
- Converted 20 pages of handwritten research notes into a summarized README.md.
- Asked for libraries which might be helpful, and compared tradeoffs.
- Created TikZ/PGF illustrations, Cmd+K to add details, then Cmd+K again to simplify.
- Fed in 80 paragraphs of notes, asked for suggested tests that take them into account, then wrote the tests out. (Model not very good at pytest fixtures?)
- Some test failures could not be auto-fixed, so pasted in the stack trace and drafted a request for help to a collaborator.
- Lift and shift is much easier, so started a migration from torch to pytorch-lightning.
- Replaced sections of manually written code with calls to library functions.
- Asked for quality issues to address, then worked through the checklist it gave me.
- Deduplicated redundant code in scripts, organizing logic into modules.
- Went outside of my codebase and used terminal command suggestions to categorize + clean up files in my downloads folder, freeing a third of used disk space.
- Queried to check my current progress vs what was still remaining in the plan.
- Checked for inconsistencies when making changes across multiple files.
- Made changes in FileA, then Cmd+K in FileB: “Change to be compatible with @FileA”.
- Asked if FileC would need modifications or not, and if so, what it would involve.
- Pasted in sample external code and Cmd+K to integrate it with my project.
- At work, took a stale PR, set up a new branch, Cmd+K old comments, checked it in.
- Asked a principal engineer for a quick call, shared screen, recorded the convo for 4.5 min, pasted the transcript into chat, had it list the changes discussed, then implemented them.
- Wanted to learn memory management, had it generate puzzles for me to solve, told it to calibrate the difficulty of the next round of problems based on the score my answers got.
- It took a lot of prompt engineering to convince the model to ask decent questions; found napkin math helps more than adding quotes from high quality textbooks/blog posts.
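The difficulty calibration from the memory management puzzles could be sketched roughly as below. This is a minimal sketch, not what the model actually did: the function name `next_difficulty`, the 1 to 10 level scale, and the 0.5 / 0.8 score thresholds are all assumptions made up for illustration.

```python
def next_difficulty(
    current: int,
    score: float,
    low: float = 0.5,    # assumed threshold: below this, make puzzles easier
    high: float = 0.8,   # assumed threshold: at or above this, make them harder
    min_level: int = 1,  # assumed easiest level
    max_level: int = 10, # assumed hardest level
) -> int:
    """Return the difficulty level for the next round of puzzles.

    `score` is the fraction of the last round answered correctly (0.0 to 1.0).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")

    if score >= high:
        step = 1    # did well: step up
    elif score < low:
        step = -1   # struggled: step down
    else:
        step = 0    # in the target band: stay put

    # Clamp so the level never leaves the allowed range.
    return max(min_level, min(max_level, current + step))


if __name__ == "__main__":
    level = 3
    for round_score in (0.9, 0.85, 0.4, 0.6):
        level = next_difficulty(level, round_score)
        print(f"score={round_score:.2f} -> next level {level}")
```

Moving one level at a time keeps each round close to the last one; a larger step, or a step proportional to the score, would adapt faster at the cost of more abrupt jumps.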
Feedback:
- Being given all the steps at once is overwhelming. Consider option for inner monologue, only showing next step, but can expand entire plan at any time.
- Consider borrowing elements from software designed for people with mental disabilities, e.g., https://www.mycoughdrop.com/. This reduces cognitive burden but keeps a human in the loop instead of delegating the task to an agent.
- One example might be, when the chat interface creates a bullet point, proactively generate code, add an icon, and put it as an action on a board.
- Software in the real world often grows organically instead of following a strict design. My lens on modern source control, dependency analysis, trace debugging, etc. is that they are a form of genetic engineering: you splice in new traits, generate mutants, and select against a fitness function. Consider a “Smart Paste” feature for merging, or a “Disentangle” feature to do the opposite. This could be implemented with a concept slider to select the relative strength of each parent.
- I would like to see more emphasis on understanding and interpretability. Anysphere is set to become commercially successful, which shortens AGI timelines a bit compared to the counterfactual, but not by much given Copilot X… I want to understand Sualeh and Aman’s view on integrating alignment research into their products and mitigating x-risks.