I’m a retired mathematician, retooling to use AI to support my research. New to the party, I’m not finding the advice I need to extend AI’s reach, so I’m running many experiments. Which language is the best choice for parallel code, balancing its ability to shape my thought, the ease with which AI can help (these are related), and runtime efficiency? I’m completing a project that works like a simultaneous chess exhibition, comparing six languages at a time as I evolve a test problem, while also developing project paradigms better suited to AI.
I hoped that this forum would have the most intense futuristic discussions on the web for the “meta” of AI coding. The impression that keeps me away is that all I see here are my younger selves worrying about costs. My best former students are working on startups that in different ways improve focus in such conversations, exactly as we need to learn to improve AI focus in coding sessions. My AI nut is projected to be 6x Cursor Pro, Pro Ultimate for (the-editor-formerly-known-as-Prince), and several direct subscriptions for standalone AI. One could say this is out of hand, but sports fanatics have similar cable bills. My primary concern is achieving my dreams before my father’s dementia arrives. AI has upended my expectations. It’s as if alien ratings were falling for my sluggish loser existence, so the writers had my simulation jump the shark.
Claude 3.5 Sonnet understands my math best; it can read my iPad drawings, consult effectively on strategies for combinatorial enumeration,…so I work in Cursor with Sonnet because I need to involve a brilliant “cook”. I would never ask my chef to clean the bathroom, so I’m over in “Prince” for housekeeping chores. One experiment I repeat: use Sonnet to coauthor a detailed prompt for a utility tool, then try it everywhere to see which AIs can one-shot the code from the prompt. Have that AI review what might have helped, evolve the prompt, try again…
When I can’t sleep I’m listening to Nexus: A Brief History of Information Networks from the Stone Age to AI by Yuval Noah Harari, his most recent book in the controversial “mind bender” genre. Such excursions may be of debatable utility for developing human thought, but unfettered “meta” is how we shamans control the dance of the campfire flames, working with AI. When I taught linear algebra to college students, I was witnessing the pack rearrange itself mid-marathon. Students who believed intelligence was innate used their usual tricks for crushing college, and fell behind. Students who saw intelligence as a malleable training effect spent half their time in meta-contortions inventing new ways to learn, and surged ahead. They became mathematicians.
AI rewards this. It’s like ice sailing: there’s no limiting speed; it’s all on us. As an association engine of inconceivable scope, AI is better at these meta-contortions than at writing code. Then, with better architecture, we both accelerate. And everyone’s experience is personal, as if AI were a form of digital hallucinogen. What works for me may not work for you.
I was pretty excited by the recent buzz for Deepseek r1. One has to be careful interpreting public benchmarks. When nations devise tax codes, or structure financial markets, a primary concern is how players will evolve to game the rules. If I thought for a moment that math were what I see in these math challenges, or that intelligence were doing well on intelligence tests, I’d go paint houses. It’s like art: most art is boring decoration, but great art shatters paradigms. Can AI help to shatter paradigms? So I run my own experiments.
The one language I’ve tried that chokes every AI is Lean 4. It’s the one language I know that is achieving escape velocity: its type system supports formal proof verification, and mathematicians are slowly using it to encode all of mathematics. This confuses people and AI assistants alike, who conclude that Lean 4 is merely a proof assistant. It’s a cleaner, more fluid Haskell for general programming, one that has escaped Earth’s gravity in ways that enable, for example, formal proof. And AI can’t understand it! AI can write Lean 4 code, but it can’t get the code to work.
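To make that dual nature concrete, here is a minimal sketch of my own (a toy, not my test problem): ordinary functional programming and a machine-checked statement sitting side by side in one Lean 4 file.

```lean
-- Ordinary general-purpose code: nothing proof-assistant-specific here.
def fib : Nat → Nat
  | 0     => 0
  | 1     => 1
  | n + 2 => fib n + fib (n + 1)

#eval fib 20   -- 6765, plain program execution

-- And, in the same file, a statement the compiler checks as a proof.
theorem fib_two : fib 2 = 1 := rfl
```

The same type system that runs the program verifies the theorem; that is the sense in which it is more than a proof assistant.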
Deepseek r1 got further than any other AI at writing Lean 4 code; my test problem runs, but crashes at the scales I want to benchmark. I don’t understand Lean 4 well enough yet to help, but that didn’t stop me in any other language, where general fluency sufficed.
On the other hand, Deepseek r1 flunked my course: iteratively writing a support tool from an evolving prompt document. This experiment was inspired by the Nexus discussion of documents. Both Sonnet and “Prince” could one-shot the code in fresh sessions, after a round or two of prompt tuning. Deepseek couldn’t complete the assignment.
And I find following Deepseek’s interior monologue tedious. One can only watch so much Popeye. (Yeah, I know. I resemble that remark.)