I have been developing numerous projects in Python using Cursor AI. However, due to a condition affecting my hands, I seldom use the keyboard and instead rely on Windows Voice Access to transcribe everything.
I started working this way approximately five months ago, and it has been life-changing in terms of how rapidly I have been able to create projects and solve problems. Does anyone else use voice-to-text technology for coding with Cursor AI?
Actually, this message is entirely written by dictation. I'm using Superwhisper and have various prompts depending on the workflow, which will then translate what I'm saying into a series of product manager-type instructions written by a developer.
Interesting. I've thought about exploring voice but I'm so used to typing that, in fact, I often start typing when the thoughts are still forming. The act of typing helps me flesh them out.
Because of this, I haven't really tried voice other than the dictation that comes with my operating system.
When you say you have voice prompts that take your spoken words and "translate" them, this seems a step up from dictation. Would you mind sharing examples of the sorts of prompts you have used?
Windows Voice Access is built into Windows 11. I use it all day, every day. The transcription isn't always accurate, but an AI like Claude 3.5 easily figures out your intent, even with atrocious transcription errors.
Yes - in open-plan offices it's impractical, but for remote solo work speech is very useful. It's the most natural way to communicate, so implement it with minimal friction. Preferably via a shortcut (a button is okay), since that's the most seamless option. I guess people type mainly from habit - voice is the future.
I'm on Debian (using AppImage) and the extension is better than "SpeechNote" for Linux, because:
When I stop recording it pastes the transcribed text wherever my cursor is in Cursor (chat, file, UI field, search bar etc.)
It works out of the box, quickly and accurately, without my having to pick the right AI model for it.
Everything is completely free.
Cursor should have that natively in-app to add context-aware transcription - replacing the wrongly spelled words with relevant terms from the codebase. For example, I have a shopping cart in my app, but I often get the word "card" in my transcription rather than "cart". The same goes for variable and function names with underscores, etc.
Please, we need a Cursor integration that adds file names, function names, and variable names in context to the dictionary used by speech-to-text software.
These tools don't understand my Python modules with my company's name in them, or my function names with _ in them, etc.
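To make the request above concrete, here is a minimal sketch in Python of what codebase-aware correction could look like. It is not how Cursor or any dictation tool actually works; the function names `collect_identifiers` and `correct_transcript`, the similarity cutoff, and the sample source are all made up for illustration. The idea is to join runs of spoken words with underscores and fuzzy-match them against identifiers found in the code, using the standard library's `difflib`.

```python
import difflib
import re

# Hypothetical helper names; this only illustrates the idea.
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def collect_identifiers(source: str) -> list[str]:
    """Collect snake_case names (those containing an underscore) from source text."""
    return sorted({name for name in IDENT_RE.findall(source) if "_" in name})


def correct_transcript(text: str, identifiers: list[str],
                       max_words: int = 3, cutoff: float = 0.85) -> str:
    """Replace runs of 2..max_words spoken words that closely match an identifier."""
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        replaced = False
        # Try the longest run first so "shopping card total" wins over "shopping card".
        for n in range(min(max_words, len(words) - i), 1, -1):
            candidate = "_".join(w.lower() for w in words[i:i + n])
            match = difflib.get_close_matches(candidate, identifiers, n=1, cutoff=cutoff)
            if match:
                out.append(match[0])
                i += n
                replaced = True
                break
        if not replaced:
            out.append(words[i])
            i += 1
    return " ".join(out)


# Assumed sample code standing in for a real project file.
source = "def shopping_cart_total(items):\n    return sum(items)\n"
names = collect_identifiers(source)
print(correct_transcript("rename shopping card total to something shorter", names))
```

This sketch only handles multi-word names; a lone "card" is left untouched, because replacing single common words would need more context (for example, which file is open) to avoid false corrections. It also ignores punctuation and camelCase names.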
Completely local, even with AI rewriting (if needed).
You can set it up as a user daemon: voxd --autostart true
It is then ready to be triggered by a hotkey, whenever and wherever you want to type.
I begrudgingly use Wispr Flow as well. The app is extremely buggy, constantly cuts out and just stops dictating your voice at random times. They also have some of the worst customer service I've ever experienced. I've messaged and emailed them multiple times, and I've never heard back.
The only reason I still use them is I managed to get a 50% discount on the subscription.
Wow, sorry to hear that. So it stops listening even though you are still holding the keyboard key? In my case it works flawlessly, even with extremely long dictations!