Hi there, I see a lot of feature requests for speech to text, but I would also greatly want text to speech so Cursor can read back the generated code and explanations to me, I think that would be really helpful and problem a lot easier to implement than Speech to text and be a good stepping stone towards full voice chat interactions.
It’d be really great if the chat UI had a button for speech-to-text as when working towards complex tasks it’d really improve efficiency and DX. Plus it would be an accessibility feature for limited mobility developers.
I would like to request the addition of a speech-to-text feature while prompting. This would allow users to speak their input instead of typing, which could improve productivity and accessibility.
Yes, this would be useful
Please vote
Gotch ya:
Windows = windowskey + H
macOS = double control button.
The more advanced code models are here, the more the interaction with the cursor is changing. From getting a help when stuck and automating boilerplatte to merely overseeing what agents are doing.
This shift in paradigm would tremendously profit from adding more sophisticated ways in interacting with the cursor.
The first one is obviously speech-to-text, but there is a lot of software for that, so it’s not crucial.
What is crucial IMO, is adding more text-to-speech capabilities.
I’d love also to have possibility to have text-to-speech, where “agent” would speak about what he is doing currently.
Adding it now will also be beneficial for the future, as it will be more easy to implement it in another areas.
Additional points if something like xtts-2 would be used, so I can provide a VEGA (from Doom) voice to read me this.
I’m also often too lazy to read. The IDE/agent talking to me would be awesome.
We could build this as a VS Code extension if they expose APIs for this…
Like in Iron Man, Tony Stark is also talking to Jarvis to code something and Jarvis is responding. It would feel much more like in Iron Man if it had this feature!
when will cursor support text to speech? its fine if you guys add support for local rendering using whisper but desperately need one model, this has been added to the github copilot and this can be a dealbreaker for me but I don’t wanna quit on cursor
Something similar to this but inbuilt in cursor Whisper Assistant Extension (Voice to text) - General - Cursor Community Forum as this installation has many fail points
Hey @nishant, I asked Cursor about speech to text support a while back but they didn’t seem to have it on the roadmap. I’m intrigued to hear if this has changed now.
In the meantime, I’ve used the Whisper Assistant Extension daily for months on a Mac and it works great for me. I don’t have a lot time to support the extension in terms of testing on other platforms and debugging edge cases but I’m sure for many it works out of the box.
Have you managed to use it successfully yourself?
I certainly would appreciate it. It would be nice if i could be typing in my code window while asking questions through speech.
Hey martin, thanks for making that extension, I did try it but getting some errors while installing as I use windows OS. That’s why asking here to have natives support which will remove this dependency
Feel free to open an issue on the GitHub project about the problems you are having. If I can I will see if I can solve the issues on Windows OS if you let me know the details.
Of course, native support would be preferred but this may work in the meantime!
VSCode Speech (VS Code Speech - Visual Studio Marketplace) is now integrated with Copilot Chat.
Any idea if speech is on Cursor’s roadmap?
I am currently using SuperWhisper for coding. Since Cursor lacks the vscode voice. You can try SuperWhisper, I use it to create the input for copilot+ - click were you need it then alt+space. The SuperWhisper prompt I am using is simple: Your task is to take the text provided and follow the rules below.
- Rewrite it into a clear, grammatically correct prompt while preserving the original meaning as closely as possible.
- The prompt always concerns coding problems and is an input to GitHub Copilot. Correct spelling mistakes, punctuation errors, verb tense issues, word choice problems, and other grammatical mistakes.
- Output should only include the rewritten text without your quotes, commentary, or preamble.
It works great, but if anyone try this and create a better prompt that could be used for creating input prompts for the cursor copilot, I would be interested in trying it, so please share it. One thing—I have attempted to integrate a feature for terminals that allows me to control the terminal, but yesterday, I found LLM-CMD to generate and execute commands in your shell. I already use Simon Wilson’s LLM for several projects; he has a lot of useful repos. But the LLM-CMD combined with something like Superwisper is Superhandy. Now I can eat Doritos without getting an orange keyboard.
PS - Superwhisper is free if you have a Setapp account (dont know if I would pay for Superwhisper without it) Superwhisper uses a local model for VTT but you need to add openai API key to convert the VTT into the prompt text.
I would think Cursor could easily make this setup by adding a llm that you would download to enable the voice, controlling terminal through voice is not hard but NLP to code could potentially create a hazard. I use it for git and other simple things that would be in training data so it has not created any mistakes but you should always read it before hitting enter.
thanks for detailed reply, I did read it but forgot to reply. I wanted to try out what you said but I already wasted a lot of time with whisper, I’m thinking of quitng cursor to join other tools as support hasn’t been great here, Cody and github copilot seems good
Here’s a video with someone doing this. According to the comments he’s using superwhisper..
+1 for superwhisper. It’s been amazing for chatgpt as well, faster than their voice tools
one side effect: it makes life so much better that you start talking to your computer in public places
why do not use mac os built in voice input? press ctrl twice