Speech-to-text extension built with Cursor


It works as follows:
Press Ctrl+1 to start recognition.
Recognizing a few hundred words takes only about 2 seconds, with accuracy above 95%. It uses an API: you fill in your own key, and the data runs locally, with no server. Recognizing one minute of audio costs only about 0.0005 US dollars, so it is very cheap; using it a few thousand times a month has cost me about a dollar. So far I have adapted it to Windows and Linux. I don't have an Apple computer, so I can't do macOS for now. It also supports AI prompt optimization with structured output: the recognized text is passed to an AI, the AI structures it, and the result is automatically copied to the clipboard.
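As a quick sanity check on the pricing quoted above, here is a minimal sketch of the arithmetic. Only the 0.0005 USD per minute figure comes from the post; the usage pattern (3,000 dictations of about 40 seconds each) is an assumption chosen for illustration.

```python
# Price per minute of audio as quoted in the post (USD).
PRICE_PER_MINUTE_USD = 0.0005


def monthly_cost(uses_per_month: int, avg_seconds_per_use: float) -> float:
    """Estimated monthly spend in USD for a given usage pattern."""
    minutes = uses_per_month * avg_seconds_per_use / 60
    return minutes * PRICE_PER_MINUTE_USD


# Hypothetical usage: 3,000 dictations of about 40 seconds each.
cost = monthly_cost(3000, 40)
print(f"{cost:.2f} USD per month")
```

Under that assumption the total is 2,000 minutes of audio, which lands at roughly the one dollar a month mentioned in the post.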

Currently I have only made a Chinese version, and I want to build an English version in the near future. But I am not aware of any good speech recognition APIs overseas. I've looked at Google's and it is expensive: used a few thousand times a month, it would cost $20-100. Do you have any good suggestions? I'd like to hear them. For now I am thinking about integrating Google's recognition.



:rofl:

i use

It works and I do not pay any money. But I need to select the chat window after stopping the recording. (You assign the keyboard shortcuts for starting and stopping recording. It is also fast.)

This is for Windows;
there is alternative software for Mac.

Yes, Windows has its own, such as Win+H, but I mainly develop on Ubuntu, so I made a plugin in Cursor to assist me. For me, the Ctrl+1 shortcut is very convenient!

In addition, I've added an AI optimization that converts the recognized text from the voice input into a structured prompt. So far it is testing OK.
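The flow described here (speech, then raw text, then an AI-structured prompt, then the clipboard) could be sketched roughly like this. This is not the extension's actual code: `dictate_to_prompt` and its three callables are hypothetical names, and the stand-ins below replace the real speech API, AI call, and clipboard so the sketch runs on its own.

```python
from typing import Callable


def dictate_to_prompt(
    transcribe: Callable[[bytes], str],
    restructure: Callable[[str], str],
    copy_to_clipboard: Callable[[str], None],
    audio: bytes,
) -> str:
    """Run one dictation: speech -> raw text -> structured prompt -> clipboard."""
    raw_text = transcribe(audio)      # speech recognition step
    prompt = restructure(raw_text)    # AI structuring step
    copy_to_clipboard(prompt)         # automatic copy step
    return prompt


# Stand-ins (assumptions) so the sketch needs no API key or clipboard library.
clipboard: list[str] = []
result = dictate_to_prompt(
    transcribe=lambda audio: "add a login page with email and password",
    restructure=lambda text: f"Task: {text}",
    copy_to_clipboard=clipboard.append,
    audio=b"",
)
print(result)
```

Passing the three steps in as callables keeps the sketch independent of any particular speech or AI provider, which matters here since the choice of English recognition API is still open.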

You can try the Bing Translate API.

I currently use this with Cursor https://withaqua.com

Aqua Voice uses a fusion transcription architecture + a client context engine to be the most accurate speech-to-text system available. Text is automatically formatted to fit the specific application and document. This enables using voice for entirely new applications like technical prompting. Aqua produces the highest quality output of any voice to text system.


Okay, I’ll try.

It would be great if Cursor had built-in speech-to-text. It would increase everyone's efficiency. Talking is far faster than typing.

I can recommend Whispering - https://whispering.bradenwong.com/

If you are using Windows, try Win key + H. It opens the built-in voice typing tool; keep your cursor in any window and just speak, and it will convert your voice to text. You don't need any other software. It works in Windows 11.