Support for speech to text

In my case, it is the function key pressed twice, not the control key.

That would be great!
I came from using VSCode+Copilot+VSCode Speech, and the ability to use VSCode Speech to transcribe my voice is what I miss the most.


Hi guys,

As this is still not included in the Cursor app I’ve continued to develop the ‘Whisper Assistant’ VSCode extension which works really well with Cursor on Mac, Windows and Linux.

You can get started transcribing locally using Docker, with your preferred audio tool recording your mic input (SoX is recommended, but you can change this to ffmpeg, arecord, etc.).
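
For anyone curious what the recording step might look like, here is a rough sketch in Python of the command line for two of the recorders mentioned. It is not the extension's actual invocation: the `record_command` helper, the 16 kHz mono settings and the output file name are assumptions for illustration. ffmpeg is left out because its microphone input flags differ per platform.

```python
from typing import List


def record_command(tool: str, out_path: str, rate: int = 16000) -> List[str]:
    """Build an argument list that records mono audio from the default mic.

    Hypothetical helper; the sample rate and channel count are assumptions.
    """
    if tool == "sox":
        # -d reads from the default audio device; -r and -c set the
        # sample rate and channel count of the output file.
        return ["sox", "-d", "-r", str(rate), "-c", "1", out_path]
    if tool == "arecord":
        # -f S16_LE selects 16-bit little-endian samples (ALSA, Linux only).
        return ["arecord", "-f", "S16_LE", "-r", str(rate), "-c", "1", out_path]
    raise ValueError(f"unsupported recorder: {tool}")
```

Passing the result to `subprocess.Popen` would start the recording, and stopping that process ends it.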

It’s also possible to use the OpenAI or Groq API if you cannot run the Whisper model locally.

I use this daily and have been getting great results after fine tuning based on feedback from users across all platforms.

Give it a go and let me know if there’s any feedback you have:

Cool extension, is there a way to integrate it with third party transcription tools like MacWhisper or Ollama (instead of docker for whisper)?

Yes. It’s definitely possible to use Ollama although I’m unsure whether it is possible for MacWhisper.

The extension uses the OpenAI library to make requests to the preferred service (local Docker, OpenAI or GroqCloud). There’s no reason why it couldn’t be expanded to allow for a different URL to support Ollama, as I’m pretty sure it exposes the same OpenAI-style API.
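
To illustrate the idea of a configurable URL, here is a minimal sketch in Python, not the extension's real code. The `BASE_URLS` table, `TranscriptionTarget` and `resolve_target` are hypothetical names. The OpenAI and Groq base URLs are the publicly documented ones, while the local Docker address is only a placeholder assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Base URLs of OpenAI-compatible services.
BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "groq": "https://api.groq.com/openai/v1",
    "local": "http://localhost:4444/v1",  # assumed placeholder, not confirmed
}


@dataclass
class TranscriptionTarget:
    base_url: str
    model: str

    @property
    def endpoint(self) -> str:
        # OpenAI-style servers expose transcription under this path.
        return self.base_url.rstrip("/") + "/audio/transcriptions"


def resolve_target(provider: str, model: str,
                   custom_url: Optional[str] = None) -> TranscriptionTarget:
    """Pick the base URL: a user-supplied URL wins over the built-in table."""
    base_url = custom_url or BASE_URLS[provider]
    return TranscriptionTarget(base_url=base_url, model=model)
```

With something like this, pointing at Ollama would just mean passing `custom_url="http://localhost:11434/v1"`. Whether a given server actually implements the transcription route is a separate question.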

What model(s) on Ollama would you use exactly? I can then look into it.


Any model on Ollama that supports speech to text. I mostly do other stuff with Ollama but this would be very helpful.

Yes Ollama has a separate OpenAI compatible API


Great, I’ll take a look at this as soon as I have a chance.


I’ve had a look, and it seems there isn’t a widely used Whisper model available on Ollama. For that reason, it doesn’t seem worth investing time in Ollama Whisper support right now.

Thanks for the feedback, very good point. I didn’t realize they don’t have it yet.


I was just looking for a speech-to-text tool and would really like Cursor to offer a first-party one or, better yet, have it built into Cursor.
