Whisper Assistant Extension (Voice to text)

Hi guys,

I’m a Cursor Pro subscriber, and ever since I started using the IDE I’ve thought about how powerful it would be to use my voice alongside Cursor’s GPT-4 code generator. I saw it wasn’t available anywhere else, so I created it myself!

The GitHub repo is here, if you’d like to see the code:

And the link to the VSCode extension is here:

It’s free to use as it runs on a local version of Whisper. All instructions are outlined in the repo and the extension readme.

It’s only been tested on a Mac, so I’m open to feedback if you notice any issues. I’ve been coding with this for around 2 weeks and it’s definitely improved my Cursor experience!


Legend.
this is really good.


Enjoy, glad you like it!

I just posted a video of the Whisper Assistant in action with Cursor:


Have you encountered these bugs before?
image
image

I haven’t! What OS are you using? I’ve only tested on a Mac M1, so I’m wondering whether the OS could be the issue.

I have macOS Sonoma / MacBook M1 Pro

Yeah, I don’t know, it might be on my side. Reinstalling didn’t work, though. Maybe I need to try something else.

Unfortunately, this plugin doesn’t work for me. I am running Cursor on Windows and have installed whisper (I can run it from the command line). However, the plugin keeps telling me Whisper is not installed. It’s in my PATH too.

The check for whisper just looks for the whisper command on the command line.
Run ‘whisper -v’ in a terminal to make sure it’s accessible. If it’s there, try restarting Cursor and trying again.
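
For anyone debugging this, here is a rough Python sketch of that kind of PATH lookup. It is not the extension’s actual code: `find_command` is a hypothetical helper, and the idea is only to show which PATH entries a given process sees. An editor can be started with a different PATH than your terminal, which is a guess worth checking here rather than a confirmed cause.

```python
import os
import shutil


def find_command(name, path=None):
    """Return the full path of `name` if it is found on PATH, else None.

    `path` is an optional PATH-style string; None means the PATH of the
    current process. On Windows, shutil.which also tries PATHEXT suffixes
    such as .exe and .cmd.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_command("whisper")
    if location is None:
        # List what this process can actually see, for comparison
        # with the PATH shown in a normal terminal.
        print("whisper not found; PATH entries seen by this process:")
        for entry in os.environ.get("PATH", "").split(os.pathsep):
            print("  ", entry)
    else:
        print("whisper found at", location)
```

Running this from the editor’s integrated terminal and from a regular terminal, then comparing the output, should show whether the two environments disagree.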

It is accessible. As I said, I was able to use it via the command line to transcribe some audio I recorded. I also tried restarting Cursor and the terminal too.

Sorry, but I’m unsure what could be causing this. I had this issue the first time I installed Whisper, but after a restart the command was found…


This is dope, man. Well done!!!


Thanks for sharing man, this is awesome!

@Cthutu did you ever figure out that issue, by chance? I am on Windows as well and have tried everything I can think of, with no luck so far. Thanks!


No - I’ve abandoned Cursor now. I’ve gone back to VSC.


You can use external tools like MacWhisper. Point to any text input, [hotkey], speak, [hotkey]. It works quite well.


For the non-native speakers out there, I added an option to the extension’s settings for selecting Whisper’s transcription language.
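
If you want to try the same thing outside the extension, the `whisper` command from the openai-whisper package accepts `--language` and `--model` options. Below is a small Python sketch that builds such a command. `build_whisper_command` and the file name are made up for illustration, and I’m not claiming this is how the extension passes the setting internally.

```python
import subprocess


def build_whisper_command(audio_path, language=None, model=None):
    """Build an argument list for the openai-whisper CLI.

    `language` and `model` are optional; when omitted, the CLI uses its
    own defaults (including automatic language detection).
    """
    args = ["whisper", audio_path]
    if language:
        args += ["--language", language]
    if model:
        args += ["--model", model]
    return args


def run_whisper(audio_path, language=None, model=None):
    """Run the command; requires `whisper` to be installed and on PATH."""
    return subprocess.run(
        build_whisper_command(audio_path, language, model), check=True
    )
```

For example, `build_whisper_command("recording.wav", language="de")` gives the argument list for a German transcription of a hypothetical `recording.wav`.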


Nice job :grinning:
I will test it as soon as I have the chance, and I will also review what changes would be needed to introduce WhisperX to make the transcription faster.


Oh, I need to star this repo then! I wanted to see how hard it would be to make it stream the recording and start decoding before we finish speaking. Would that come with the WhisperX implementation?

That would be great. I had a go at this originally but ran out of time to get it working smoothly with SoX.

Unless I am mistaken, I don’t think you can stream using Whisper. You may have to chunk the audio and get the transcription / response for each segment. If that is the only approach that will work, it would work with WhisperX as well (if needed).
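
To make the chunking idea concrete, here is a minimal Python sketch. It is only an assumption about how one might do it, not how Whisper, WhisperX, or the extension works. `chunk_samples` is a hypothetical helper that splits a sequence of audio samples into fixed-length windows with an optional overlap, and `transcribe` is a placeholder callable you would back with whatever model you use.

```python
def chunk_samples(samples, chunk_size, overlap=0):
    """Split `samples` into windows of `chunk_size` items.

    Consecutive windows share `overlap` items, so a word cut at one
    boundary still appears whole in the neighbouring window.
    The last window may be shorter than `chunk_size`.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(samples), step):
        chunks.append(samples[start:start + chunk_size])
        if start + chunk_size >= len(samples):
            break  # this window already reached the end of the audio
    return chunks


def transcribe_in_chunks(samples, chunk_size, transcribe, overlap=0):
    """Call the placeholder `transcribe` on each chunk, in order."""
    return [transcribe(chunk) for chunk in chunk_samples(samples, chunk_size, overlap)]
```

With an overlap, the same words can show up at the end of one result and the start of the next, so the per-chunk text would still need merging before it is inserted into the editor.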

If you have the urge, see if there is another approach that would work to stream the recording with the original Whisper release. It may mean we don’t need to use WhisperX at all!

I have the same issue. Regardless, the transcription is so slow, even for single-sentence recordings, that it unfortunately doesn’t seem worth using.