Context length and slow GPT-4

Hi, I just installed Cursor today and have already used up the free quotas available. As a beginner in programming, I’ve found AI to be extremely helpful. Using Cursor feels like a significant step forward for me. I have a few questions before switching to Cursor:

  1. What does “slow” mean in the context of GPT-4’s performance? How slow is slow GPT-4? So far, the free slow GPT-4 I’ve used seems okay speed-wise, but does it get any slower?
  2. What is the context length for GPT-4 in Cursor?
  3. Do the context length and speed remain constant, or are there conditions under which they could be reduced, such as heavy usage?

  1. The speed of slow GPT-4 fluctuates based on how busy our servers are. Personally, I’ve found that slow GPT-4 is often good enough.
  2. 8192 tokens (a quick way to check your own prompts against that limit is sketched below).
  3. Context length remains constant. Speed can fluctuate (see 1).
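
For anyone who wants to see how close a prompt gets to that limit, here is a minimal sketch that counts tokens with OpenAI’s tiktoken library. Using the standard GPT-4 tokenizer is an assumption on my part; how Cursor accounts for tokens server-side isn’t specified here.

```python
# Minimal sketch: check a prompt against an 8192-token window with
# OpenAI's tiktoken library (pip install tiktoken). The gpt-4
# tokenizer is an assumption; Cursor's server-side accounting may
# differ slightly.
import tiktoken

CONTEXT_WINDOW = 8192  # tokens, per the answer above

def fits_in_context(prompt: str) -> bool:
    enc = tiktoken.encoding_for_model("gpt-4")
    n_tokens = len(enc.encode(prompt))
    print(f"{n_tokens} tokens used of {CONTEXT_WINDOW}")
    return n_tokens <= CONTEXT_WINDOW

if __name__ == "__main__":
    fits_in_context("Explain what this function does: def add(a, b): return a + b")
```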

Thanks for clarifying. Is there any plan to release a Debian package?

Our Linux download should already work on Debian: AppImages run there, you just need to chmod +x them first. We would like to release a .deb package in the future.

Yeah, I’m already using it with chmod +x; a .deb package would be great in the future. Thanks.


Just a quick suggestion in case you’re not familiar with it: there’s a handy tool called app2deb which seems to work very well with the Cursor AppImage.

It’s a very straightforward console-based tool: pass it the AppImage, get a .deb, install the .deb. I haven’t tried it on pure Debian, as all my machines are headless non-graphical boxes, but it worked flawlessly on Ubuntu.


I read “We largely rely on 8k but sometimes use 32k” (OpenAI's API pricing - #4 by truell20) in August/24.
Is the use of 32k discontinued?

Also, what exactly happens when I fill the prompt with more tokens than the context window allows?
I saw that no error is raised, so I presume it uses embeddings to take only chunks of my prompt. If that’s the case, I’d like it to inform me (the user) about it.

At this point, I think we just rely on 8k in production, though if you have 32k access through your own API key, you’re welcome to use that.

When you blow out the prompt with large files, we do indeed use embeddings to pick the most relevant parts of the files. Sometimes you’ll see the UI offer other options too. When your conversation gets too long, we recursively summarize it so the model still knows about some of the earlier conversation.

(FWIW, in our experience, 32k doesn’t really look at things more than 8k tokens back in its context window.)
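
To make the “use embeddings to pick the most relevant parts” idea concrete, here is a toy sketch of the general technique. Cursor’s actual implementation isn’t public, so the bag-of-words “embedding” below is only a stand-in for a real embedding model, and the budget is counted in words rather than real tokens.

```python
# Toy sketch of embedding-based chunk selection, the general technique
# described above. Cursor's real implementation is not public: the
# bag-of-words "embedding" here stands in for a learned embedding
# model, and the budget is counted in words rather than real tokens.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: a word-frequency vector. A real system would
    # call an embedding model here instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def pick_relevant_chunks(query: str, chunks: list[str], budget: int) -> list[str]:
    # Rank chunks by similarity to the query, then keep the best ones
    # until the word budget is spent.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        cost = len(chunk.split())
        if used + cost <= budget:
            picked.append(chunk)
            used += cost
    return picked

if __name__ == "__main__":
    chunks = [
        "The config file is parsed in parse_config(), which reads TOML.",
        "Changelog: bumped the package version to 1.2 and fixed CI.",
        "load_config() calls parse_config() and caches the result.",
    ]
    print(pick_relevant_chunks("where is the config file parsed?", chunks, budget=15))
```

A production pipeline would embed chunks with a learned model and count tokens with a real tokenizer, but the shape is the same: rank chunks by similarity to the query and keep the best ones that fit the remaining context budget.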