The 671B model was probably not used; instead, a small distilled model was likely used.
Instead of speculating, provide proof: share one-shot prompts and their responses.
I don’t know. In my opinion, at least based on my very limited testing, it looks like the full-fat R1. There was one issue, though: Cursor seems to have a timeout, probably around 5 minutes, which, combined with slow inference speeds, can lead to terrible results (responses cut off mid-thinking, with no final answer).
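For anyone who wants to reproduce that cut-off behaviour outside Cursor, here is a minimal sketch of how a hard client-side timeout can truncate a slow reasoning model. The endpoint URL, model id, and 5-minute value are assumptions for illustration, not Cursor's actual implementation:

```python
# Minimal sketch (assumptions: OpenAI-compatible DeepSeek endpoint, "deepseek-reasoner"
# as the R1 model id, and a ~5-minute client-side timeout like the one Cursor appears to use).
import requests

resp = requests.post(
    "https://api.deepseek.com/chat/completions",  # assumed endpoint
    headers={"Authorization": "Bearer <YOUR_API_KEY>"},
    json={
        "model": "deepseek-reasoner",  # assumed R1 model id
        "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
        "stream": False,
    },
    # Read timeout of ~5 minutes: if the model is still thinking when it expires,
    # requests raises requests.exceptions.Timeout and you get no answer at all.
    timeout=300,
)
print(resp.json()["choices"][0]["message"]["content"])
```

With a long chain-of-thought and slow inference, the request can easily run past the timeout, which matches the "cut-off, no result" symptom described above.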
Hey, we should have the cut-off issue fixed. DeepSeek R1 was our slowest model to date when we first added it, but it’s now much quicker!