I would love to see a dashboard that shows over time the % of prompts that contain curse words by model on a realtime graph…
(and if there's a uniform spike across all models, then it's a Cursor issue…)
yea indeed… this would be interesting. I've now figured out for myself: once I start cursing, I need to choose a different model and/or approach.
@charles cursing and swearing often degrade performance, since models associate those words with bad code.
true.
but also: there have been some articles around the net saying that if you THREATEN the model (instead of cursing at it), performance can increase
never tested it though.
That was with earlier models. The latest ones no longer respond to that.
In the past you could also offer a model incentives (money); that doesn't work either.
Models deny reality
"We hypothesize that the use of profanity is an indicator of the programmer's deep emotional involvement with the code and its inherent complexities, thus producing better code based on a thorough, critical, and dialectical code analysis process," the study report says.
I created this tool a few months back:
AGIfMeter
AI Model Performance Analyzer - A tongue-in-cheek Ruby tool that measures AI model quality by analyzing f-word frequency in user prompts.
The premise is simple yet surprisingly insightful: the more frustrated users get with an AI model (measured by f-word usage in their prompts), the worse the model is performing. While this is a fun and irreverent approach, it can actually provide genuine insights into user experience and model effectiveness!
- **Smart Pattern Detection:** Detects various f-word spellings, censoring, and creative variations
- **Beautiful Terminal Graphs:** ASCII charts showing frustration trends over time
- **Statistical Analysis:** Comprehensive metrics including rates, trends, and consistency
- **Performance Ratings:** From "EXCELLENT" to "CRITICAL" based on f-word frequency
- **Timeline Analysis:** Tracks changes in user frustration over time
- **Flexible Input:** Works with any directory containing markdown prompt files
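The pattern-detection idea above can be sketched in a few lines of Ruby. This is a minimal illustration under my own assumptions (a single regex covering common censored spellings); the regex, constant, and method name are not AGIfMeter's actual code.

```ruby
# Matches "fuck", censored variants like "f*ck"/"f**k"/"f#ck",
# and suffixed forms like "fucking" (case-insensitive).
F_WORD = /\bf[u*#@]?[c*#@]k\w*/i

# Rate = total f-word matches across all prompts / number of prompts.
def f_word_rate(prompts)
  total = prompts.sum { |p| p.scan(F_WORD).size }
  total.to_f / prompts.size
end
```

For example, `f_word_rate(["f*ck this", "ok"])` yields 0.5 f-words per prompt.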
Sample output:
```
F-WORD FREQUENCY OVER TIME:
03/21 │ 0.000
03/21 █████ │ 0.600
03/22 │ 0.000
03/22 │ 0.000
…
03/27 │ 0.000
03/27 │ 0.000
03/27 │ 0.000
03/27 ████ │ 0.375
03/29 │ 0.000
03/29 ███ │ 0.286
…
05/10 ████████████████████ │ 2.800
05/11 █████████████████ │ 2.300
05/11 ██████████ │ 1.286
05/11 ███████████████ │ 2.000
05/11 ███ │ 0.273
05/12 █████████████ │ 1.683
05/12 ████████ │ 1.000
05/13 ████ │ 0.500
05/13 █████████████ │ 1.756
05/13 │ 0.000
05/13 █████████████ │ 1.667
05/13 ██████████████ │ 1.857
05/14 ███████████ │ 1.500
05/14 ██████████████████████████████████████████ 5.750
05/14 ███████████████████████ │ 3.200
05/14 │ 0.000
…
05/20 ██████ │ 0.714
05/21 ███████████ │ 1.444
05/22 ███████████████████ │ 2.556
05/27 │ 0.000
05/27 ████ │ 0.500
05/28 │ 0.000
      ──────────────────────────────────────────
      0.0                                   5.75

TREND:
INCREASING (+50.0% change)
CRITICAL: High frustration levels! This AI model requires immediate attention.

Statistical Analysis:
Average Rate: 0.3669 f-words per prompt
Standard Deviation: 0.752
Consistency: Low
```
Remember: This is a tongue-in-cheek metric, but patterns in user frustration
can actually provide insights into AI model performance and user experience!
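The summary statistics in that output (average rate, standard deviation, trend) could be reconstructed roughly like this. The method name and the trend definition (percent change from the first-half average to the second-half average) are my assumptions, not the tool's documented behavior.

```ruby
# Summarize a series of per-session f-word rates: mean, population
# standard deviation, and trend as the percent change between the
# first-half and second-half averages (assumes a nonzero first half).
def summary(rates)
  mean = rates.sum / rates.size.to_f
  var  = rates.sum { |r| (r - mean)**2 } / rates.size
  first  = rates[0...rates.size / 2]
  second = rates[rates.size / 2..]
  a = first.sum / first.size.to_f
  b = second.sum / second.size.to_f
  { mean: mean, stddev: Math.sqrt(var), trend_pct: (b - a) / a * 100.0 }
end
```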
I'm not suggesting what people should do; I'm talking about an operational metric for Cursor. Show a realtime graph on the wall. If all models show a spike in swearing at the same time, Cursor has a bug. And historically, a model that provokes more swearing is fundamentally worse than those that don't. It would be incredibly interesting and insightful.
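That "is it Cursor or the model?" check can be sketched as follows. The threshold (latest rate more than double a model's own prior average) and all names here are illustrative assumptions, not any real Cursor API.

```ruby
# A model is "spiking" when its latest swearing rate exceeds twice
# its own prior average.
def spiking?(series)
  baseline = series[0..-2].sum / (series.size - 1).to_f
  series.last > 2 * baseline
end

# If every model spikes at once, suspect the platform, not the models.
def platform_issue?(per_model_series)
  per_model_series.values.all? { |series| spiking?(series) }
end
```

With `{"model-a" => [0.5, 0.5, 2.0], "model-b" => [0.2, 0.2, 1.0]}` every model spikes, so this flags a platform issue; if only one model spikes, it doesn't.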