How do you quantify the benefit of using Cursor?

After working with LLMs a lot for writing code, I still struggle to answer questions like “how much productivity do you gain?” or “is there an effectiveness boost that can be quantified?” Those are good questions, but I can’t seem to find a good answer.

Usually it’s a really good boost, and sometimes I one-shot a solution. Other times I throw out thousands of lines of code because they don’t get me anywhere.

Is there any realistic method to measure how much AI-generated code actually reaches production in such a workflow? How would you do it?

The existing Cursor stats give insight into token usage, but that’s not what I need.