Not sure if I’m doing something wrong (there isn’t much documentation on this, TBH), but Cursor seems to be really bad at generating even simple unit tests. As a simple example, I have a function that simply computes the difference between two dates, and Cursor (via the gpt-4o model) generates a test that passes only one argument. It can’t really be this bad, can it? Even the vaunted claude-3.5-sonnet doesn’t do much better in terms of generating tests that actually pass. Any tips or tricks much appreciated.
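For concreteness, here’s a minimal sketch of the kind of function and test I mean (the names `days_between` and `TestDaysBetween` are hypothetical, and I’m assuming a Python `datetime`-based implementation; my actual code differs):

```python
# Minimal sketch of the scenario described above. Names are hypothetical;
# the point is that a correct test must pass BOTH date arguments.
from datetime import date
import unittest


def days_between(start: date, end: date) -> int:
    """Return the number of days between two dates."""
    return (end - start).days


class TestDaysBetween(unittest.TestCase):
    def test_days_between(self):
        # The generated test reportedly supplied only one argument,
        # which would raise a TypeError rather than pass.
        self.assertEqual(days_between(date(2024, 1, 1), date(2024, 1, 11)), 10)


if __name__ == "__main__":
    unittest.main()
```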
What happens if you provide the context manually to the web interfaces or the API?
I’m generating the test within the same file that the function is written in. There isn’t any additional context I can think of that would be relevant, unless I’m missing something.
My point was to test whether it’s a Cursor thing or an LLM thing, by trying it with the LLMs outside of Cursor.