Claude 3.7 is very lively and cute

I asked him to do Task A. While checking the code, he found that Task B wasn’t completed either, so he did both Task A and Task B.

Amazingly, he even did Task B correctly.

:sweat_smile:
I’m not sure whether I should praise or criticize him.

5 Likes

In this case, I think it’s worth giving praise :blush:

2 Likes

wait until it starts doing tasks X, Y, Z that aren’t even for your codebase

12 Likes

Real

yep, the issue is that it does way more than what you ask for, even if you tell him not to. in yolo mode it almost migrated my entire database cuz he wasn’t able to modify a parameter in a file that I specifically told him ‘do not modify that file’.

1 Like

It started tweaking timing and retry numbers for a feature that we were not discussing because it figured out that would be more optimal for the context understood from nearby comments.

Those numbers were specifically unoptimized because I was testing things…

1 Like

I often run into errors, but that’s the software’s problem, not his. I hope it improves.

It’s incredibly cute. It’s so cute that it will make you feel all fluffy whilst completely wrecking your entire codebase and pretending all is fine. Oh and whatever you do, don’t ever ask it to restore something cause that guarantees you’ll need a backup.
But, if you forgive it for now misunderstanding 9 out of 10 of your instructions (where Claude 3.5 had something like a 5-5 ratio) it’s still cute.
I’ve seen it do some amazing stuff. It’s like it’s now working in full schizo mode, where 1/4 of the time it’s brilliant and 3/4 of the time it takes only seconds before you end up spending your entire session in recovery mode.