It’s hard to keep up with all those new models all the time.
Some weeks ago, I preferred to:
plan with Gemini 2.5 Pro thinking
implement with agent using Sonnet 3.5
Now we have o3, Sonnet 4 (and Gemini) for thinking, and Claude 4 got much better.
I often go directly to agent with Sonnet 4 thinking.
For more complicated stuff, I tried o3 for planning and Sonnet 4 to implement. o3 feels slow because it asks a lot of questions, but that may even be an advantage. Or is Sonnet 4 thinking good enough?
Max mode is included in the new plans, but note that, as with any resource, the more you use it, the faster you reach the rate limit. Naturally, the Ultra plan has higher rate limits.
Regular models therefore use up your limit more slowly, while Max mode uses it up fastest. For most users, non-Max mode will be plenty.