Multi-Model Support

Allow users to use Gemini 2.5 Pro as the “input” model: its 1M-token window would be spent on context and reasoning. Claude 3.7 Sonnet would then act as the “output” model, taking Gemini’s response and generating the actual context-aware code. This way you avoid wasting Claude’s limited tokens on context and reasoning, since that work has already been done in Gemini’s response.

Gemini’s output would have to be tailored so the model knows it’s not supposed to generate code itself, but rather to produce the limitations and requirements for Claude to do the code generation. A rough sketch of what that pipeline could look like is below.
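Here’s a minimal sketch of the idea, assuming the `google-genai` and `anthropic` Python SDKs; the prompt wording, function names, and model ids are illustrative, not an actual Cursor feature:

```python
# Sketch of the proposed two-model pipeline: Gemini plans, Claude codes.
# Model ids and prompts are illustrative assumptions.
import anthropic
from google import genai

gemini = genai.Client()          # reads GEMINI_API_KEY from the environment
claude = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY from the environment

PLANNER_PROMPT = (
    "You are a planning model. Do NOT write code. Read the repository "
    "context below and produce a concise specification: requirements, "
    "limitations, affected files, and edge cases for the coding model.\n\n"
)

def plan_then_code(task: str, repo_context: str) -> str:
    # Step 1: Gemini digests the large context and distills a compact spec.
    plan = gemini.models.generate_content(
        model="gemini-2.5-pro",
        contents=PLANNER_PROMPT + f"Task: {task}\n\nContext:\n{repo_context}",
    ).text

    # Step 2: Claude sees only the compact spec, never the raw context,
    # so its token budget goes entirely toward code generation.
    message = claude.messages.create(
        model="claude-3-7-sonnet-latest",
        max_tokens=4096,
        system="Implement the following specification exactly. Output code only.",
        messages=[{"role": "user", "content": plan}],
    )
    return message.content[0].text
```

The key design point is in step 2: Claude never receives the raw 1M-token context, only Gemini’s distilled specification.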

This would be a game changer, holy sh*t