It would be wonderful to use Gemini’s video and image capabilities to help debug applications. From what I’ve read, they have improved recently.
Imagine making a video of a UI problem and sending it to Gemini along with all the necessary files, screenshots, and other relevant information. You could also record a video of DevTools, the Styles tab, and the Network tab.
Giving the model that visual context could improve the quality of its suggestions, since it mirrors how humans typically debug and work on code.
In frontend development, humans generally cannot debug without constantly referring to DevTools and its various outputs. I therefore believe that a key missing element, or at least a major step, towards full AGI and automation is giving the AI access to screen content, DevTools data, UI views, and so on.
For now, this would work with our own API keys, but if we succeed in effectively prompting the model with all this visual information, imagine what we could create and eventually automate!
So, could you please consider adding the ability to attach videos and multiple images? I believe this wouldn’t be too difficult, but it would open up a whole new world of experimentation.
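To make the idea a bit more concrete, here is a rough sketch of how a prompt plus several images and a short video might be bundled into one request. It assumes the payload shape shown in the public Gemini REST docs (`contents` → `parts`, with `inline_data` holding a MIME type and base64 data); the helper names `media_part` and `build_debug_request` are made up for illustration, and larger videos would probably need a separate upload step that is left out here.

```python
import base64
import mimetypes
from pathlib import Path


def media_part(path):
    """Encode one image or video file as an inline part (hypothetical helper)."""
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None or not mime_type.startswith(("image/", "video/")):
        raise ValueError(f"not an image or video: {path}")
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    # Assumed part shape: {"inline_data": {"mime_type": ..., "data": ...}}
    return {"inline_data": {"mime_type": mime_type, "data": data}}


def build_debug_request(prompt, media_paths):
    """Build a request body with the text prompt first, then every attachment."""
    parts = [{"text": prompt}]
    parts.extend(media_part(p) for p in media_paths)
    return {"contents": [{"parts": parts}]}
```

Calling `build_debug_request("Why is the button misaligned?", ["ui-bug.mp4", "styles-tab.png", "network-tab.png"])` (file names are placeholders) would give a single body that could then be sent with our own API key.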