Wednesday, May 20, 2026

Google launches multi-input video AI model

Google just took AI video creation to the next level with Gemini Omni Flash, turning text, images, and audio into fully synchronized videos in seconds.
New AI model that can create anything from any input, starting with video

Google has launched Gemini Omni Flash, the first model in its new Omni AI family, during Google I/O 2026. The model is being integrated into the Gemini App, Flow by Google and YouTube Shorts, with API access scheduled to roll out in the coming weeks.

Google said Omni Flash can generate video from multiple input formats, including text, photos, audio and existing video clips. The company also confirmed the model can create synchronized video and audio outputs up to 10 seconds long.

The launch expands Google’s multimodal AI strategy beyond text-based assistants and into video creation tools aimed at creators and developers. Unlike earlier Veo-based systems focused mainly on text-to-video generation, Omni Flash can edit and transform existing media inputs.

The announcement came as Google introduced a wider series of AI updates across Search, Android, Workspace and YouTube during its annual developer conference. CEO Sundar Pichai described the company’s latest direction as the “Agentic Gemini Era,” focused on embedding AI across Google’s ecosystem.

Before the official unveiling, references to “Gemini Omni” had appeared in leaked screenshots and developer builds circulating online. Early demonstrations suggested the system was designed for chat-based video generation and editing workflows.

 
