Google's new Gemini Omni can generate "anything from any input"

Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-05-19

Google announces Gemini Omni, a multimodal video AI model that creates and edits video clips from any combination of video, image, audio, text, and sketch inputs.

Appears in

Google I/O 2026: Gemini 3.5 and Agents-Everywhere Strategy

Extraction

Topics: gemini-omni, multimodal-ai, video-generation, google-ai

Claims

Google's Gemini Omni can generate output from any input type, including video, images, audio, text, and sketches.
Gemini Omni supports video editing workflows such as adding characters, replacing objects, and changing actions in existing footage.
Users can record a normal video and instruct Omni to transform it with AI-generated modifications.

Key quotes

Google's new Gemini Omni, can generate 'anything from any input'

A user can record a normal video, then ask Omni to add a character, replace an object, change the action, alter the...