Google's new Gemini Omni can generate "anything from any input"
Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-05-19
Google announces Gemini Omni, a multimodal video AI model that creates and edits video clips from any combination of video, image, audio, text, and sketch inputs.
Extraction
Topics: gemini-omni, multimodal-ai, video-generation, google-ai
Claims
- Google's Gemini Omni can generate output from any input type including video, images, audio, text, and sketches.
- Gemini Omni supports video editing workflows such as adding characters, replacing objects, and changing actions in existing footage.
- Users can record a normal video and instruct Omni to transform it with AI-generated modifications.
Key quotes
Google's new Gemini Omni, can generate 'anything from any input'
A user can record a normal video, then ask Omni to add a character, replace an object, change the action, alter the...