First Model in the Omni Family

Gemini Omni Flash — Features, Availability & How to Use

The first model in Google's Gemini Omni family. Native multimodal video generation with conversational editing, world knowledge, and free access on YouTube Shorts.

Announced May 19, 2026 · Google I/O 2026

Structural foam orb — creative material generation

What is Gemini Omni Flash?

Gemini Omni Flash is the first model in Google DeepMind's Gemini Omni family, announced at Google I/O 2026 on May 19. Omni is a new line of models that natively handles text, images, video, and audio in a single system — and Flash is the first to ship.

Currently, Omni Flash primarily generates video output. Image and audio generation capabilities are planned for future updates.

Traditional AI video tools force a loop: write a prompt, generate, reject the result, and re-prompt from scratch. Omni Flash instead supports conversational editing. You describe what you want to change, and the model iterates on the previous output while maintaining character consistency, physical plausibility, and scene context across edits.

Chrome mirror effect — style transfer through conversational editing

Conversational Video Editing

Edit videos through natural language. Each instruction builds on the last — change the environment, adjust the camera angle, swap styles, or modify specific details without starting over. The model remembers previous edits and maintains consistency across iterations.

This replaces the prompt → generate → reject → re-prompt loop of other AI video tools with a conversation about your video. Typical instructions look like: “Make the lighting warmer.” “Add a particle effect to the background.” “Switch to slow motion.”
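The idea that each instruction builds on the last can be sketched as a small data model. This is only an illustration: no public Omni Flash API exists yet, so `EditSession`, `edit`, and `context` are hypothetical names, and the sketch models the bookkeeping of a session rather than any real call to Google's services.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical local model of a conversational editing session.

    NOT a Google API. It only illustrates that each instruction is
    appended to the accumulated history instead of replacing it.
    """

    base_prompt: str
    instructions: list[str] = field(default_factory=list)

    def edit(self, instruction: str) -> "EditSession":
        # Append rather than overwrite, so earlier edits stay in context.
        self.instructions.append(instruction)
        return self

    def context(self) -> list[str]:
        # The full history a model would need to stay consistent across edits.
        return [self.base_prompt, *self.instructions]


session = EditSession("A skateboarder rides through a museum")
session.edit("Make the lighting warmer.").edit("Switch to slow motion.")
print(session.context())
```

The contrast with re-prompting is that a fresh prompt would discard `instructions` entirely, while a conversational session keeps the whole history available for the next edit.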

Architectural scene generation with physically plausible structures

World Knowledge Generation

Omni Flash combines physical intuition (gravity, fluid dynamics, kinetics) with Gemini's knowledge of history, science, and culture, so its videos reflect how the world behaves rather than only matching visual patterns.

For example, it can generate a claymation explanation of protein folding, creating an educational video that accurately represents a complex biological process. This world knowledge gives Omni Flash a significant advantage over models that simply replicate visual patterns.

Protein folding visualization — combining AI knowledge with video generation

Multimodal Input & Digital Avatars

Combine images, text, video, and audio as input in any combination (audio input is currently limited to voice reference). Transfer motion from one video to a reference image, apply style from a photo to generated footage, or add audio-driven effects.
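As a rough sketch of what combining inputs might look like on the client side, the snippet below bundles the four input types and checks them against the one restriction this page states, the voice-reference-only limit on audio noted in the FAQ. Everything here is an assumption made for illustration: `OmniInputs`, `validate`, the field names, and the `"voice_reference"` label are hypothetical and do not come from any published Google interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OmniInputs:
    """Hypothetical bundle of multimodal inputs; all names are illustrative."""

    text: Optional[str] = None
    image_path: Optional[str] = None
    video_path: Optional[str] = None
    audio_path: Optional[str] = None
    audio_role: str = "voice_reference"  # assumed label for the audio's purpose


def validate(inputs: OmniInputs) -> List[str]:
    """Return a list of problems; an empty list means the bundle looks usable."""
    problems: List[str] = []
    provided = [inputs.text, inputs.image_path, inputs.video_path, inputs.audio_path]
    if all(value is None for value in provided):
        problems.append("at least one input is required")
    # The page states that audio input currently supports voice reference only.
    if inputs.audio_path is not None and inputs.audio_role != "voice_reference":
        problems.append("audio input currently supports voice reference only")
    return problems


request = OmniInputs(
    text="Apply this dance to my photo",
    image_path="me.png",
    video_path="dance.mp4",
)
print(validate(request))
```

Any mix of text, image, and video passes this check. Only audio carries a condition, which mirrors the limit described on this page.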

Omni Flash also supports digital avatars — create a video version of yourself that looks and sounds like you. All generated videos include an invisible SynthID digital watermark for content provenance, verifiable through the Gemini App, Chrome, and Google Search.

Astronaut scene — cinematic world knowledge

Whale motion — physics-based fluid dynamics

Skateboarding — character motion consistency

Museum scene — architectural understanding

Where Can You Use Omni Flash?

Gemini App

Available for Google AI Plus, Pro, and Ultra subscribers.

YouTube Shorts (Free)

Free access via YouTube Shorts and YouTube Create App. No subscription required.

Google Flow

Google's creative workflow tool for professionals.

API (Coming Soon)

Developer and enterprise access arriving in the next few weeks.

This makes Omni Flash one of the most accessible AI video models. You don't need a paid subscription to try it — just open YouTube Shorts and start creating.

How Omni Flash Compares

vs. Sora (OpenAI)

Sora generates high-quality video but requires re-prompting for edits. Omni Flash's conversational editing workflow is fundamentally different — iterate through conversation rather than starting over.

vs. Kling AI

Kling handles physics-based motion well. Omni Flash competes on Google ecosystem integration and multimodal input capabilities.

vs. Veo (Google)

Veo is Google's previous video model. Omni Flash is a generational leap — built on Gemini's native multimodal architecture.

vs. Seedance

Seedance 2.1 excels in dance and motion-heavy content. Omni Flash is more general-purpose with Gemini's world knowledge.

The key differentiator: Omni Flash isn't just a video generator — it's a multimodal reasoning model that creates video. It understands physics, maintains context across edits, and combines multiple input types.

Infinite orbs — creative visual effects through natural language

Who Should Use Omni Flash?

Social Media Creators

Generate and iterate on TikTok and YouTube Shorts content quickly.

Marketers

Create ad variations through conversation and test concepts without manual editing.

Educators

Turn complex topics into visual explainers using Gemini's built-in knowledge.

Developers

Build AI video features into apps (API coming soon).

Try Gemini Omni Flash Now

Generate fast AI videos from text or images. Start with free credits, no account required.

Frequently Asked Questions

Is Gemini Omni Flash free?

Yes. Omni Flash is free to use on YouTube Shorts and the YouTube Create App. For Gemini App access, you need a Google AI Plus, Pro, or Ultra subscription.

What is the difference between Omni Flash and the full Omni model?

Omni Flash is the first model in the Omni family and is designed for speed and broad accessibility. Future Omni models may trade longer generation times for higher-quality output and additional capabilities.

Can I edit generated videos with text instructions?

Yes. This is Omni Flash's core feature. You can iteratively edit videos through natural language — each instruction builds on the previous edit while maintaining character consistency and physical plausibility.

What input types does Omni Flash support?

Omni Flash accepts images, text, video, and audio as input in any combination. Audio input currently supports voice reference only, with other audio types coming soon.

Is there an API for developers?

Not yet, but Google has confirmed API access for developers and enterprise customers is coming in the next few weeks. It will likely be available through Google AI Studio or Vertex AI.

Are Omni Flash videos watermarked?

Yes. All generated videos include an invisible SynthID digital watermark for content provenance. The watermark can be verified through the Gemini App, Chrome, and Google Search.