Google unveiled the next generation of its image and video generation artificial intelligence (AI) models on Tuesday at the I/O 2025 event. Dubbed Imagen 4 and Veo 3, these multimodal AI models arrive with new capabilities and upgrades over their predecessors. While Imagen 4 features faster generation times and improved text rendering, Veo 3 gets native audio generation capability and can integrate background sounds and dialogue in generated videos. Alongside the new models, the tech giant also unveiled a new AI-powered filmmaking app dubbed Flow.
What’s new with Imagen 4 and Veo 3?
In a blog post, the Mountain View-based tech giant detailed the new image and video generation AI models. Imagen 4 comes almost a year after its predecessor was released. In December 2024, Google also released Veo 2 and updated Imagen 3 with new capabilities.
Now, with Imagen 4, the company is focusing on the generation speed and accuracy of the model. Similar to the previous generation, the latest Imagen model also supports text and images as input. The generated images show improved fine details, such as intricate fabrics, water droplets, and animal fur. The model can also generate images much faster than its predecessor.
Google says Imagen 4 can also generate better images in both photorealistic and abstract styles. It generates output in a wide range of aspect ratios and at up to 2K resolution. Additionally, the company has made improvements in text rendering by focusing on the spelling of words as well as typography. The model is now more context-aware about text placement and font size, and it can make creative choices about font style.
Imagen 4 is currently available in the Gemini app, Whisk, Vertex AI (for enterprises), and across Workspace apps such as Docs, Slides, Vids, and more. It is not clear whether the company plans to expand the model to all Gemini users or just the paid subscribers. Later this year, the company also plans to launch a version of the AI model that can generate images 10x faster than Imagen 3.
Coming to Veo 3, Google’s latest video generation model now comes with native audio generation, and it can incorporate ambient sounds, background noise, and dialogue in videos. In a demo shown at the I/O 2025 event, two animated characters were even shown speaking to each other in clear and natural-sounding voices.
Apart from this, Veo 3 also brings improvements in prompt adherence, real-world physics, and accurate lip syncing. It is currently available to Google AI Ultra subscribers in the US via the Gemini app and a newly introduced app dubbed Flow. Enterprises can access it via the Vertex AI platform.
Flow is an AI-powered filmmaking tool that leverages the Gemini, Imagen, and Veo models. Users can describe a video clip using natural language prompts, and the app can generate an eight-second-long video. The app is said to have high prompt adherence, and it can generate frames with consistent cast, locations, objects, and styles. It is available to Google AI Pro and Ultra plan subscribers in the US.