Just days after Meta announced its text-to-video generator, Google has unveiled its own AI-powered text-to-video generator, which it calls Imagen Video.

The generator is still in development. By the time it reaches a publicly releasable state, it will be capable of producing 1280×768 videos at 24 frames per second from a written prompt.

According to Google’s research paper, Imagen Video will have stylistic abilities, such as generating videos in the style of famous artists like Vincent van Gogh. It will also generate rotating 3D objects while preserving their structure, and render text in various animation styles.

Google says that Imagen Video has been trained on 14 million video-text pairs and 60 million image-text pairs, as well as the LAION image-text dataset, which was used to train Stable Diffusion.
Google hopes its AI-video model can “significantly decrease the difficulty of high-quality content generation.” Imagen Video builds on Google’s Imagen, a text-to-image program similar to OpenAI’s DALL-E.

As described by Google’s research team, Imagen Video will take a text description and generate a 16-frame, three-frames-per-second video at 24×48 pixel resolution. The system then upscales and “predicts” additional frames, producing a final 128-frame, 24-frames-per-second video at 1280×768 resolution.
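To put those figures in perspective, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope check of the numbers quoted in this article (16 frames at 3 fps and 24×48 pixels, then 128 frames at 24 fps and 1280×768 pixels), not a description of how Google's system is implemented; the function and variable names are made up for illustration.

```python
from fractions import Fraction


def duration_seconds(frames: int, fps: int) -> Fraction:
    """Clip length in seconds, kept as an exact fraction."""
    return Fraction(frames, fps)


# Figures quoted in the article (names here are illustrative only).
BASE_FRAMES, BASE_FPS = 16, 3
FINAL_FRAMES, FINAL_FPS = 128, 24
BASE_PIXELS = 24 * 48        # pixels per frame in the initial clip
FINAL_PIXELS = 1280 * 768    # pixels per frame in the final clip

base_len = duration_seconds(BASE_FRAMES, BASE_FPS)
final_len = duration_seconds(FINAL_FRAMES, FINAL_FPS)
frame_factor = Fraction(FINAL_FRAMES, BASE_FRAMES)
pixel_factor = Fraction(FINAL_PIXELS, BASE_PIXELS)

print(f"initial clip length: {float(base_len):.2f} s")
print(f"final clip length:   {float(final_len):.2f} s")
print(f"frames multiplied by {frame_factor}")
print(f"pixels per frame multiplied by about {float(pixel_factor):.0f}")
```

Both clips work out to the same length, a little over five seconds. By these figures, the later stages add frames and pixels to a clip whose length is already fixed rather than making it longer.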

It is worth noting that every Imagen Video result shown so far was selected by Google, and no independent testers have yet tried the program.

The research paper claims that Imagen Video can render text correctly, something DALL-E and Stable Diffusion both struggle with; the text those programs generate is barely readable.

It also claims that Imagen Video has demonstrated an understanding of depth and three-dimensionality, allowing it to create drone-flythrough-style videos that rotate around objects and capture them from different angles without distortion.

Google has voiced concerns over the “problematic data” used to train its AI image-generator programs. The company has attempted to filter out sexually explicit or violent content, social stereotypes, and cultural biases. It is also concerned that the tool may be used “to generate fake, hateful, explicit, or harmful content.”

“We have decided not to release the Imagen Video model or its source code until these concerns are mitigated,” adds Google.
