
From Android 15 to a Sora-like AI video generator: here’s everything announced at Google I/O 2024

The latest Google I/O event was all about AI. Google introduced many new features, including Android 15, which is still in testing. Most of the updates were about making AI more central to how Android works and how we search on Google.

Gemini, the AI assistant introduced last year, is getting even more deeply integrated into Pixel phones. Soon, you’ll be able to summon Gemini as an overlay on top of any app to add AI-generated text or pictures easily. Google is also improving Circle to Search and Gemini Nano to better understand text, images, and audio.

AI was a big focus, mentioned 121 times in the keynote. Here are the main updates Google announced.

Gemini gets some updates, and a new version
Google has announced a new AI model called Gemini 1.5 Flash, which is optimised for speed and efficiency. Flash sits between Gemini 1.5 Pro and Gemini Nano, the company’s smallest model, which runs locally on the device.

Google said it created Flash because developers wanted a lighter, less expensive model than Gemini Pro for building AI-powered apps and services, while keeping some of the features that differentiate Gemini Pro from competing models, such as its long context window of one million tokens.

Google will also nearly double Gemini’s context window to two million tokens later this year. This will allow it to process two hours of video, 22 hours of audio, more than 60,000 lines of code, or more than 1.4 million words at once.
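To put those numbers in perspective, the article’s own figures imply roughly 0.7 words per token. A minimal sketch of that back-of-envelope conversion (the ratio is derived from the quoted numbers, not an official tokenizer measurement):

```python
# Back-of-envelope conversion between Gemini's token budget and word count,
# using the ratio implied by the article's figures (2M tokens ~ 1.4M words).
# This is a rough estimate, not an official tokenizer measurement.

CONTEXT_TOKENS = 2_000_000   # planned expanded context window
CONTEXT_WORDS = 1_400_000    # words the article says that window can hold

def approx_words(tokens: int) -> int:
    """Estimate the English word count that fits in a given token budget."""
    return tokens * CONTEXT_WORDS // CONTEXT_TOKENS  # exact integer math

print(approx_words(2_000_000))  # the full expanded window, in words
print(approx_words(1_000_000))  # today's one-million-token window
```

Real tokenizers vary by language and content type, so treat this purely as a rule of thumb for sizing inputs.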


Project Astra
Project Astra was also showcased, representing an early iteration of a universal AI-powered assistant. Described by Google’s DeepMind CEO Demis Hassabis as Google’s equivalent of an AI agent for everyday use, Astra was demonstrated through a video filmed in a single take.

In the video, a user navigates Google’s London office with their phone, engaging in natural conversations with the app while pointing the camera at objects such as a speaker, code on a whiteboard, and the view outside a window.

Notably, Astra accurately recalls where the user left their glasses without being asked to track them. The video concludes with the revelation that the glasses contain an onboard camera system and can seamlessly interact with Project Astra, hinting that Google may be developing smart glasses comparable to Meta’s Ray-Ban smart glasses.

Google Photos answers everything
Google Photos has stepped up its game with AI, making it even smarter at finding specific images or videos. If you’re a Google One subscriber in the US, you can soon ask Google Photos complex questions like “Show me the best photo from each national park I’ve visited.”

This feature, rolling out over the next few months, will use GPS info and Google’s own judgment to present you with options. Additionally, you can ask Google Photos to generate captions for your photos to share on social media.

Imagine Imagen 3 and Veo
Google has introduced two new AI-powered media creation engines: Veo and Imagen 3. Veo is Google’s response to OpenAI’s Sora, capable of producing high-quality 1080p videos that can last longer than a minute. It’s equipped to understand cinematic concepts like timelapses.

Imagen 3, on the other hand, is a text-to-image generator that Google claims surpasses its previous version, Imagen 2. This upgraded model produces the company’s highest-quality text-to-image results with an incredible level of detail, creating photorealistic, lifelike images with fewer flaws. This places it in competition with OpenAI’s DALL-E 3.

Search reborn
Google is revolutionising the core functionality of Search with significant updates. Many of the announced features, such as the ability to ask intricate questions or plan meals and trips using Search, will only be accessible by opting into Search Labs, a platform for testing experimental features.

A major new addition called AI Overviews, which Google has been testing for a year, is now rolling out to millions of users in the US. This feature will display AI-generated answers at the top of search results by default and is slated to reach over a billion users worldwide by year’s end.

Android 15 gets painted with AI
Google is directly integrating Gemini into the Android operating system. With the release of Android 15 later this year, Gemini will be contextually aware of the app, image, or video being used, allowing users to summon it as an overlay for specific inquiries. The implications for Google Assistant, which currently fulfils similar functions, remain unclear as it was not addressed in today’s keynote.

Other updates include the addition of digital watermarks to AI-generated content, the integration of Gemini into the side panel of Gmail and Docs, the implementation of a virtual AI teammate in Workspace, real-time detection of potential scams during phone calls, and various other enhancements.
