Google AI

@GoogleAI • 2,430,923 subscribers

Making AI helpful for everyone. Show thinking ↓

Shorts

By now, you've probably heard about Gemini Omni, our new model designed to create anything from any input, starting with video. But... what's the big deal? Let’s break it down 🧵👇

226,479 views

Announcing Personal Intelligence, a more personalized Google Gemini designed just for you. How it works: — Customized: With your permission, it reasons across your Gmail, YouTube, Google Photos, and Search apps to share hyper-relevant and context-aware responses — Secure: If enabled, you control which Google apps to connect to. This setting is off by default — Useful: From travel plans based on your Google Photos to gym recommendations based on goals you’ve shared with Gemini, you get help tailored to your world. Personal Intelligence in beta is rolling out to Google AI Pro and AI Ultra subscribers in the U.S., with expansions to the free tier, more countries, and AI Mode in Search to come. Take a look at the Gemini app's personalized assistance in the clip below, then let us know what you would use it for!

320,325 views

Beyond generating high-fidelity visuals, we wanted to test the limits of what Nano Banana Pro can do. We worked with design partners Porto Rocha to build out a hypothetical brand called YOYOYO to see how the model would handle the task. Here’s what we found: 🎨 Brand consistency: Across logos, colors, and typography, the model maintained a strict, cohesive brand identity (even for wildly diverse concepts). 🛍️ Environmental realism: We asked to see the products in storefront and studio mockups. It nailed accurate lighting, shadows, and physical proportions, even when upscaled for massive retail displays. 🪀 Spatial accuracy: We tested spatial volumes for physical packaging. The generated proportions were so precise that we were able to 3D-print the functional yo-yo. How have you been pushing the limits of Nano Banana Pro? Let us know in the replies below!

123,192 views

Today, we launched a brand-new intelligent Search box. Here's what that means: an upgrade to the Search experience with our most advanced Gemini 3.5 models, bringing with them our latest agentic capabilities. You can ask across modalities (text, images, files, and videos), and Search can reason across them all. We're combining AI Overviews and AI Mode into one seamless AI Search experience, so you can ask follow-up questions, build context, and receive even more tailored and personalized responses. This new AI Search experience is live today across desktop and mobile, worldwide.

44,591 views

Meet Gemma 3n, a model that runs on as little as 2GB of RAM 🤯 It shares the same architecture as Gemini Nano, and is engineered for incredible performance. We added audio understanding, so now it’s multimodal, fast and lean, and runs on-device (no cloud connection required!)

228,686 views

Graph clustering merges similar items into groups to better understand relationships in data. Today, read about our recent works, including key techniques that enabled us to scale a high-quality algorithm that can cluster trillion-edge graphs. Read more →

265,411 views

Introducing SEEDS, our newest generative AI technology that advances medium-range weather forecasting. We can now generate ensemble forecasts more efficiently, helping us better predict rare and extreme weather events. 🌩️ #WeatherForecasting Learn more at

205,778 views

Quantum computers offer many promising applications dependent on greatly improved performance. Read how we’ve combined quantum error correction w/ our latest superconducting processor, Willow, exponentially reducing error rates w/ increasing qubit scale →

146,774 views

Today we introduced AlphaGenome, a new tool that can more comprehensively predict the impact of single variants or mutations in DNA 🧬 How, you ask? 🤔 tldr; Our AlphaGenome model takes a long DNA sequence as input, processes that data, and predicts thousands of molecular properties by characterizing its regulatory activity. For the full read ➡️

83,048 views

What’s MedGemma? 🤔 It’s our collection of open, multimodal medical models that are designed to help developers build AI tools for healthcare, such as analyzing radiology images or summarizing notes for physicians. We built this demo using MedGemma to help showcase the possibilities of the model. What other use cases can you foresee for this technology?

65,083 views

We trained our Large Sensor Model (LSM) on over 40 million hours of de-identified multimodal sensor data from 165K users to demonstrate how it could improve performance in wearable tasks like exercise and activity recognition. Here’s what we found →

63,629 views

That dog in your photo? He's got something to say. 🐶 Turn your images into eight-second video clips with sound effects and speech in the Google Gemini app and Flow from Google Labs. This feature uses Veo 3 to generate motion that reflects real-world physics and includes a new experimental audio capability so you can really bring your images to life. Try it at and

55,236 views

Chrome 🧵4/5 We’ve also built a deeper integration between Gemini in Chrome and your favorite Google apps, like Calendar, YouTube and Maps, so you can schedule meetings, see location details and more without leaving the page you’re on.

39,027 views

It’s no secret the human brain is a complex structure. Even so, #AI has emerged as a powerful tool to map out its complicated pathways. Discover the advancements our Connectomics team & Harvard University researchers are making to understand the brain →

63,985 views

Here are some really tactical ways to optimize the First and last frame capability: — Make sure your prompt includes precise camera motion descriptions for smooth and creative transitions between the first and last frames. — Or, you can simply include the word “transform” in your prompt for a smooth transition — Starting with a closeup in your first frame and then zooming out to a wide-shot in the final frame is an effective way to create a dramatic reveal.

28,595 views

We’re proud to highlight Google Research’s contributions to improving Clear Calling, the background noise reduction feature on Pixel, which can now handle full-band audio & is powered by an audio-to-audio ML model that was optimized to run at low latency on Google Tensor.

69,420 views

Alright, now that we know what an agent is, how does it actually work? When you ask for help on a task, the agent plans a series of steps and executes them directly in the application on your behalf, using the tools it has access to. Say you are booking a local service or trying to organize your inbox (which typically takes multiple steps): the AI model first plans how to achieve the task using its existing knowledge and then interacts with your inbox to execute the task. The agent will continue until it is confident the task has been successfully completed.

22,487 views

We translated the endurance of competitive hot dog eating into a game by prompting Gemini to “Create a HTML, CSS, Javascript hot dog eating contest game. Game mechanics is user needs to click super fast to eat each hotdog. Add a glass of water to help digest and allowing for faster eating when a user eats too many hotdogs. Timer is 1 minute.“ The Google Gemini App built the game mechanics in a single prompt, so the rest of our vibe coding focused on refining UI design with Gemini. Play here:

31,465 views

Population dynamics can provide insights into domains ranging from health to environmental science. Here we introduce a geospatial foundation model (plus embeddings and code recipes) that could be employed for a variety of downstream tasks. →

41,911 views

Videos

0:44

We partnered with artists, designers, and builders to create new AI tools that solve real problems in their creative workflows. Here’s what’s new: — Introducing Google Pics in Google Workspace: A brand-new image creation & editing tool. Move and resize objects, add text, and translate just by hovering and clicking — Big updates to Google Flow: 1) You can now create with Gemini Omni Flash in Google Flow 2) Google Flow Agent is a multi-step creative partner that reasons and plans complex tasks with you 3) Google Flow tools are custom tools you can “vibe code” for animations, video effects, text layering & more — Design live with Stitch by Google: Now, you can use text or voice prompts to edit layouts in real time, then export those designs straight to code — More creative control in Google Flow Music: Edit songs section by section, remix the style of full songs, and create music videos with our new Gemini Omni Flash model

Google AI

13,952,992 views • 1 month ago

1:29

We were able to sit down with the Google DeepMind team behind the new Gemini Omni Flash model to hear all of their behind-the-scenes stories, memorable moments, and many, many (occasionally embarrassing) video generations. Watch the full Release Notes episode here:

Google AI

13,118,427 views • 2 months ago

0:25

Today, we released Gemini 3.5 Live Translate, our latest audio model for live speech-to-speech translation. It supports over 70 languages and starts translating as soon as you start talking, streaming translations while listening to what you say next. No awkward pauses or choppy audio, just real connection without language barriers. So, how does it work? 🤔 The model is able to make split-second decisions to juggle speed and translation quality so conversations actually feel fluid, human, and natural. In order to do this, the model must receive and contextualize the input while simultaneously outputting the translated speech. Through this process, Gemini 3.5 Live Translate manages to stay mere seconds behind each speaker and can even maintain pacing, pitch, and intonation across extended sessions. See it in action below, or try it yourself in the Google Translate app on iOS & Android.

Google AI

3,977,250 views • 1 month ago

0:33

We’re launching a brand-new, full-stack vibe coding experience in Google AI Studio, made possible by integrations with the Google Antigravity coding agent and Firebase backends. This unlocks: — Full-stack multiplayer experiences: Create complex, multiplayer apps with fully-featured UIs and backends directly within AI Studio — Connection to real-world services: Build applications that connect to live data sources, databases, or payment processors and the Antigravity agent will securely store your API credentials for you — A smarter agent that works even when you don't: By maintaining a deeper understanding of your project structure and chat history, the agent can execute multi-step code edits from simpler prompts. It also remembers where you left off and completes your tasks while you’re away, so you can seamlessly resume your builds from anywhere — Configuration of database connections and authentication flows: Add Firebase integration to provision Cloud Firestore for databases and Firebase authentication for secure sign-in. This demo displays what can be built in the new vibe coding experience in AI Studio. Geoseeker is a full-stack application that manages real-time multiplayer states, compass-based logic, and an external API integration with Google Maps 🕹️

Google AI

4,743,127 views • 4 months ago

0:47

Introducing Gemini 3.1 Pro 🚀 3.1 Pro represents a major step forward in core reasoning. It scored 77.1% (more than doubling 3 Pro’s score) on ARC-AGI-2, the benchmark that evaluates a model's ability to solve new logic patterns and work through challenges it hasn’t encountered before. This demo illustrates how the model can go beyond the prompt. Instead of rendering a video or static graphic, 3.1 Pro codes a full environment, integrating generative audio and providing UI controls.

Google AI

3,067,002 views • 4 months ago

2:51

Today, we’re launching Gemma 4, our most intelligent open models to date. Built with the same breakthrough technology as Gemini 3, Gemma 4 brings advanced reasoning to your personal hardware and devices. Here’s what Gemma 4 unlocks for developers: — Intelligence-per-parameter: Our 31B (Dense) and 26B (MoE) models deliver state-of-the-art performance for their size, outcompeting models 20x their size on Arena.ai — Commercial flexibility: Released under a permissive Apache 2.0 license for complete developer flexibility and digital sovereignty — Agentic workflows: Native support for function-calling and structured JSON output allows you to build reliable, autonomous agents — Multimodal edge AI: The E2B and E4B models bring native vision, audio, and low latency to mobile and IoT devices — Long-context reasoning: Up to 256K context windows allow you to process entire repositories or large documents in a single prompt. Whether you're building global applications in 140+ languages or local-first AI code assistants, Gemma 4 is built to be your foundation. Explore in Google AI Studio or download the weights on Hugging Face, Kaggle, and ollama.

Google AI

1,669,327 views • 3 months ago

0:27

Step into the map with the Street View grounding feature in Project Genie from Google DeepMind and Google Labs. Announced at I/O, this research prototype uses locations from Google Maps Street View as a foundation, letting you generate and explore interactive, 360-degree virtual environments from just a text prompt or real-world starting place. Cool, right? But… How does it actually work? 🤔 As an experimental tool, Project Genie tackles the "blank space" problem (showing both what’s in front of the camera and behind it) by utilizing Street View data to realistically generate a 360-degree view of the location you selected as the starting point for your world to generate from. Worlds generated by Genie are far more dynamic and rich because they’re created frame-by-frame based on the world description and user actions. By predicting each subsequent frame, Genie is able to simulate what it looks like to swim across an ocean, or hike to the top of a peak, marking a massive shift in interactive media and simulation pipelines. What real-world place would you want to step into and explore?

Google AI

72,964 views • 8 days ago

0:27

We’re shipping two major updates to streamline your creative workflow, allowing you to generate high-speed images with one model and then instantly animate them with the other—all at a fraction of the cost 🍌⚡️ 1️⃣ Introducing Nano Banana 2 Lite: Our fastest and most cost-efficient Gemini Image model yet delivers text-to-image outputs in under 4 seconds. Now available via the Gemini API and Google AI Studio, and rolling out soon across @NotebookLM, Google Flow, Google Gemini, Stitch by Google, Google Search and Google Photos. 2️⃣ Gemini Omni Flash in Public Preview: Our natively multimodal model for cost-efficient video generation and conversational editing. Now available via the Gemini API, Google AI Studio, and Gemini Enterprise Agent Platform so you can integrate the model into your workflow. While exciting on their own, the real magic happens when you build using these models together. Watch how our interior design demo integrates Nano Banana 2 Lite and Omni to instantly reimagine any space. Upload a photo, swipe through tailored design concepts, and see Omni bring the details to life in cinematic motion. Try out the demo app in AI Studio:

Google AI

119,113 views • 18 days ago

2:50

Listen up 🔊 We’ve made some updates to our Gemini Audio models and capabilities: — Gemini’s live speech-to-speech translation capability is rolling out in a beta experience to the Google Translate app, bringing you real-time audio translation that captures the nuance of human speech — Gemini 2.5 Flash and 2.5 Pro Text-to-Speech preview models bring improved adherence to style prompts, precision pacing with context-aware speed adjustments, and character voice consistency for multi-speaker scenarios — Gemini 2.5 Flash Native Audio is now updated, with improvements to handle complex workflows, navigate user instructions, and hold natural conversations

Google AI

1,092,358 views • 7 months ago

2:58

We wanted to see if we could take simple, physical materials (like cardboard and markers) and use AI to bring them to life. What was the result? A short film starring a bunch of TPUs getting ready for the big stage at Google I/O 2026! Working with director Laurie Rowan and Nexus Studios, we kept human artistry at the center of the film by blending puppetry and 3D animation with our models to do the following ↓ Nano Banana: Generated beautifully stylized first frames from the raw puppet footage and basic 3D animations. Google AI Studio: Built a custom tool inside the platform to test these frames at scale, ensuring pixel-perfect consistency. Gemini Omni & experimental Google DeepMind models: Merged the base animation and stylized frames to elevate the final piece to a cinematic level. Our AI pipelines were specifically designed to protect the crafty details that give these films their heart, like the tiny human imperfections of puppetry, or the nuance an animator can build into an expression.

Google AI

124,067 views • 1 month ago

1:14

Today we launched Gemini 3.1 Flash TTS, our most expressive and controllable text-to-speech model yet. This launch [excitement] includes audio tags! 🗣🏷 Audio tags [explanatory] are a seamless way to guide vocal style, pace, and delivery using natural language commands embedded directly in your text. Want a different tempo or tone? [amazement] Just tag the audio to steer the AI-speech output! The model supports 70+ languages (24 of which are high-quality evaluated languages, including: Japanese, Hindi, and Arabic). Watch the audio tags in action in the demo below ↓

Google AI

202,556 views • 3 months ago

0:51

Today we’re taking a big step on the path toward AGI and releasing Gemini 3— our most intelligent model yet. With Gemini 3, you can bring any idea to life. It is state-of-the-art in reasoning, the best model in the world for multimodal understanding, and our best agentic and vibe coding model.

Google AI

492,853 views • 8 months ago

1:18

Rolling out today we are launching Nano Banana Pro, the world’s best image model built to move beyond casual creation and into a new era of studio-quality, functional design. Nano Banana Pro enables a new level of precision and creative control, transforming the way you bring ideas to life. Here are a couple of our favorite new features: — Text rendering and translation: Generate crystal-clear text directly within your images. With the model’s advanced language understanding, you can even translate and regenerate visuals with localized text. — World knowledge: By connecting to Search’s vast knowledge base, Nano Banana Pro generates factually accurate diagrams and realistic product placements, making it an invaluable tool for learning and communication.

Google AI

337,633 views • 8 months ago

1:34

Gemini 3 Deep Think is getting an upgrade 🧠 By blending deep scientific knowledge with advanced engineering utility, Deep Think now moves beyond abstract theory to drive practical applications. Researchers are already using it to accelerate their work in the real world: — Materials science: A university lab used Deep Think to optimize the growth of complex crystals that are candidates for high-temperature semiconductors — Mechanical engineering: A researcher demonstrated how Deep Think can be used to iterate on physical prototypes at the speed of software. When applied to things like assistive devices, this pace means faster improvements for life-changing technology (more information on this use case in the video below!) The updated Deep Think is available in Google Gemini for Google AI Ultra subscribers.

Google AI

194,747 views • 5 months ago

1:06

We’re expanding the Gemini 3 family with the launch of Gemini 3 Flash. This model: — Combines Gemini 3’s Pro-grade reasoning with Flash-level latency, efficiency, and cost — Delivers frontier-level performance on PhD-level reasoning and knowledge benchmarks — Is our most impressive model for agentic workflows. Watch Gemini 3 Flash handle hundreds of function calling options at low latency, compiling global recipes from 100 ingredients and 100 kitchen tools.

Google AI

229,722 views • 7 months ago

2:54

Wyclef Jean 🤝 Google DeepMind: The rhythm of collaboration 🥁💻 Legendary Grammy-winning music producer and artist Wyclef Jean partnered with Google DeepMind and YouTube to explore Music AI Sandbox, a suite of tools for professional musicians and producers that want to experiment with AI as a collaboration partner. Watch this behind-the-scenes video to hear Wyclef explore the potential of AI-assisted artistry and discuss how he utilized our tools when creating his song "Back from Abu Dhabi."

Google AI

156,180 views • 4 months ago

1:06

Our animated short film "Dear Upstairs Neighbors" is previewing today at Sundance Film Festival! In creating this film, our Google DeepMind team of Pixar alumni, an Academy Award winner, researchers, and engineers designed new AI capabilities specifically for filmmakers. Here’s how these AI capabilities played a supporting role (pun intended) to human-led creativity: — Custom training: Veo and Imagen models were trained on the team's original artwork and paintings — Creative control: The team created the story, then AI was used to transform rough animations into stylized videos — Precision editing: The technology allowed the artists to make specific edits without needing to recreate entire shots. Learn more about how "Dear Upstairs Neighbors" was created in this behind-the-scenes video with director Connie He 🍿

Google AI

179,899 views • 5 months ago

1:01

Three years ago, Gemini started by understanding the world. With Gemini 2, models learned to think and reason. Late last year, Gemini 3 brought any idea to life. Today, we’re continuing that journey with our Gemini 3.5 series, starting with Gemini 3.5 Flash, delivering frontier performance for agents and coding.

Google AI

68,715 views • 2 months ago

1:36

Hear the architects of Gemini reflect on their journey to continue pushing the frontier of AI, on this episode of Release Notes. Jeff Dean, Koray Kavukcuoglu, Oriol Vinyals, and Noam Shazeer sit down on camera together to share a behind-the-scenes look at the people behind the model, and how they saw the vision come together.

Google AI

53,402 views • 1 month ago

1:09

You might be wondering... How does Project Genie work?🤔 Great question. Project Genie is a prototype web app powered by several of our most advanced AI models, each bringing a unique capability to the equation. From Genie 3's ability to simulate the physics and interactions of any scenario, to Nano Banana's sketching and design, to Gemini's advanced world knowledge and reasoning, this combination results in Project Genie's unique, interactive, (and really cool) user experience. But don't just take our word for it, try it today at:

Google AI

136,613 views • 5 months ago
