
Google AI
@GoogleAI • 2,416,729 subscribers
Making AI helpful for everyone.

We partnered with artists, designers, and builders to create new AI tools that solve real problems in their creative workflows. Here’s what’s new:
- Introducing Google Pics in Google Workspace: A brand-new image creation & editing tool. Move and resize objects, add text, and translate just by hovering and clicking
- Big updates to Google Flow:
  1) You can now create with Gemini Omni Flash in Google Flow
  2) Google Flow Agent is a multi-step creative partner that reasons and plans complex tasks with you
  3) Google Flow tools are custom tools you can “vibe code” for animations, video effects, text layering & more
- Design live with Stitch by Google: Now, you can use text or voice prompts to edit layouts in real time, then export those designs straight to code
- More creative control in Google Flow Music: Edit songs section by section, remix the style of full songs, and create music videos with our new Gemini Omni Flash model
Google AI • 13,052,428 views • 14 days ago

We were able to sit down with the Google DeepMind team behind the new Gemini Omni Flash model to hear all of their behind-the-scenes stories, memorable moments, and many, many (occasionally embarrassing) video generations. Watch the full Release Notes episode here:
Google AI • 7,899,389 views • 15 days ago

We’re launching a brand new, full-stack vibe coding experience in Google AI Studio, made possible by integrations with the Google Antigravity coding agent and Firebase backends. This unlocks:
- Full-stack multiplayer experiences: Create complex, multiplayer apps with fully-featured UIs and backends directly within AI Studio
- Connection to real-world services: Build applications that connect to live data sources, databases, or payment processors and the Antigravity agent will securely store your API credentials for you
- A smarter agent that works even when you don't: By maintaining a deeper understanding of your project structure and chat history, the agent can execute multi-step code edits from simpler prompts. It also remembers where you left off and completes your tasks while you’re away, so you can seamlessly resume your builds from anywhere
- Configuration of database connections and authentication flows: Add Firebase integration to provision Cloud Firestore for databases and Firebase authentication for secure sign-in
This demo displays what can be built in the new vibe coding experience in AI Studio. Geoseeker is a full-stack application that manages real-time multiplayer states, compass-based logic, and an external API integration with Google Maps 🕹️
Google AI • 4,732,468 views • 2 months ago

Introducing Gemini 3.1 Pro 🚀 3.1 Pro represents a major step forward in core reasoning. It scored 77.1% (more than doubling 3 Pro’s score) on ARC-AGI-2, the benchmark that evaluates a model's ability to solve new logic patterns and work through challenges it hasn’t encountered before. This demo illustrates how the model can go beyond the prompt. Instead of rendering a video or static graphic, 3.1 Pro codes a full environment, integrating generative audio and providing UI controls.
Google AI • 3,066,821 views • 3 months ago

Today, we’re launching Gemma 4, our most intelligent open models to date. Built with the same breakthrough technology as Gemini 3, Gemma 4 brings advanced reasoning to your personal hardware and devices. Here’s what Gemma 4 unlocks for developers:
- Intelligence-per-parameter: Our 31B (Dense) and 26B (MoE) models deliver state-of-the-art performance for their size, outcompeting models 20x their size on Arena.ai
- Commercial flexibility: Released under a permissive Apache 2.0 license for complete developer flexibility and digital sovereignty
- Agentic workflows: Native support for function-calling and structured JSON output allows you to build reliable, autonomous agents
- Multimodal edge AI: The E2B and E4B models bring native vision, audio, and low latency to mobile and IoT devices
- Long-context reasoning: Up to 256K context windows allow you to process entire repositories or large documents in a single prompt
Whether you're building global applications in 140+ languages or local-first AI code assistants, Gemma 4 is built to be your foundation. Explore in Google AI Studio or download the weights on Hugging Face, Kaggle, and Ollama.
Google AI • 1,668,095 views • 2 months ago

We wanted to see if we could take simple, physical materials (like cardboard and markers) and use AI to bring them to life. What was the result? A short film starring a bunch of TPUs getting ready for the big stage at Google I/O 2026! Working with director Laurie Rowan and Nexus Studios, we kept human artistry at the center of the film by blending puppetry and 3D animation with our models to do the following ↓
- Nano Banana: Generated beautifully stylized first frames from the raw puppet footage and basic 3D animations.
- Google AI Studio: Built a custom tool inside the platform to test these frames at scale, ensuring pixel-perfect consistency.
- Gemini Omni & experimental Google DeepMind Models: Merged the base animation and stylized frames to elevate the final piece to a cinematic level.
Our AI pipelines were specifically designed to protect the crafty details that give these films their heart, like the tiny human imperfections of puppetry, or the nuance an animator can build into an expression.
Google AI • 91,338 views • 6 days ago

Hear the architects of Gemini reflect on their journey to continue pushing the frontier of AI, on this episode of Release Notes. Jeff Dean, Koray Kavukcuoglu, Oriol Vinyals, and Noam Shazeer sit down on camera together to share a behind-the-scenes look at the people behind the model, and how they saw the vision come together.
Google AI • 43,772 views • 5 days ago

Listen up 🔊 We’ve made some updates to our Gemini Audio models and capabilities:
- Gemini’s live speech-to-speech translation capability is rolling out in a beta experience to the Google Translate app, bringing you real-time audio translation that captures the nuance of human speech
- Gemini 2.5 Flash and 2.5 Pro Text-to-Speech preview models bring improved adherence to style prompts, precision pacing with context-aware speed adjustments, and character voice consistency for multi-speaker scenarios
- Gemini 2.5 Flash Native Audio is now updated, with improvements to handle complex workflows, navigate user instructions, and hold natural conversations
Google AI • 1,091,145 views • 5 months ago

Three years ago, Gemini started by understanding the world. With Gemini 2, models learned to think and reason. Late last year, Gemini 3 brought any idea to life. Today, we’re continuing that journey with our Gemini 3.5 series, starting with Gemini 3.5 Flash, delivering frontier performance for agents and coding.
Google AI • 68,043 views • 15 days ago

Today we launched Gemini 3.1 Flash TTS, our most expressive and controllable text-to-speech model yet. This launch [excitement] includes audio tags! 🗣🏷 Audio tags [explanatory] are a seamless way to guide vocal style, pace, and delivery using natural language commands embedded directly in your text. Want a different tempo or tone? [amazement] Just tag the audio to steer the AI-speech output! The model supports 70+ languages (24 of which are high-quality evaluated languages, including: Japanese, Hindi, and Arabic). Watch the audio tags in action in the demo below ↓
Google AI • 201,290 views • 1 month ago

Today we’re taking a big step on the path toward AGI and releasing Gemini 3, our most intelligent model yet. With Gemini 3, you can bring any idea to life. It is state-of-the-art in reasoning, the best model in the world for multimodal understanding, and our best agentic and vibe coding model.
Google AI • 492,685 views • 6 months ago

New upgrades to Google Gemini are helping you get more done:
✨ Gemini Spark is your 24/7 personal AI agent that can take action on your behalf, under your direction. It seamlessly integrates with Gmail, Google Docs, and Slides to automate your workflows and, best of all, it can keep working even when your laptop is closed.
☀️ Daily Brief is our newest out-of-the-box agent that gives you a personalized digest based on your goals, and suggests next steps.
Daily Brief is rolling out starting today to all Google AI subscribers (18+) in the Gemini app, starting in the US. Gemini Spark is starting to roll out next week.
Google AI • 37,067 views • 15 days ago

Today we are launching Nano Banana Pro, the world’s best image model built to move beyond casual creation and into a new era of studio-quality, functional design. Nano Banana Pro enables a new level of precision and creative control, transforming the way you bring ideas to life. Here are a couple of our favorite new features:
- Text rendering and translation: Generate crystal-clear text directly within your images. With the model’s advanced language understanding, you can even translate and regenerate visuals with localized text.
- World knowledge: By connecting to Search’s vast knowledge base, Nano Banana Pro generates factually accurate diagrams and realistic product placements, making it an invaluable tool for learning and communication.
Google AI • 337,397 views • 6 months ago

Gemini 3 Deep Think is getting an upgrade 🧠 By blending deep scientific knowledge with advanced engineering utility, Deep Think now moves beyond abstract theory to drive practical applications. Researchers are already using it to accelerate their work in the real world:
- Materials science: A university lab used Deep Think to optimize the growth of complex crystals that are candidates for high-temperature semiconductors
- Mechanical engineering: A researcher demonstrated how Deep Think can be used to iterate on physical prototypes at the speed of software. When applied to things like assistive devices, this pace means faster improvements for life-changing technology (more information on this use case in the video below!)
The updated Deep Think is available in Google Gemini for Google AI Ultra subscribers.
Google AI • 194,699 views • 3 months ago

Wyclef Jean 🤝 Google DeepMind: The rhythm of collaboration 🥁💻 Legendary Grammy-winning music producer and artist Wyclef Jean partnered with Google DeepMind and YouTube to explore Music AI Sandbox, a suite of tools for professional musicians and producers that want to experiment with AI as a collaboration partner. Watch this behind-the-scenes video to hear Wyclef explore the potential of AI-assisted artistry and discuss how he utilized our tools when creating his song "Back from Abu Dhabi."
Google AI • 156,180 views • 3 months ago

Our animated short film "Dear Upstairs Neighbors" is previewing today at Sundance Film Festival! In creating this film, our Google DeepMind team of Pixar alumni, an Academy Award winner, researchers, and engineers designed new AI capabilities specifically for filmmakers. Here’s how these AI capabilities played a supporting role (pun intended) to human-led creativity:
- Custom Training: Veo and Imagen models were trained on the team's original artwork and paintings
- Creative control: The team created the story, then AI was used to transform rough animations into stylized videos
- Precision editing: The technology allowed the artists to make specific edits without needing to recreate entire shots
Learn more about how "Dear Upstairs Neighbors" was created in this behind-the-scenes video with director Connie He 🍿
Google AI • 179,720 views • 4 months ago

We’re expanding the Gemini 3 family with the launch of Gemini 3 Flash. This model:
- Combines Gemini 3’s Pro-grade reasoning with Flash-level latency, efficiency, and cost
- Delivers frontier-level performance on PhD-level reasoning and knowledge benchmarks
- Is our most impressive model for agentic workflows
Watch Gemini 3 Flash handle hundreds of function calling options at low latency, compiling global recipes from 100 ingredients and 100 kitchen tools.
Google AI • 229,643 views • 5 months ago

Last week, we launched Gemini 3.1 TTS, our latest and best text-to-speech model. This new model introduces [awe] audio tags, an intuitive way to guide vocal style, pace, and delivery. Here are some tips on the best ways to use audio tags in your prompts:
1. All inline tags must be enclosed in square brackets, such as [screams] or [whispers]
2. Insert these tags exactly where you want the transition to occur and make sure to avoid placing tags directly next to each other
3. Use tags like [slow] or [fast] to control the pace of the delivery, or even [short pause] or [long pause] to ramp up the anticipation in dramatic moments
4. The model also offers granular control over vocalizations, allowing you to direct the delivery with cues like [cackles] or [whispers]
5. An ideal audio tag formula could look something like: [encouraging] Let’s try that last sentence again to make sure that you nailed it. [slow] "L'oiseau s'est envolé." [short pause] Perfect! [laughs] You're a natural.
No matter what you’re developing, from [scholarly] a language learning tool, to [mysterious] an interactive podcast app, to [friendly] more adaptive customer service offerings, and beyond, these prompting tips will equip you to start building with Gemini 3.1 TTS.
Google AI • 57,715 views • 1 month ago
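The audio tag tips in the post above can be checked mechanically before a prompt is sent anywhere. Here is a minimal sketch in plain Python, assuming only what the post states: tags sit in square brackets, and tags should not be placed directly next to each other. The function names (`extract_tags`, `find_adjacent_tags`, `has_unbalanced_brackets`) are hypothetical helpers for illustration and are not part of any Gemini API. Treating "directly next to each other" as "separated only by whitespace" is also an assumption.

```python
import re

# Matches one inline audio tag such as [whispers] or [short pause].
# Assumption: a tag is any non-empty text between one "[" and the next "]".
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_tags(text: str) -> list[str]:
    """Return the tag names found in the text, in order of appearance."""
    return [match.group(1) for match in TAG_PATTERN.finditer(text)]


def find_adjacent_tags(text: str) -> list[tuple[str, str]]:
    """Return pairs of consecutive tags separated only by whitespace.

    Tip 2 in the post advises against placing tags next to each other.
    """
    matches = list(TAG_PATTERN.finditer(text))
    pairs = []
    for first, second in zip(matches, matches[1:]):
        between = text[first.end():second.start()]
        if between.strip() == "":
            pairs.append((first.group(1), second.group(1)))
    return pairs


def has_unbalanced_brackets(text: str) -> bool:
    """Return True if a stray "[" or "]" remains once well-formed tags are removed."""
    remainder = TAG_PATTERN.sub("", text)
    return "[" in remainder or "]" in remainder


if __name__ == "__main__":
    # The example formula from tip 5 of the post.
    example = (
        "[encouraging] Let's try that last sentence again to make sure that "
        "you nailed it. [slow] \"L'oiseau s'est envolé.\" [short pause] "
        "Perfect! [laughs] You're a natural."
    )
    print(extract_tags(example))
    print(find_adjacent_tags(example))
    print(has_unbalanced_brackets(example))
```

This only lints the prompt text. It says nothing about which tags the model recognizes, which the post does not enumerate.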

Last month, we released Lyria 3, enabling you to create tracks with lyrics from text, image, or video prompts. Now, we’re introducing Lyria 3 Pro, which expands upon our music generation model to offer additional advanced capabilities. What’s really special about this upgrade is that the model now understands the architecture of music. This makes it possible to prompt for intros, verses, choruses, and bridges + generate songs with more complex transitions. You can also create tracks up to 3 minutes long, a big change from previous models that were limited to 30-second tracks. Use Lyria 3 Pro to build upon your existing creativity. We’re excited for your beats to drop 🎶
Google AI • 81,170 views • 2 months ago

You might be wondering... How does Project Genie work?🤔 Great question. Project Genie is a prototype web app powered by several of our most advanced AI models, each bringing a unique capability to the equation. From Genie 3's ability to simulate the physics and interactions of any scenario, to Nano Banana's sketching and design, to Gemini's advanced world knowledge and reasoning, this combination results in Project Genie's unique, interactive, (and really cool) user experience. But don't just take our word for it, try it today at:
Google AI • 136,490 views • 4 months ago


