
Google AI Developers
@googleaidevs • 114,203 subscribers
AI for every developer. So what will you build?

We’re expanding the Gemini API File Search tool 🔍 with 3 new updates that enable developers to more easily build multimodal RAG systems with enhanced precision:
+ Multimodal Support: By leveraging our Gemini Embedding 2 model, File Search can now reason across images and text simultaneously.
+ Custom Metadata Filtering: Bring structure to unstructured data by tagging files with custom key-value labels. This pre-filters your data and boosts search speed.
+ Exact Citations: File Search can now capture and return the exact source (down to the page number) for every piece of information indexed.
See multimodal File Search in action with our example app in Google AI Studio. Chat with your entire image and doc library, ask questions, and trace answers back to the source.
Google AI Developers • 107,790 views • 29 days ago
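
The custom metadata filtering idea from the File Search post above can be illustrated with a minimal, self-contained sketch. It does not call the Gemini API: the `search` function, the toy three-dimensional vectors, the file names, and the `team`/`year` labels are all hypothetical, chosen only to show how key-value labels can narrow the candidate set before similarity ranking.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(docs, query_vec, metadata_filter, top_k=2):
    """Pre-filter docs by exact key-value match, then rank by similarity."""
    # Step 1: keep only docs whose labels match every filter entry.
    candidates = [
        d for d in docs
        if all(d["metadata"].get(k) == v for k, v in metadata_filter.items())
    ]
    # Step 2: rank the (smaller) candidate set by cosine similarity.
    ranked = sorted(candidates, key=lambda d: cosine(d["vector"], query_vec),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical corpus: vectors are stand-ins for real embeddings.
DOCS = [
    {"name": "q1_report.pdf", "metadata": {"team": "finance", "year": 2024},
     "vector": [0.9, 0.1, 0.0]},
    {"name": "q2_report.pdf", "metadata": {"team": "finance", "year": 2025},
     "vector": [0.8, 0.2, 0.1]},
    {"name": "logo.png", "metadata": {"team": "design", "year": 2025},
     "vector": [0.95, 0.05, 0.0]},
]

results = search(DOCS, [1.0, 0.0, 0.0], {"team": "finance"})
names = [d["name"] for d in results]
print(names)
```

In this toy setup `logo.png` is the closest vector to the query, yet it never reaches the ranking step because its `team` label does not match the filter. Only the two finance files are scored.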

Gemini 3 Flash now uses an agentic "think-act-observe" loop to solve complex visual tasks 🤖 Google DeepMind engineer Paul Ruiz demonstrates how the model runs Python code automatically to zoom and inspect items, annotate images, and re-visualize data into charts.
Google AI Developers • 106,695 views • 3 months ago

💎 Gemma 4 31B can leverage an ADK Agent and code execution sandbox to autonomously navigate complex, ambiguous tasks. Follow along as this demo showcases:
+ Zero-shot code generation
+ Tool usage
+ Multi-step debugging and recovery
+ “Learned” multimodal output
Google AI Developers • 39,423 views • 1 month ago

Introducing EmbeddingGemma: our new open, state-of-the-art embedding model designed for on-device AI 📱
Google AI Developers • 153,735 views • 9 months ago

✨ What makes Gemini 2.5 Pro stand out? In this Release Notes episode, Sr. Product Manager Logan Kilpatrick and Gemini Product Lead Tulsee Doshi break down its reasoning, coding, and multimodal strengths, plus a 1M token long context.
Timecodes:
1:05 Gemini 2.5 launch overview
3:19 Academic evals vs. vibe checks
6:19 The jump to 2.5
7:51 Coordinating cross-stack improvements
11:48 Role of pre/post-training vs. test-time compute
13:21 Shipping Gemini 2.5
15:29 Embedded safety process
17:28 Multimodal reasoning with Gemini 2.5
18:55 Benchmark deep dive
22:07 What’s next for Gemini
24:49 Dynamic thinking in Gemini 2.5
25:37 The team effort behind the launch
Google AI Developers • 227,343 views • 1 year ago

Watch this ADK Agent powered by Gemma 4 31B use the Google Maps MCP Server to build a custom food tour in this demo 🍜📍 See the model in action with:
- Multi-step reasoning (solving complex routes)
- Multimodal logic (processing visuals and text)
- Live data integration (direct connection to Google Maps)
Google AI Developers • 28,084 views • 1 month ago

Build AR applications that recognize physical objects and provide real-time spatial guidance. Stijn Spanhove and Pavlo Tkachenko used Gemini 2.5 Pro’s multimodal vision and sound effect prompting capabilities to create an immersive experience with LEGO Smart Bricks and Snap Spectacles.
Google AI Developers • 47,780 views • 3 months ago



