
Fei Xia
@xf1280 • 11,656 subscribers
Ex Research Scientist, TLM at @GoogleDeepMind, ✨♊, Gemini & Robotics, PhD from @StanfordAILab @StanfordSVL, previously @Tsinghua_Uni. #AGI through Embodiment
Videos

🚀 Excited to share that #Gemini 3 Flash can do code execution on images to zoom, count, and annotate visual inputs! The model can choose when to write code to:
🔍 Zoom & Inspect: Detect when details are too small and zoom in.
🧮 Compute Visually: Run multi-step calculations using code (e.g., summing line items on a receipt).
✏️ Annotate: Draw arrows or bounding boxes to answer questions or show relationships between objects.
Fei Xia • 19,181 views • 5 months ago
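
As a rough illustration of the kind of code the post describes, here is a minimal Python sketch of two of the listed steps: summing receipt line items and computing a padded crop region to zoom into. It is not the model's actual output or any Gemini API. The names `line_items`, `receipt_total`, and `zoom_box`, along with all values, are hypothetical and chosen only for this example.

```python
from decimal import Decimal

# Hypothetical line items, as if read off a receipt image (made-up values).
line_items = [("coffee", "3.50"), ("bagel", "2.25"), ("juice", "4.00")]


def receipt_total(items):
    """Sum prices with Decimal so cents are not lost to float rounding."""
    return sum((Decimal(price) for _, price in items), Decimal("0"))


def zoom_box(box, image_size, margin_pct=10):
    """Pad a (left, top, right, bottom) box by margin_pct percent of its
    width and height, then clamp it to the image bounds."""
    left, top, right, bottom = box
    width, height = image_size
    pad_x = (right - left) * margin_pct // 100
    pad_y = (bottom - top) * margin_pct // 100
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


total = receipt_total(line_items)
crop = zoom_box((100, 50, 300, 150), (320, 240))
print(total, crop)
```

The resulting `crop` tuple uses the (left, top, right, bottom) convention, so it could be passed to an image library's crop routine before upscaling the region for a closer look.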