Sergey has a new habit. He talks to Gemini Live while driving, discussing things like data center power and cost. It’s classic Google dogfooding — obsessively testing your own product. It reminds me of Bill Gates removing his car radio so he could think about Microsoft nonstop. Every founder...

652,612 views • 5 months ago • via X (Twitter)

Related Videos

Learning from Human Demonstrations: Show the Robot How to Act!

The pipeline is very similar to older experiments using Gemini & pi0 with LeRobot. Pi-zero runs locally, while Gemini Flash generates the affordances and the high-level task. (More details are in the thread.)

The new component is learning from demonstrations via Gemini 2.5 Pro. I capture a video while demoing & take one of the last frames. Gemini 2.5 Pro then extracts the instructions & passes them to Gemini Flash to process the scene.

The fun part is that there's no fancy insight that came from me, other than the days spent figuring out the right prompts. It's the bitter lesson hitting you in the face -> Enhanced Gemini capabilities make this possible. For example, Gemini Flash cannot do Russian doll stacking, but Gemini 2.5 Pro can do it consistently.

The current limitation is low-level manipulation:
- As you can see, I'm aligning the objects so they are easy to grasp using the same technique from the training data. I couldn't get Gemini Flash to consistently output an accurate grasping angle, and Gemini 1.5 Pro was too expensive and slow for real-time deployment.
- Getting a symmetrical gripper should also help a lot. Adding rubber to the tips would probably also help prevent objects from slipping.

Collecting & curating the data was the most time-consuming & labor-intensive part.

Next, to improve low-level manipulation and make the system more real-time, I'm shifting to focus more on sims & synthetic data. This aligns better with my core competence. I'm open to tips and suggestions.

Shreyas Gite

22,480 views • 1 year ago

Gemini for Android is here! This new app makes it easier to access Google's chatbot, powered by the Gemini Pro LLM, right from your Android device.

You can ask Gemini a question by tapping the launcher icon or long-pressing the power button. Once invoked, the Gemini overlay lets you enter a text or text+image prompt. You can tap the camera icon to snap a photo or press the "add this screen" button to take a screenshot of the current page to include in your prompt.

The Gemini app for Android is available on Google Play with support for English, but support for Korean and Japanese will be coming next week. While there won't be an app for iOS, iPhone users can access Gemini by opening the Google App and tapping the "Gemini" button up top.

How can you long-press the power button to invoke Gemini if that gesture is handled by Google Assistant? The answer is that the Gemini Android app can replace Google Assistant as your default assistant if you want, meaning all the ways you'd normally invoke Google Assistant on your phone can instead invoke Gemini. Unfortunately, Gemini currently doesn't offer ALL the same functionality as Google Assistant and requires an active data connection, but more features will be added over time.

There's also now a Gemini Advanced tier, which offers access to Google's most powerful Ultra 1.0 LLM. Access to Gemini Advanced requires a subscription to the new $20/month "AI Premium" Google One plan, which offers the same benefits as the 2TB plan but adds access to Gemini Advanced and soon Gemini features in Gmail, Docs, & other Workspace apps (formerly under the Duet AI umbrella). Gemini Advanced is available in English on the web.

Mishaal Rahman

38,428 views • 2 years ago