### Google DeepMind’s Gemini: Revolutionizing Office Robotics
#### Introduction
In a bustling open-plan office in Mountain View, California, a sleek, wheeled robot has been making waves as a tour guide and office assistant. This is all thanks to a significant upgrade from Google DeepMind, which has equipped the robot with a large language model to understand commands and navigate its surroundings.
#### Gemini’s Capabilities
When a human asks, “Find me somewhere to write,” the robot efficiently leads them to a whiteboard within the building. This is possible because Gemini can handle both video and text, allowing it to process large amounts of information from previously recorded video tours of the office. The robot combines Gemini with an algorithm that translates commands into specific actions, such as turning, based on what it sees.
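The pipeline described above can be sketched in miniature: a language-model call maps a request to a landmark the robot has seen, and a separate planner converts that landmark into low-level motions. This is a hypothetical illustration, not Google's actual system; the names `query_vlm`, `LANDMARKS`, and `plan_route`, and the grid-world office, are all invented for the example.

```python
from typing import List, Tuple

# Landmarks the robot might recall from its recorded video tour,
# mapped to (x, y) positions on a toy office grid (illustrative only).
LANDMARKS = {
    "whiteboard": (4, 2),
    "kitchen": (0, 5),
    "desk": (2, 1),
}

def query_vlm(command: str) -> str:
    """Stand-in for a vision language model call. A real system would
    send the spoken command plus tour footage to the model; here we
    fake the mapping with keyword matching."""
    if "write" in command.lower():
        return "whiteboard"
    if "drink" in command.lower():
        return "kitchen"
    return "desk"

def plan_route(start: Tuple[int, int], goal: Tuple[int, int]) -> List[str]:
    """Translate a goal position into discrete actions — the role of the
    navigation algorithm that works alongside the language model."""
    dx, dy = goal[0] - start[0], goal[1] - start[1]
    actions = ["move_east"] * max(dx, 0) + ["move_west"] * max(-dx, 0)
    actions += ["move_north"] * max(dy, 0) + ["move_south"] * max(-dy, 0)
    return actions

landmark = query_vlm("Find me somewhere to write")
route = plan_route((0, 0), LANDMARKS[landmark])
```

The key design point the article describes is this division of labor: the model handles open-ended language and perception, while a conventional algorithm emits the concrete motor commands.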
#### Vision Language Models
Gemini’s introduction in December marked a significant milestone. Demis Hassabis, CEO of Google DeepMind, emphasized its potential to interact with the physical world and perform useful tasks. Vision language models such as Gemini can learn an office’s layout from video captured with a smartphone camera.
#### Research and Development
Academic and industry research labs are in a race to enhance robots’ abilities using language models. The May [program](https://ras.papercept.net/conferences/conferences/ICRA24/program/) for the International Conference on Robotics and Automation lists nearly two dozen papers on vision language models.
#### Investment in Robotics
Investors are [pouring money](https://www.theinformation.com/articles/ai-investors-turn-their-attention-and-deep-pockets-to-robotics) into startups that aim to apply AI advances to robotics. Researchers from the Google project have founded a startup called [Physical Intelligence](https://physicalintelligence.company/), which received $70 million in initial funding. They aim to combine large language models with real-world training for general problem-solving abilities. Similarly, [Skild AI](https://www.skild.ai/), founded by Carnegie Mellon University roboticists, recently announced $300 million in funding.
#### Evolution of Robot Navigation
A few years ago, robots required detailed maps and carefully scripted commands to navigate. Today, large language models encode useful knowledge about the physical world, and vision language models, trained on images and video as well as text, can answer questions that require perception. Gemini enables Google’s robot to follow visual instructions, such as a sketch on a whiteboard showing a route.
#### Future Plans
Researchers plan to test Gemini on various types of robots. They believe it should handle more complex queries, like “Do they have my favorite drink today?” from a user with many empty Coke cans on their desk.