Helpful DoggyBot: Open-World Object Fetching using Legged Robots and Vision-Language Models
Abstract
Learning-based methods have achieved strong performance for quadrupedal locomotion. However, several challenges prevent quadrupeds from learning helpful indoor skills that require interaction with environments and humans: lack of end-effectors for manipulation, limited semantic understanding using only simulation data, and low traversability and reachability in indoor environments. We present a system for quadrupedal mobile manipulation in indoor environments. It uses a front-mounted gripper for object manipulation, a low-level controller trained in simulation using egocentric depth for agile skills like climbing and whole-body tilting, and pre-trained vision-language models (VLMs) with a third-person fisheye camera and an egocentric RGB camera for semantic understanding and command generation. We evaluate our system in two unseen environments without any real-world data collection or training. Our system can zero-shot generalize to these environments and complete tasks, such as following a user's command to fetch a randomly placed stuffed toy after climbing over a queen-sized bed, with a 60% success rate. Project website: https://helpful-doggybot.github.io/
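The abstract describes a two-level design: a pre-trained VLM grounds the user's instruction in the fisheye and egocentric RGB views and emits a high-level command, which a simulation-trained low-level policy executes from egocentric depth. The sketch below illustrates how such a pipeline might be wired together; it is not the authors' released code, and all class and function names are illustrative placeholders.

```python
# Hypothetical sketch of the two-level pipeline described in the abstract:
# a VLM turns camera views + a user request into a high-level command,
# and a low-level locomotion policy executes it from egocentric depth.
# Every name here is a placeholder assumption, not the authors' API.

from dataclasses import dataclass
from typing import Tuple


@dataclass
class Command:
    """High-level command derived from the VLM's semantic output."""
    target_xy: Tuple[float, float]  # goal position in the robot frame (m)
    grasp: bool                     # whether to close the gripper at the goal


class VLMPlanner:
    """Placeholder for a pre-trained VLM queried with fisheye + egocentric RGB."""

    def plan(self, fisheye_rgb, ego_rgb, instruction: str) -> Command:
        # A real system would ground the instruction (e.g. "fetch the toy")
        # in the two views; here we return a fixed dummy goal.
        return Command(target_xy=(1.0, 0.0), grasp=True)


class LowLevelPolicy:
    """Placeholder for the simulation-trained controller using egocentric depth."""

    def act(self, ego_depth, command: Command):
        # Map the depth observation and command to joint targets (dummy output).
        return [0.0] * 12


def control_loop(planner: VLMPlanner, policy: LowLevelPolicy, instruction: str):
    # The VLM is queried at a low rate; the locomotion policy runs at a high rate.
    fisheye_rgb, ego_rgb, ego_depth = None, None, None  # camera stubs
    command = planner.plan(fisheye_rgb, ego_rgb, instruction)
    joint_targets = None
    for _ in range(50):  # high-rate inner loop
        joint_targets = policy.act(ego_depth, command)
    return joint_targets


if __name__ == "__main__":
    control_loop(VLMPlanner(), LowLevelPolicy(), "fetch the stuffed toy on the bed")
```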
Community
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments (2024)
- Continuously Improving Mobile Manipulation with Autonomous Real-World RL (2024)
- AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation (2024)
- Grounding Language Models in Autonomous Loco-manipulation Tasks (2024)
- SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation (2024)