Learning to use different tools for object rearrangement
from demonstrations
Yunchu Zhang*
Hao Zhu*
Zhaoyuan Fang
Katerina Fragkiadaki
Christopher G Atkeson
[Paper]
[GitHub]

Abstract

Humans use many tools in daily life to complete different tasks (e.g., opening a wine bottle with a wine opener, cutting wood with a saw). Without various tools, we fumble at many tasks and completely fail at some of them. Two key problems arise when learning to leverage tools: 1) With multiple tools available, which one should be selected to best solve the task at hand? 2) Given the tool-task combination, what actions should be taken at each instant (i.e., how should the tool be used)? We argue that the ability to choose and deploy tools is crucial for successful tool use by both humans and robots. Specifically, in object rearrangement and cleaning tasks, it is important to be able to utilize different tools and to correctly switch between and deploy the suitable ones. Thus, we propose an end-to-end learning framework that jointly learns to choose among different tools and to deploy tool-conditioned policies from a limited number of human demonstrations. We evaluate our method on parallel-gripper and suction-cup picking and placing, brush sweeping, and household rearrangement tasks, generalizing to different configurations, novel objects, and cluttered scenes in the real world.


Method overview

To solve a complex household rearrangement task such as the one shown in the figure above, the policy consists of two parts: an affordance-aware tool selection policy (the picking prediction module) and a selection-conditioned continuous action policy (the placing prediction module). The affordance-aware tool selection module is in charge of figuring out which tool to deploy at each step and where to deploy it; in other words, it must learn the affordances present in the input image. For example, the robot can learn to first move objects out of the way to clear the workspace, instead of trying to sweep beans while the objects still block the path. Given the predicted starting location, the second module chooses how the tool should act. We implement both policies as neural networks and train them jointly with gradient-based optimization.
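To make the two-module structure concrete, here is a minimal PyTorch sketch of this kind of architecture. It is an illustrative assumption, not the paper's actual implementation: the network sizes, the per-pixel affordance head, and the joint argmax over tools and pixels are all hypothetical stand-ins for the picking and placing prediction modules described above.

```python
import torch
import torch.nn as nn

class ToolSelectionPolicy(nn.Module):
    """Predicts a per-pixel, per-tool affordance map from an RGB image.
    (Hypothetical sketch; the paper's real architecture may differ.)"""
    def __init__(self, num_tools=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        # One affordance channel per tool: an argmax over (tool, pixel)
        # jointly picks which tool to deploy and where to deploy it.
        self.head = nn.Conv2d(32, num_tools, 1)

    def forward(self, rgb):
        return self.head(self.encoder(rgb))  # (B, num_tools, H, W)

class ToolConditionedPolicy(nn.Module):
    """Given the image and the selected tool / start location, predicts
    a continuous action (e.g., a placement pose or sweep endpoint)."""
    def __init__(self, num_tools=3, action_dim=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mlp = nn.Sequential(
            nn.Linear(16 + num_tools + 2, 64), nn.ReLU(),
            nn.Linear(64, action_dim),
        )

    def forward(self, rgb, tool_onehot, start_xy):
        feat = self.encoder(rgb)
        return self.mlp(torch.cat([feat, tool_onehot, start_xy], dim=-1))

# One rollout step: select a tool and a starting pixel, then predict the action.
num_tools, H, W = 3, 64, 64
select = ToolSelectionPolicy(num_tools)
act = ToolConditionedPolicy(num_tools)

rgb = torch.rand(1, 3, H, W)
aff = select(rgb)                            # (1, num_tools, H, W)
flat = aff.flatten(1).argmax(dim=1)          # joint argmax over tools and pixels
tool = flat // (H * W)
pix = flat % (H * W)
start_xy = torch.stack([pix // W, pix % W], dim=-1).float() / float(W)
tool_onehot = nn.functional.one_hot(tool, num_tools).float()
action = act(rgb, tool_onehot, start_xy)     # continuous action, e.g. (x, y, theta)
```

Both modules are differentiable, so in a setup like this they can be trained end-to-end from demonstrations, e.g., with a cross-entropy loss on the affordance map and a regression loss on the continuous action.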

Video


[Slides]


Paper and Supplementary Material

Y. Zhang, H. Zhu, Z. Fang,
K. Fragkiadaki, C. G. Atkeson.
Learning to use different tools for object rearrangement from demonstrations.
Under review for IROS, 2022.
(PDF)


[Bibtex]


Acknowledgements

This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.