Human-to-Robot Handovers
(Essential Skills Sub-Track 4)
We provide details, instructions, and documentation for preparing your solution and running handover trials in your own lab.
The set of objects for the preparation phase is defined in the CORSMAL Benchmark and consists of four drinking cups with different properties: high deformability, medium transparency (Cup 1); average deformability, low transparency (Cup 2); average deformability, high transparency (Cup 3); and no deformability, high transparency (Cup 4). Cup 4 is the plastic wine glass from the YCB object database. These cups are inexpensive and available worldwide, differ in shape, size, and degree of deformability, and include textureless regions, transparencies, and reflections that make vision-based pose estimation challenging.
Purchase links for the set of known drinking cups
Cup 1: https://shorturl.at/pFUVY
Cup 2: https://amzn.to/2QrsXH5
Cup 3: https://amzn.to/2JwRk3l
Cup 4: https://amzn.to/33zw4AY
Teams that do not have or cannot purchase the containers at the links above can use local objects with characteristics similar to those of the items they replace. Teams must inform the organiser of the selected local objects, or of any other objects used to prepare the solution, before the start of the competition for official acceptance.
Filling. To vary the properties of each cup (mass and deformability), containers are filled with two different amounts of rice (which is easy to purchase and, unlike liquids, harmless for the hardware): 0% (empty) and 90% (filled) of the total volume of the cup. The filled amounts are rounded down to the nearest quarter of 100 ml (25 ml) to ease the replicability of the configurations: 125 ml (Cup 1), 400 ml (Cup 2), 450 ml (Cup 3), and 300 ml (Cup 4).
For the execution of the configurations, we recommend preparing all the objects in advance on a table near the area where the handovers are executed. This speeds up the execution of all configurations.
There are 8 configurations, repeated 3 times in blocks with a shuffled order for each block, for 24 handovers in total. We recommend inviting a different volunteer for each block of configurations to account for the variability in the execution of the handovers.
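The rounding rule above can be sketched in a few lines for teams that substitute local cups. This is only an illustration: it assumes that "quarter of 100 ml" means a 25 ml step and that rounding is always downwards, and the capacity passed to `filled_amount_ml` is a hypothetical measurement of your own cup, not a value from the benchmark.

```python
FILL_PERCENT = 90   # "filled" configuration: 90% of the cup's total volume
STEP_ML = 25        # assumed rounding step: a quarter of 100 ml


def filled_amount_ml(capacity_ml: int) -> int:
    """Return the rice volume (ml) for a filled cup, rounded down to STEP_ML."""
    # Integer arithmetic avoids floating-point surprises near step boundaries.
    raw_ml = capacity_ml * FILL_PERCENT // 100
    return raw_ml // STEP_ML * STEP_ML


# Filling amounts stated in the text, for comparison with your own cups.
STATED_FILL_ML = {"Cup 1": 125, "Cup 2": 400, "Cup 3": 450, "Cup 4": 300}

if __name__ == "__main__":
    # Hypothetical local cup with a measured capacity of 350 ml.
    print(filled_amount_ml(350))
```

All four stated amounts are multiples of 25 ml, which is consistent with this reading of the rule.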
ID | Object | Level | Points |
---|---|---|---|
1 | Empty wine glass (Cup 4) | Easy | 5 |
2 | Empty red cup (Cup 2) | Easy | 5 |
3 | Empty beer cup (Cup 3) | Medium | 10 |
4 | Empty white cup (Cup 1) | Medium | 10 |
5 | Filled wine glass (Cup 4) | Difficult | 15 |
6 | Filled red cup (Cup 2) | Difficult | 15 |
7 | Filled beer cup (Cup 3) | Hard | 20 |
8 | Filled white cup (Cup 1) | Hard | 20 |
9 | Filled red cup (Cup 2) | Difficult | 15 |
10 | Filled beer cup (Cup 3) | Hard | 20 |
11 | Empty wine glass (Cup 4) | Easy | 5 |
12 | Empty red cup (Cup 2) | Easy | 5 |
13 | Filled white cup (Cup 1) | Hard | 20 |
14 | Empty beer cup (Cup 3) | Medium | 10 |
15 | Filled wine glass (Cup 4) | Difficult | 15 |
16 | Empty white cup (Cup 1) | Medium | 10 |
17 | Filled red cup (Cup 2) | Difficult | 15 |
18 | Empty beer cup (Cup 3) | Medium | 10 |
19 | Filled wine glass (Cup 4) | Difficult | 15 |
20 | Empty red cup (Cup 2) | Easy | 5 |
21 | Empty wine glass (Cup 4) | Easy | 5 |
22 | Filled beer cup (Cup 3) | Hard | 20 |
23 | Filled white cup (Cup 1) | Hard | 20 |
24 | Empty white cup (Cup 1) | Medium | 10 |
TOTAL | | | 300 |
Note that the volunteer should avoid both assisting the robot (i.e., remaining still at a location until the robot can pick up the container) and behaving adversarially (i.e., making it harder for the robot to reach the object).
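For practice runs in your own lab, a schedule with the same structure as the table above (8 configurations, 3 blocks, shuffled order within each block) can be generated with a short script. This is a minimal sketch, not the organisers' tooling: the names `Configuration`, `make_schedule`, and `max_points` are hypothetical, the seed is arbitrary, and the order in the table above remains the reference.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    obj: str
    filled: bool
    level: str
    points: int


# The 8 configurations and their points, taken from the table above.
CONFIGURATIONS = [
    Configuration("wine glass (Cup 4)", False, "Easy", 5),
    Configuration("red cup (Cup 2)", False, "Easy", 5),
    Configuration("beer cup (Cup 3)", False, "Medium", 10),
    Configuration("white cup (Cup 1)", False, "Medium", 10),
    Configuration("wine glass (Cup 4)", True, "Difficult", 15),
    Configuration("red cup (Cup 2)", True, "Difficult", 15),
    Configuration("beer cup (Cup 3)", True, "Hard", 20),
    Configuration("white cup (Cup 1)", True, "Hard", 20),
]


def make_schedule(num_blocks: int = 3, seed: int = 0) -> list[list[Configuration]]:
    """Return num_blocks blocks, each a shuffled copy of the 8 configurations."""
    rng = random.Random(seed)  # seeded so a practice schedule can be reproduced
    schedule = []
    for _ in range(num_blocks):
        block = list(CONFIGURATIONS)
        rng.shuffle(block)
        schedule.append(block)
    return schedule


def max_points(schedule: list[list[Configuration]]) -> int:
    """Sum the points of every configuration in the schedule."""
    return sum(c.points for block in schedule for c in block)


if __name__ == "__main__":
    schedule = make_schedule()
    run_id = 1
    for block in schedule:
        for c in block:
            state = "Filled" if c.filled else "Empty"
            print(f"{run_id} | {state} {c.obj} | {c.level} | {c.points} |")
            run_id += 1
    print(f"TOTAL | {max_points(schedule)} |")
```

Each block sums to 100 points, so three blocks give the 300 points shown in the TOTAL row.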
This procedure has been revised from the CORSMAL Human-to-Robot Handover Protocol document.
The setup includes a robotic arm with at least 6 degrees of freedom (e.g., UR5, KUKA) equipped with a 2-finger parallel gripper (e.g., Robotiq 2F-85); a table on which the robot is placed and the handover takes place; the selected containers and contents; up to two cameras (e.g., Intel RealSense D435i); and a digital scale to weigh the container. The table is covered by a white tablecloth. The cameras should be placed 40 cm from the robotic arm, e.g., on tripods, and oriented so that they both view the centre of the table. The illustration below shows the 3D layout of the setup within a space of 4.5 x 4.5 m. The table measures W1800 x D600 x H700 mm.
These instructions have been revised from the CORSMAL Human-to-Robot Handover Benchmark document.
Benchmark for human-to-robot handovers of unseen containers with unknown filling
R. Sanchez-Matilla, K. Chatzilygeroudis, A. Modas, N. F. Duarte, A. Xompero, P. Frossard, A. Billard, A. Cavallaro
IEEE Robotics and Automation Letters, 5(2), pp.1642-1649, 2020
[Open Access]
Towards safe human-to-robot handovers of unknown containers
Y. L. Pang, A. Xompero, C. Oh, A. Cavallaro
IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), Virtual, 8-12 Aug 2021
[Open Access]
[code]
[webpage]
Vision baseline for CORSMAL Benchmark: a vision-based algorithm, part of a larger system, proposed for localising, tracking and estimating the dimensions of a container with a stereo camera.
[paper]
[code]
[webpage]
LoDE: a method that jointly localises container-like objects and estimates their dimensions with a generative 3D sampling model and a multi-view 3D-2D iterative shape fitting, using two wide-baseline, calibrated RGB cameras.
[paper]
[code]
[webpage]
The CORSMAL Challenge contains perception solutions for the estimation of the physical properties of manipulated objects prior to a handover to a robot arm.
[challenge]
[paper 1]
[paper 2]
Additional references
[document]