Jörg Stückler, Max Schwarz and Sven Behnke

Presentation transcript:

Mobile Manipulation, Tool Use, and Intuitive Interaction for Cognitive Service Robot Cosero. Jörg Stückler, Max Schwarz and Sven Behnke, Institute for Computer Science VI, Autonomous Intelligent Systems, University of Bonn, Bonn, Germany

System Overview. Cognitive Service Robot Cosero:
- Two anthropomorphic arms
- Yaw joint in the torso to enlarge the workspace of the arms
- Linear actuator in the trunk moves the upper body vertically by 0.9 m
- Omnidirectional drive
- Festo FinRay fingers on rotary joints
- Microsoft Kinect RGB-D camera for sensing the 3D environment
- Infrared distance sensors in the palm of each gripper for measuring an object's distance from the gripper

Cosero: cognitive service robot [Stückler 2012]

CONFORMABLE GRIPPER

Components of a Humanoid Service Robot
- Mechatronic design
- Software and control architecture
- Navigation
- Manipulation: perception, grasping, tool use
- Interaction: people recognition, gesture recognition, dialog system, cooperative tasks, teleoperation

Overview of the mechatronic design of Cosero Sensors are colored green, actuators blue, and other components red. USB connections are shown in red, analog audio links are shown in blue, and the low-level servo connections are shown in magenta.

Software Architecture

Compliant Arm Control (A) Activation matrix for compliant control in an example arm pose. The task-space dimensions correspond to forward/backward (x), lateral (y), vertical (z), and rotations around the x-axis (roll), y-axis (pitch), and z-axis (yaw). Examples of compliance in physical human-robot interaction: (B) Object hand-over from robot to user, signaled by the user pulling on the object. (C) Cosero is guided by a person. (D) Cosero follows the motion of a person to cooperatively carry a table.
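To make the activation-matrix idea concrete, here is a minimal Python sketch of task-space compliance via a per-dimension activation vector and Jacobian-transpose control. This is a generic formulation under stated assumptions, not the authors' controller; the Jacobian, gain, and activation values are placeholders.

```python
import numpy as np

def compliant_joint_correction(jacobian, task_error, activation, gain=1.0):
    """Joint-space correction that acts only along activated task dimensions.

    jacobian:   6 x N arm Jacobian (task-space rows, joint columns)
    task_error: 6-vector [x, y, z, roll, pitch, yaw] pose error
    activation: 6-vector in [0, 1]; 0 makes a dimension fully compliant
    Returns an N-vector of joint corrections (Jacobian-transpose control).
    """
    selected_error = np.diag(activation) @ task_error  # mask out compliant dimensions
    return gain * jacobian.T @ selected_error

# Example: fully compliant along x (forward/backward), stiff elsewhere
J = np.random.rand(6, 7)                               # placeholder 7-DoF Jacobian
error = np.array([0.05, 0.0, 0.02, 0.0, 0.1, 0.0])
activation = np.array([0.0, 1.0, 1.0, 1.0, 1.0, 1.0])
dq = compliant_joint_correction(J, error, activation)
```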

Surfel Grid Maps for Navigation. A 2D floor map alone can cause problems for 3D humanoid navigation. (A) Panorama image of an office. (B) 3D surfel map learned with our approach (surfel orientation coded by color). (C) Navigation map derived from the 3D surfel map. Localization is done by Monte Carlo localization (particle filter) using either 3D surfels or 2D camera line features.
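A minimal sketch of one Monte Carlo localization step is shown below, assuming a simplified motion model and a `measurement_likelihood` callback that stands in for scoring a pose against the surfel map or the camera line features; it is not the paper's implementation.

```python
import numpy as np

def mcl_update(particles, weights, odom_delta, measurement_likelihood,
               motion_noise=(0.02, 0.02, 0.05)):
    """One Monte Carlo localization step: predict, weight, resample.

    particles: (N, 3) array of (x, y, yaw) poses in the map frame.
    odom_delta: (3,) odometry increment (simplified: applied in the map frame).
    measurement_likelihood(pose): assumed callback that scores a pose
    against the map (surfels or line features).
    """
    n = len(particles)
    # predict: apply odometry with additive noise
    particles = particles + odom_delta + np.random.randn(n, 3) * motion_noise
    # weight: evaluate the observation model for every particle
    weights = weights * np.array([measurement_likelihood(p) for p in particles])
    weights = weights / weights.sum()
    # resample when the effective sample size drops
    if 1.0 / np.sum(weights ** 2) < n / 2:
        idx = np.random.choice(n, size=n, p=weights)
        particles, weights = particles[idx], np.full(n, 1.0 / n)
    return particles, weights
```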

Object Perception. Heuristic: most household objects are found in a finite set of locations, specifically on horizontal support planes such as tabletops and shelves. This reduces the computational search for object pose and geometry. The experiments are limited to these two assumed location types.

Real-time 3D Perception. The Microsoft Kinect serves as the depth camera; images are processed at 160x120 resolution at 20 Hz. Tasks:
1. Compute normals
2. Extract horizontal points
3. Detect support plane
4. Detect object candidates
5. Track objects
These steps are elaborated in the following slides.

1. Compute Normals. For every point, compute the local surface normal directly from the neighboring pixels in x and y image space as the cross product of two tangents to the surface: create two maps of tangential vectors, one along image rows and one along columns; for each map, compute the difference vectors between neighboring 3D points; then compute an integral image for each channel and use it to average the tangential vectors. The runtime is linear in the number of pixels.
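A rough Python/NumPy sketch of this normal-estimation scheme, assuming an organized point cloud of shape (H, W, 3); the window radius and the integral-image averaging details are illustrative, not the original code.

```python
import numpy as np

def normals_from_points(points, radius=5):
    """Estimate surface normals for an organized point cloud (H x W x 3).

    Tangents along image rows and columns are averaged with integral images
    (constant cost per pixel), then crossed to obtain normals.
    """
    # difference vectors between neighboring 3D points (tangent maps)
    tx = np.zeros_like(points)
    ty = np.zeros_like(points)
    tx[:, :-1] = points[:, 1:] - points[:, :-1]   # along rows
    ty[:-1, :] = points[1:, :] - points[:-1, :]   # along columns

    def box_average(img, r):
        # zero-pad, build a per-channel integral image, then take window means
        pad = np.pad(img, ((r + 1, r), (r + 1, r), (0, 0)))
        ii = pad.cumsum(0).cumsum(1)
        s = (ii[2 * r + 1:, 2 * r + 1:] - ii[:-2 * r - 1, 2 * r + 1:]
             - ii[2 * r + 1:, :-2 * r - 1] + ii[:-2 * r - 1, :-2 * r - 1])
        return s / float((2 * r + 1) ** 2)

    tx, ty = box_average(tx, radius), box_average(ty, radius)
    n = np.cross(tx, ty)                          # sign depends on camera frame convention
    n /= np.linalg.norm(n, axis=2, keepdims=True) + 1e-9
    return n
```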

2. Extract Horizontal Points. Extract all points with vertical normals; the resulting points lie on horizontal surfaces. This allows the depth image to be segmented efficiently into planes, and considering only horizontal planes saves computation.
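A small sketch of the horizontal-point extraction, assuming the points and normals are already expressed in a gravity-aligned frame; the angular threshold is an assumed value.

```python
import numpy as np

def extract_horizontal_points(points, normals, max_angle_deg=15.0):
    """Keep points whose normal is close to the up direction, i.e. candidates
    for horizontal surfaces.

    points, normals: (H, W, 3) arrays in a frame whose z-axis points up.
    Returns a (K, 3) array of horizontal-surface points.
    """
    up = np.array([0.0, 0.0, 1.0])
    mask = np.abs(normals @ up) > np.cos(np.deg2rad(max_angle_deg))
    return points[mask]
```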

3. Detect Support Plane. Find the most dominant horizontal plane and the points supporting it; the support plane is horizontal. Only points and support planes lower than 1 m and closer than 3 m to the robot are considered.
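One simple way to realize this step is a height-histogram vote over the horizontal points, as sketched below; the histogram approach, bin size, and robot-at-origin assumption are stand-ins for the original method.

```python
import numpy as np

def detect_support_plane(horizontal_points, max_height=1.0, max_dist=3.0, bin_size=0.02):
    """Pick the dominant horizontal plane among the extracted horizontal points.

    Points above 1 m or farther than 3 m from the robot (assumed at the origin)
    are ignored, as on the slide.
    """
    p = horizontal_points
    dist = np.linalg.norm(p[:, :2], axis=1)            # planar distance from the robot
    p = p[(p[:, 2] < max_height) & (dist < max_dist)]
    bins = np.round(p[:, 2] / bin_size).astype(int)    # quantize heights
    values, counts = np.unique(bins, return_counts=True)
    plane_height = values[np.argmax(counts)] * bin_size
    support = p[np.abs(p[:, 2] - plane_height) < bin_size]
    return plane_height, support
```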

4. Detect Object Candidates. Extract the points above the horizontal support plane and apply Euclidean clustering to obtain sets of points as object candidates.
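A sketch of the Euclidean clustering step using a KD-tree and single-linkage region growing; the distance tolerance and minimum cluster size are assumed values.

```python
import numpy as np
from scipy.spatial import cKDTree

def euclidean_clusters(points, tol=0.03, min_size=50):
    """Group points above the support plane into object candidates.

    points: (K, 3) array of points above the plane.
    Returns a list of (M_i, 3) arrays, one per candidate cluster.
    """
    tree = cKDTree(points)
    labels = -np.ones(len(points), dtype=int)
    current = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        frontier = [seed]
        labels[seed] = current
        while frontier:                                  # grow the cluster from the seed
            idx = frontier.pop()
            for nb in tree.query_ball_point(points[idx], tol):
                if labels[nb] == -1:
                    labels[nb] = current
                    frontier.append(nb)
        current += 1
    clusters = [points[labels == c] for c in range(current)]
    return [c for c in clusters if len(c) >= min_size]   # drop tiny clusters as noise
```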

OBJECT PERCEPTION (A) Cosero capturing a tabletop scene with its RGB-D camera. The visible volume is indicated by the green lines. Each object is represented by an ellipse fitted to its points, indicated in red. (B) RGB image of the tabletop with recognized objects. (C) We recognize objects in RGB images and find location and size estimates. (D) Matched features vote for position in a 2D Hough space. (E) From the features (green dots) that consistently vote at a 2D location, we find a robust average of relative locations (yellow dots). (F) We also estimate principal directions (yellow lines).

Object Perception (A) Object categorization, instance recognition, and pose estimation based on features extracted by a pretrained convolutional neural network. Depth is converted to a color image by rendering a canonical view and encoding distance from the object's vertical axis. Object detection based on geometric primitives: (B) Point cloud captured by Cosero's Kinect camera. (C) Detected cylinders. (D) Detected objects.

Object Tracking (A) Cosero approaching a watering can. (B) We train multi-view 3D models of objects using multiresolution surfel maps. The model is shown in green in the upper part. The model is registered with the current RGB-D frame to estimate its relative pose T. (C) Joint object detection and tracking using a particle filter, despite occlusion. (D) An object manipulation skill is described by grasp poses and motions of the tool tip relative to the affected object. (E) Once these poses are known for a new instance of the tool, the skill can be transferred.

Bluetooth low energy tag localization (A) Used Bluetooth tags (Estimote and StickNFind). (B) Signal strength-based tag localization by lateration. (C) Localization experiment with multiple tags.
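A minimal least-squares lateration sketch; converting received signal strength to a distance estimate via a path-loss model is assumed to happen beforehand and is not reproduced here.

```python
import numpy as np
from scipy.optimize import least_squares

def laterate(anchor_positions, distances, initial=np.zeros(2)):
    """Estimate a 2D tag position from range estimates to known anchors.

    anchor_positions: (M, 2) array of known anchor positions.
    distances:        (M,) range estimates (e.g. from an RSSI path-loss model).
    Returns the estimated (x, y) position of the tag.
    """
    anchors = np.asarray(anchor_positions)
    d = np.asarray(distances)
    residual = lambda p: np.linalg.norm(anchors - p, axis=1) - d
    return least_squares(residual, initial).x
```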

Efficient Grasp Planning. Overview: find feasible, collision-free grasps from the raw object point cloud.

Efficient Grasp Planning. Grasps can be made from the side or from above.
- Side grasp: approach the object along its vertical axis with the gripper aligned horizontally; once the pre-grasp pose is reached, the side-grasp motion primitive approaches the object and the gripper is closed.
- Top grasp: pitch the end effector downward by 45 degrees to grasp the object with the fingertips; reach the pre-grasp pose and then establish the pitch orientation.
The IR distance sensors in the grippers detect premature contact with the object or the support plane.

Planning Collision-Free Grasps. The grasp planner selects a feasible, collision-free grasp for the object of interest: it samples grasp candidates, removes infeasible and colliding grasps, and ranks the remaining grasps to find the best one. First, the shape and pose of the object are determined.
- Side grasps: the pre-grasp pose is sampled on an ellipse in the horizontal plane, and the object is grasped as low as possible (half the height of the gripper plus 0.03 m).
- Top grasps: sampled on all points within a height range of 2 cm below the highest point of the object.
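A sketch of the side-grasp sampling described above, with an assumed pose representation (position plus yaw) and illustrative clearances and gripper dimensions.

```python
import numpy as np

def sample_side_grasps(object_center, object_extents, plane_height,
                       n=16, clearance=0.03, gripper_height=0.06):
    """Sample side pre-grasp poses on a horizontal ellipse around the object.

    Poses lie on an ellipse around the object footprint, and the object is
    grasped as low as possible (half the gripper height plus 0.03 m above the
    support plane), as described on the slide.
    """
    cx, cy, _ = object_center
    a = object_extents[0] / 2 + clearance          # ellipse semi-axes around the footprint
    b = object_extents[1] / 2 + clearance
    z = plane_height + gripper_height / 2 + 0.03   # grasp the object as low as possible
    grasps = []
    for t in np.linspace(0.0, 2.0 * np.pi, n, endpoint=False):
        x, y = cx + a * np.cos(t), cy + b * np.sin(t)
        yaw = np.arctan2(cy - y, cx - x)           # approach direction points toward the object
        grasps.append({"kind": "side", "pos": (x, y, z), "yaw": yaw})
    return grasps
```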

Filtering for Feasible and Collision-Free Grasps. Post-processing filters out grasps based on the following criteria:
- Grasp width: reject grasps if the object will not fit into the gripper.
- Object height: reject side grasps if the object is too low.
- Reachability: reject grasps outside the arm's workspace.
- Collisions: reject grasps that collide during reaching or grasping.
Only grasps that satisfy all criteria are kept.
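These four filters might look as follows in a short sketch; `reachable` and `collision_free` are assumed callbacks provided by the kinematics and collision-checking modules, and the thresholds are illustrative.

```python
def filter_grasps(grasps, object_width, object_height, reachable, collision_free,
                  max_gripper_width=0.08, min_side_height=0.05):
    """Reject grasp candidates using the four criteria on the slide.

    Each grasp g is a dict with at least a 'kind' key ('side' or 'top').
    """
    kept = []
    for g in grasps:
        if object_width > max_gripper_width:                          # does not fit the gripper
            continue
        if g["kind"] == "side" and object_height < min_side_height:   # too low for a side grasp
            continue
        if not reachable(g):                                          # outside the arm's workspace
            continue
        if not collision_free(g):                                     # collides while reaching or grasping
            continue
        kept.append(g)
    return kept
```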

Searching For Collisions

Ranking of Grasps. The remaining collision-free grasps are ranked by:
- Distance to object center: favor grasps with a smaller distance to the object center.
- Grasp width: favor grasp widths closest to 0.08 m.
- Grasp orientation: favor grasps with the smallest angle between the line toward the shoulder and the grasping direction.
- Distance from robot: favor grasps with a smaller distance to the shoulder.
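A sketch of the ranking as a weighted cost over the four criteria; the weights, the linear combination, and the grasp dictionary keys are assumptions rather than the paper's exact scoring.

```python
import numpy as np

def rank_grasps(grasps, object_center, shoulder_position,
                preferred_width=0.08, weights=(1.0, 1.0, 1.0, 1.0)):
    """Rank collision-free grasps; lower cost is better.

    Each grasp is a dict with keys 'pos' (x, y, z), 'yaw', and 'width'.
    """
    w1, w2, w3, w4 = weights
    center = np.asarray(object_center)
    shoulder = np.asarray(shoulder_position)
    scored = []
    for g in grasps:
        pos = np.asarray(g["pos"])
        approach = np.array([np.cos(g["yaw"]), np.sin(g["yaw"]), 0.0])   # grasping direction
        from_shoulder = pos - shoulder
        from_shoulder_dir = from_shoulder / np.linalg.norm(from_shoulder)
        cost = (w1 * np.linalg.norm(pos - center)                        # distance to object center
                + w2 * abs(g["width"] - preferred_width)                 # preferred grasp width
                + w3 * np.arccos(np.clip(np.dot(approach, from_shoulder_dir), -1.0, 1.0))  # orientation
                + w4 * np.linalg.norm(from_shoulder))                    # distance from the shoulder
        scored.append((cost, g))
    return [g for _, g in sorted(scored, key=lambda item: item[0])]
```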

Grasp and motion planning (A) Object shape properties and principal axes of the object. (B) We rank feasible, collision-free grasps (red) and select the most appropriate one (large, RGB). (C) Example side grasp. (D) Example top grasp. Motion planning for bin picking: (E) Grasps are sampled on shape primitives and checked for a collision-free approach. (F) Grasps that are not feasible are filtered out. (G) The arm motion is planned. (H) Bin picking demonstration at RoboCup.

Experiments & Robustness. Eight household objects were used, with 12 grasps executed for each. Not only rigid objects were used; clothes were tried as well. The approach was later tried on a shelf, not just a table. Sometimes only a top grasp or only a side grasp is feasible (not both).

Experiments & Robustness: Table

Experiments & Robustness: Shelf

Performance Evaluation: Fast & Robust. The motivation behind all of this is more anthropomorphic robots, robots that can grasp everyday items in your home. The paper presents efficient means to "perceive objects on planar surfaces and to plan feasible, collision-free grasps on the object of interest" and achieves this very fast, in real time. It analyzes the many grasp options, rejects the bad ones, and prioritizes the better ones. It is robust: it can pick up a variety of household objects, and if a grasp fails, the robot tries to pick up the object again. Caveat: there are many assumptions about object pose and potential grasps.

Tool Use (A) Grasping sausages with a pair of tongs. Cosero perceives position and orientation of the sausages in RGB-D images. (B) Bottle opening. (C) Plant watering skill transfer to unknown watering can

Tool use at RoboCup 2014 (A) Bottle opener, (B) Dustpan and swab, and (C) Muddler. Adaptive tool use

Physical Human-Robot Interaction
- Object hand-over
- Guiding the robot by hand
- Cooperative tasks: carrying a table

Person perception (A) Persons are detected as legs (cyan spheres) and torsos (magenta spheres) in two horizontal laser range scans (cyan and magenta dots). Detections are fused in a multi-hypothesis tracker (red and cyan boxes). Faces are detected with a camera mounted on a pan-tilt unit. We validate tracks as persons (cyan box) when they are closest to the robot and match the line of sight toward a face (red arrow). (B) Enrollment of a new face. (C) Gesture recognition: faces are detected in the amplitude image; 3D point cloud with the resulting head, upper body, and arm segments (green, red, and yellow) and the locations of the hand, elbow, and shoulder (green, light blue, and blue spheres).

HRI: Dialog System
- Speech recognition
- Speech synthesis
- Semantic binding of speech to grounded objects
- Synthesis of body language and gestures: the robot moves its head and torso toward people and objects, and can also generate human-like pointing gestures

Handheld teleoperation interface (A) Complete GUI with a view selection column on the left, a main view in the center, a configuration column on the right, and a log message window at the bottom center. Two joystick control UIs in the lower corners allow controlling robot motion with the thumbs. (B) Skill-level teleoperation user interface for navigation in a map (placed in the center and right UI panels). The user may either select from a set of named goal locations or specify a goal pose on a map by touching the goal position and dragging toward the desired robot orientation.

Videos: Bin Picking, German RoboCup, Kitchen Competition, RoboCup II.

Summary. A complete system: a mobile manipulator with HRI, with solutions to all the problems:
- Locomotion
- Path and motion planning
- Perception
- Grasping
- Human interfacing
But many, many heuristics and simplifications were needed to make it work! Project idea: take one of these "simple solutions" and make it better!