SPEED-RANGE DILEMMAS FOR VISION-BASED NAVIGATION IN UNSTRUCTURED TERRAIN

Pierre Sermanet¹·², Raia Hadsell¹, Jan Ben², Ayse Naz Erkan¹, Beat Flepp², Urs Muller², Yann LeCun¹
(1) Courant Institute of Mathematical Sciences, New York University
(2) Net-Scale Technologies, Morganville, NJ 07751, USA
Intelligent Autonomous Vehicles (IAV) 2007, Toulouse, France. Pierre Sermanet, September 4th, 2007.

Outline
- Program and system overview
- Problem definition
- Architecture
- Results
Overview: Program
- LAGR: Learning Applied to Ground Robots
- Demonstrate learning algorithms in unstructured outdoor robotics
- Vision-based only (passive); no expensive equipment
- Reach a GPS goal as fast as possible, without any prior knowledge of the location
- DARPA funded, 10 teams (universities and companies), common platform
- Comparison to the state-of-the-art CMU "baseline" software and to the other teams
- Monthly tests by DARPA at various unknown locations: SwRI, TX; Ft. Belvoir, VA; Hanover, NH
- Unstructured outdoor robotics is highly challenging due to the wide diversity of environments (colors, shapes, and sizes of obstacles, lighting and shadows, etc.)
- Conventional algorithms are unsuited; adaptability and learning are needed
Overview: Platform
- Constructor: CMU/NREC
- Vision-based only: 2 stereo pairs of cameras (+ GPS for global navigation), plus a bumper
- 4 Linux machines linked by Gigabit Ethernet:
  - Two "eye" machines (dual-core 2 GHz): image processing
  - One "planner" machine (single-core 2 GHz): planning and control loop
  - One "controller" machine: low-level communication
- Maximum speed: 1.3 m/s
- Proprietary CMU/NREC API to sensors and actuators
- Proprietary CMU/NREC "Baseline": end-to-end navigation software (D*, etc.), not re-used here
Overview: Philosophy
- Main goal: demonstrate machine learning algorithms for long-range vision (RSS07)
- Supporting goal: build a solid software platform for long-range vision and navigation:
  - Robust and reliable
  - Resistant to sensor imprecision and failures
- Self-supervised learning using a convolutional network:
  - Input: context-rich image windows, with stereo labels (short-range) as supervision
  - Output: long-range labels
Overview: System
Processing chain:
- Eye machine: sensors (cameras) → input image → image processing → traversability map → network transmission
- Planner machine: map → path planning → path → control → actuators (wheels), using pose (GPS + IMU + wheels)
Note: latency is not only tied to frequency; it also includes sensor latency, network transmission, planning, and actuator latency.
Problem
- There is an important performance drop in local obstacle avoidance when latency and frequency are too high: artificially increasing latency and period almost linearly increases the number of crashes into obstacles (performance test of July 2006, Holmdel Park, NJ).
- Human expert drivers of the UPI Crusher vehicle reported that a feedback latency of 400 ms was the maximum for good remote driving.
- How do we guarantee good performance given the increasing complexity introduced by sophisticated long-range vision modules?
- When does processing speed prevail over vision range, and vice versa?
Problem: Delays
Latency and frequency determine performance, but latency is actually composed of three types of latencies, or "delays":
1. Sensor/actuator latency + LAGR API latency (fixed): images are already 190 ms old when made available to image processing.
2. Processing latency.
3. Robot dynamics latency (inertia + acceleration/deceleration): up to 1.5 s (worst case) between a wheel command and the actual desired speed.
(1) and (3) are relatively high on the LAGR platform and must be compensated for and taken into account by (2).
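As a rough illustration of this budget (the 190 ms and 1.5 s figures come from the slide; the processing latency is a free parameter, and the 250 ms value used below is hypothetical), the worst-case delay between a change in the world and the robot actually responding can be summed as:

```python
# Rough worst-case latency budget for the LAGR platform, using the
# figures quoted in the slides. Processing latency is left as a parameter.

SENSOR_API_LATENCY_S = 0.190   # images are 190 ms old when delivered (slide figure)
DYNAMICS_LATENCY_S = 1.5       # worst-case wheel-command-to-speed delay (slide figure)

def worst_case_delay(processing_latency_s: float) -> float:
    """Total delay from a change in the world to the robot reaching its commanded speed."""
    return SENSOR_API_LATENCY_S + processing_latency_s + DYNAMICS_LATENCY_S

# With a hypothetical 250 ms processing latency:
print(worst_case_delay(0.250))  # about 1.94 s end to end
```

This is why the processing stage (2) has to be fast: it is the only term in the sum that the software architecture can shrink.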
Problem: Solutions to delays
To account for the sensor and processing latencies (1) and (2):
a. Reduce processing time.
b. Estimate the delays between path planning and actuation.
c. Place traversability maps according to the delays, before and after path planning.
To account for the dynamics latency (3):
d. Model or record the robot's dynamics.
Solutions (a), (b), (c), and (d) are all part of the global solution presented in the results section, but here we only describe a successful architecture for (a).
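Solution (c) amounts to registering each traversability map against the pose the robot had when the underlying image was captured, rather than the current pose. A minimal sketch of that idea, with an illustrative 2-D pose representation not taken from the paper:

```python
# Sketch of solution (c): a traversability map describes the world as of
# (now - total_latency), so before planning it is placed in the world frame
# using the pose at capture time. The (x, y, heading) pose format is illustrative.
import math

def place_map(map_points, pose_at_capture):
    """Transform map points from the capture-time robot frame to the world frame."""
    x0, y0, heading = pose_at_capture
    c, s = math.cos(heading), math.sin(heading)
    return [(x0 + c * x - s * y, y0 + s * x + c * y) for (x, y) in map_points]

# An obstacle seen 2 m ahead while the robot was at the origin, heading along +x:
print(place_map([(2.0, 0.0)], (0.0, 0.0, 0.0)))  # [(2.0, 0.0)]
```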
Architecture
Idea:
- Wagner et al.¹ showed that a walking human gazes more frequently close by than far away: a higher frequency is needed close by than far away.
- Nearby obstacles move toward the robot faster than far obstacles: a lower latency is needed close by than far away.
To satisfy these requirements, short- and long-range vision must be separated into two parallel and independent obstacle-detection (OD) modules:
- "Fast-OD": processing has to be fast; vision is not necessarily long-range.
- "Far-OD": vision has to be long-range; processing can be slower.
How do we make Fast-OD fast? Simple processing and reduced input resolution. Can we reduce resolution without reducing performance?

¹ M. Wagner, J. C. Baird, and W. Barbaresi. The locus of environmental attention. J. of Environmental Psychology, 1:195-206, 1980.
Architecture: Fast-OD // Far-OD
Two pipelines run in parallel on the eye machine, and their maps are merged on the planner machine before path planning:
- Far-OD: high-resolution input image (320x240 or 512x384) → advanced image processing → traversability map (5 to >30 m). Latency: 700 ms; frequency: 3 Hz.
- Fast-OD: low-resolution input image (160x120) → simple image processing → traversability map (0 to 5 m). Latency: 250 ms; frequency: 10 Hz.
- Planner machine: map merging → path planning → path → control → actuators (wheels), using pose (GPS + IMU + wheels).
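The key point is that the two modules tick at independent rates and the planner simply merges whatever maps are current. A minimal sketch of that structure (the rates are the slide figures; the detector functions and shared-map bookkeeping are placeholders, not the paper's implementation):

```python
# Sketch of the parallel Fast-OD / Far-OD structure: two independent loops
# at different rates, with the planner merging the latest maps from each.
import threading
import time

shared = {"fast_map": None, "far_map": None}
lock = threading.Lock()

def od_loop(name, period_s, detector, stop):
    # Each OD module runs at its own rate, independently of the other.
    while not stop.is_set():
        result = detector()
        with lock:
            shared[name] = result
        time.sleep(period_s)

def merge_maps():
    # The planner merges whichever maps are currently available:
    # Fast-OD covers 0-5 m at low latency, Far-OD covers out to 30 m.
    with lock:
        return (shared["fast_map"], shared["far_map"])

stop = threading.Event()
threading.Thread(target=od_loop, args=("fast_map", 0.100, lambda: "0-5m map", stop), daemon=True).start()
threading.Thread(target=od_loop, args=("far_map", 0.370, lambda: "5-30m map", stop), daemon=True).start()
time.sleep(0.5)
print(merge_maps())  # both maps available, each refreshed at its own rate
stop.set()
```

Because the loops are decoupled, a slow Far-OD cycle never delays the next Fast-OD map, which is exactly what keeps short-range latency low.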
Architecture: Implementation notes
CPU cycles: all cycles must be given to Fast-OD when it runs, to guarantee low latency. Possible solutions are:
1. Use a real-time OS and give high priority to Fast-OD.
2. With a regular OS, give Fast-OD control over Far-OD: Fast-OD pauses Far-OD, runs, then sleeps briefly and resumes Far-OD.
3. Use a dual-core CPU.
Map merging: the Fast and Far maps are merged before planning, according to their respective poses.
2-step planning: this architecture makes it easier to separate the planning algorithms suited for short and long range:
- Fast-OD planning happens in Cartesian space and takes robot dynamics into account (more important at short range).
- Far-OD planning happens in image space and uses regular path planning.
[Figure: dynamics planning in Cartesian space at short range; long-range planning in image space, extending to infinity.]
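Option (2) above can be sketched with a simple gate that Far-OD checks between chunks of work; the names and the chunking granularity here are illustrative, not the paper's code:

```python
# Sketch of implementation note (2): on a regular OS, Fast-OD gates Far-OD
# so that Fast-OD gets all CPU cycles while it runs.
import threading
import time

far_od_may_run = threading.Event()
far_od_may_run.set()  # Far-OD is allowed to run by default

def fast_od_cycle(run_fast_od, sleep_s=0.05):
    # Pause Far-OD, do the latency-critical work, sleep briefly so Far-OD
    # gets some cycles, then let it resume.
    far_od_may_run.clear()
    result = run_fast_od()
    time.sleep(sleep_s)
    far_od_may_run.set()
    return result

def far_od_step(run_far_od_chunk):
    # Far-OD checks the gate before each chunk of its slower processing.
    far_od_may_run.wait()
    return run_far_od_chunk()

print(fast_od_cycle(lambda: "fast map", sleep_s=0.0))
```

A dual-core CPU (option 3) makes this gating unnecessary, since each module can own a core.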
Results: Timing measures

Fast-OD sensor latency:     190 ms
Fast-OD actuation latency:  250 ms
Fast-OD period (frequency): 100 ms (10 Hz)
Far-OD actuation latency:   700 ms
Far-OD period (frequency):  370 ms (2-3 Hz)
Results
Short- and long-range navigation test (global map and vehicle map shown):
1. The first obstacle appears quickly and suddenly to the robot, testing short-range navigation.
2. A cul-de-sac tests "long"-range navigation.
The parallel architecture is consistently better at short- and long-range navigation than the series architecture or Fast-OD alone.
Note: here Fast-OD has a 5 m radius and Far-OD a 15 m radius.
Results: More recent results
Fast-OD + Far-OD in parallel:
- Short-range navigation consistently successful: 0 collisions over >5 runs
- Finishes the run in about 16 s, along the shortest path
- Fast-OD: 10 Hz, 250 ms, 3 m range; Far-OD: 3 Hz, 700 ms, 30 m range
Fast-OD + Far-OD in series:
- Short-range navigation consistently failing: >2 collisions over >5 runs
- Finishes the run in >40 s, along a longer path
- Fast-OD/Far-OD: 3 Hz, 700 ms, 3 m/30 m range (the frequency is acceptable but the latency is too high)
Videos 1 and 2: collision-free bucket maze. Videos 3, 4, 5: obstacle collisions due to high latency and period.
Results: More recent results
Fast-OD + Far-OD in parallel:
- Short-range navigation consistently successful: 0 collisions over >5 runs
- Fast-OD: 10 Hz, 250 ms, 3 m range; Far-OD: 3 Hz, 700 ms, 30 m range
Note: long-range planning is off, i.e. Far-OD is processing but ignored; only short-range navigation was tested here.
Video 6: natural obstacles. Video 7: tight maze of artificial obstacles.
Results: Moving obstacles
The system detects and avoids moving obstacles consistently.
Video 8: fast-moving obstacle.
Results: Beating humans
Autonomous short-range navigation is consistently better than inexperienced human drivers, and equal to or better than experienced human drivers. (Driving with only the robot's images would be even harder for a human.)
Video 9: experienced human driver.
Results: Processing speed vs. vision range dilemma
We showed that processing speed prevails over vision range for short-range navigation, whereas vision range prevails over speed for long-range navigation.
Only a 3 m vision range was necessary to build collision-free short-range navigation for a 1.3 m/s non-holonomic vehicle:
- Vehicle's worst-case stopping delay: 1.0 s
- System's worst-case reaction time: 0.25 s latency + 0.1 s period
- Worst-case reaction plus stopping delay: 1.35 s, or about 1.75 m at full speed
- Only 1.0 s of anticipation was necessary in addition to the worst-case reaction and stopping delay.
A 15 m vision range with higher latency and lower frequency consistently improved long-range navigation when run in parallel with the short-range module.
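The arithmetic above can be checked directly from the quoted figures:

```python
# Checking the stopping-distance arithmetic quoted in the slides for the
# 1.3 m/s LAGR vehicle with the Fast-OD timing figures.

SPEED_M_S = 1.3        # maximum vehicle speed
STOPPING_S = 1.0       # vehicle's worst-case stopping delay
LATENCY_S = 0.25       # Fast-OD actuation latency
PERIOD_S = 0.1         # Fast-OD period (10 Hz)

reaction_and_stop_s = LATENCY_S + PERIOD_S + STOPPING_S   # 1.35 s
distance_m = SPEED_M_S * reaction_and_stop_s              # about 1.75 m
print(reaction_and_stop_s, distance_m)
```

Since 1.75 m plus roughly 1 s of anticipation (another 1.3 m) stays comfortably inside the 3 m map radius, the short 3 m range is sufficient at this speed.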
Summary
We showed that both latency and frequency are critical in vision-based systems, because of their higher processing times.
A simple, very-low-resolution OD running in parallel with a high-resolution OD greatly increased the performance of a short- and long-range vision-based autonomous navigation system over the commonly used higher-resolution, sequential approaches:
- Processing speed prevails over range in short-range navigation, and only 1.0 s of anticipation in addition to the dynamics and processing delays was necessary.
- Additional key concepts, such as dynamics modeling, must be implemented to build a complete, successful end-to-end system.
A robust collision-free navigation platform, dealing with moving obstacles and beating humans, was successfully built; it leaves enough CPU cycles available for computationally expensive algorithms.
Questions?