DroNet: Learning to fly by driving

Presentation transcript:

DroNet: Learning to fly by driving Paper by: Antonio Loquercio, Ana I. Maqueda, Carlos R. del-Blanco, and Davide Scaramuzza Presented by: Logan Murray

Video of DroNet in Action

Problem: Safe and Reliable Outdoor Navigation of UAVs The drone must be able to avoid obstacles while navigating and to follow traffic and zoning laws. To solve these tasks, a robotic system must handle perception, control, and localization all at once, which becomes especially difficult in urban areas. If this problem is solved, UAVs could be used for many applications, e.g. surveillance, construction monitoring, delivery, and emergency response.

Traditional Approach A two-step process of automatic localization followed by computation of controls. The robot localizes against a map using GPS, visual, and range sensors; control commands then let it avoid obstacles while reaching the desired goal. Problems with this model: advanced SLAM algorithms struggle with dynamic scenes and appearance changes, and separating the perception and control blocks makes it hard to infer control commands from 3D maps. Splitting perception and control also eliminates any possible positive feedback between the two systems. SLAM is very good at localizing in a 3D environment when it has an explicit reconstruction (hiker example).

Recent Approach Use deep learning to couple perception and control, which greatly improves results on a large set of tasks. Problems: reinforcement learning suffers from high sample complexity, and supervised learning requires expert human flying trajectories to imitate. RL: because it needs so much sample data to learn a policy, it could never be applied in a setting like this, where drones operate in safety-critical environments; it would be too dangerous to let drones collect data where they could injure someone. SL: you would need many expert human pilots performing aggressive aerial maneuvers so the robot could learn how to react in dangerous situations, which would itself be dangerous for the pilots.

DroNet Approach A residual Convolutional Neural Network (CNN). It uses a pre-recorded outdoor dataset collected from cars and bikes, plus a custom dataset of outdoor collision sequences. Real-time processing from a single video camera on the UAV produces a steering angle and a probability of collision. The goal is a generalized system that can adapt to unseen environments. It does not try to replace the "map-localize-plan" approach, but rather to build a system that could work hand in hand with it: one stream of video input from which control commands are derived. Requires no state estimation or expert pilot data.

CNN Process The shared part of the architecture consists of a ResNet-8 with 3 residual blocks, followed by dropout and a ReLU non-linearity. Convolution: a kernel slides across the entire image with a certain stride, filtering the image. Max pool: takes the maximum value in each window and stores it as the output. ResNet: residual networks surpassed human performance on image classification; the details are beyond this lecture, but residual connections keep error dropping as network depth grows. Layer notation gives first the kernel size, then the number of filters, and finally the stride if it differs from 1 (the hyperparameters). Dropout: zeros out some activations to improve generalization. Batch normalization (BN): reduces internal covariate shift. ReLU: deals with the vanishing-gradient problem.
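The max-pool and ReLU operations described above can be sketched in a few lines of NumPy. This is an illustrative toy, not DroNet's actual implementation; the function names and the 4x4 input are my own.

```python
import numpy as np

def relu(x):
    # ReLU non-linearity: max(0, x) elementwise; its non-saturating
    # positive side helps with the vanishing-gradient problem
    return np.maximum(0.0, x)

def max_pool2d(img, size=2, stride=2):
    # Max pooling: keep only the largest value in each size x size window
    h, w = img.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = img[i * stride:i * stride + size,
                         j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

x = np.array([[1., -2., 3., 0.],
              [0., 5., -1., 2.],
              [-3., 1., 4., -4.],
              [2., 0., -2., 6.]])
pooled = max_pool2d(relu(x))   # 4x4 input -> 2x2 output
```

Note how pooling halves each spatial dimension while keeping the strongest activation in each window, which is how the network shrinks the image down through the residual blocks.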

Training the Steering and Collision Predictions Steering: mean-squared error (MSE). Collision prediction: binary cross-entropy (BCE). Once trained, a single optimizer computes both steering and collision at the same time. MSE assumes Gaussian-distributed errors; BCE plays the analogous role for classification, where the outputs lie between 0 and 1.
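The two losses named above can be written out directly. A minimal NumPy sketch (function names and sample values are my own, not from the paper):

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean-squared error for the steering regression head
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_true - y_pred) ** 2)

def bce(y_true, p_pred, eps=1e-7):
    # Binary cross-entropy for the collision classification head;
    # p_pred are predicted probabilities, clipped away from 0 and 1
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    y = np.asarray(y_true, dtype=float)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

steer_loss = mse([0.1, -0.2], [0.0, -0.25])  # small regression error
coll_loss = bce([1, 0], [0.9, 0.2])          # confident, mostly correct
```

MSE penalizes squared deviation of the predicted steering angle, while BCE penalizes the log of the probability assigned to the wrong collision label; a shared optimizer can minimize a weighted sum of the two.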

Datasets Udacity dataset of 70,000 images of car driving. 6 experiments: 5 for training, 1 for testing. Recorded channels include IMU, GPS, gear, brake, throttle, steering angle, and speed, along with one forward-looking camera.

Datasets (cont.) Collision dataset collected from a GoPro mounted on the handlebars of a bike: 32,000 images distributed over 137 sequences for a set of obstacles. Frames far from obstacles are labeled 0 (no collision), and frames very close are labeled 1 (collision). This data should not be collected by a drone, but it is necessary.

Drone Controls A combination of forward velocity and steering angle. When the collision probability is 0, forward velocity = Vmax; when it is 1, forward velocity = 0. A low-pass filter on the velocity Vk provides smooth, continuous inputs (alpha = 0.7). The steering angle is calculated by mapping the predicted scaled steering Sk into a rotation around the z-axis, then applying a low-pass filter (beta = 0.5).
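The control scheme above can be sketched as two first-order low-pass filters. This is a minimal illustration under my own reading of the slide: the pi/2 scaling that maps the normalized steering prediction to a z-axis rotation is an assumption, and the function names are mine.

```python
import math

def update_velocity(v_prev, p_collision, v_max, alpha=0.7):
    # Low-pass-filtered forward velocity: p_collision = 0 steers the
    # command toward v_max, p_collision = 1 brakes it toward 0
    return (1 - alpha) * v_prev + alpha * (1 - p_collision) * v_max

def update_steering(theta_prev, s_pred, beta=0.5):
    # Map the predicted scaled steering s_pred in [-1, 1] to a rotation
    # about the z-axis (pi/2 scaling assumed here), then low-pass filter
    return (1 - beta) * theta_prev + beta * (math.pi / 2) * s_pred

v = 0.0
for _ in range(10):
    # repeated p_collision = 0 drives the velocity smoothly toward v_max
    v = update_velocity(v, 0.0, v_max=1.0)
```

The filters prevent abrupt jumps in the commanded velocity and yaw when the network's predictions fluctuate frame to frame.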

Results Hardware specs: a Parrot Bebop 2.0 drone, and a Core i7 2.6 GHz CPU that receives images at 30 Hz from the drone over Wi-Fi.

Results (Cont.) Explained variance (EVA) is a metric used to quantify the quality of a regressor, defined as EVA = 1 − Var[y_true − y_pred] / Var[y_true]. The F-1 score quantifies the quality of a classifier: F-1 = 2 · precision · recall / (precision + recall). RMSE is the root-mean-squared error. DroNet, even though it is 80 times smaller than the current best architecture, maintains considerable prediction performance while achieving real-time operation (20 frames per second).
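For concreteness, the two evaluation metrics above can be computed as follows (a small sketch with my own function names; scikit-learn's `explained_variance_score` and `f1_score` provide equivalent, production-ready versions):

```python
import numpy as np

def explained_variance(y_true, y_pred):
    # EVA = 1 - Var[y_true - y_pred] / Var[y_true]; 1.0 is a perfect fit
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

def f1_score(tp, fp, fn):
    # F-1 = 2 * precision * recall / (precision + recall),
    # computed from true-positive, false-positive, false-negative counts
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

eva_perfect = explained_variance([1, 2, 3, 4], [1, 2, 3, 4])
f1 = f1_score(tp=8, fp=2, fn=2)
```

EVA evaluates the steering regression head, while F-1 evaluates the collision classification head, so the two together cover both of DroNet's outputs.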

Results (Cont.)

Results (Cont.) Zone A was also tested with the drone flying above a 5 m altitude, with no effect on the drone's operation. As long as the drone can see the line-shaped features that it follows, it can operate.

Results (Cont.)

Results (Cont.) The system always produced a safe, smooth flight, even when confronted with dangerous situations, e.g. sudden bikers or pedestrians in front of the robot. The network focuses on "line-like" patterns to predict crowded or dangerous areas, as shown by activation maps visualized with gradient-based localization.

Results (Cont.)

Pros vs. Limitations Pros: the drone can explore completely unseen environments with no previous knowledge; no localization or mapping is required; the CNN allows for high generalization capability; and the approach can run on resource-constrained platforms. Limitations: it does not fully exploit the agile movement capabilities of drones, and there is no real way to give the robot a goal to reach. To deal with movement, one could use 3D collision-free trajectories when the collision probability is high. For the goal, one could provide the network with a rough estimate of the distance to the goal, or provide a 2D map and use a learning-based approach as with ground robots.

References A. Loquercio, A. I. Maqueda, C. R. del Blanco, D. Scaramuzza, "DroNet: Learning to Fly by Driving," IEEE Robotics and Automation Letters (RA-L), 2018. Deshpande, Adit, "A Beginner's Guide to Understanding Convolutional Neural Networks," 20 July 2016, adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/.