Structured Hough Voting for Vision-based Highway Border Detection

Presentation transcript:

Structured Hough Voting for Vision-based Highway Border Detection Zhiding Yu Carnegie Mellon University

Autonomous Driving: Not If, But When

GM-CMU Collaborative Research
This paper is joint research between General Motors Company and CMU. We are working together to develop future autonomous driving vehicles, and the paper is part of this research project.

Sensors Setup on SRX Platform Images from: Junqing Wei et al., “Towards a Viable Autonomous Driving Research Platform,” IEEE Intelligent Vehicles Symposium (IV), 2013

Sensors: Price vs Information
[Figure: camera, lidar, and radar compared; axis label: Price]

Computer Vision Applications
Object detection (pedestrian, vehicle, bicycle…)
Road parsing (lane/border detection, road segmentation, vanishing point estimation…)
Localization and tracking
Driver status monitoring
Many other applications…

Motivation, Description and Goal
Development for future driving assistance and autonomous driving systems
Robust detection within a 0.5 to 6 meter detection range
Achieve near 100% accuracy in daytime and over 90% at nighttime on the rightmost lane
Handle various scenarios, including highway entrances and exits
Extend to a joint system with the front view
We have a monocular camera looking towards the side, and we use vision and learning algorithms to automatically detect the border and shoulder of a highway. By border we mean the physical end of the paved road; the red line in the image is the border returned by our algorithm. We also aim to detect the road shoulder, defined as the region between the rightmost solid lane and the border. For example, the blue line here is the lane marking, and the green region is the shoulder.

High-Level Idea: Learning based Method
[Figure: example border types (concrete barrier, guard rail, soft shoulder) and lane marking. Panel labels: densely fired scanning windows; returned voting points; structured Hough voting; border / lane marking hypotheses]
How does the algorithm work? In this paper, we train both a border detector and a lane marking detector, and perform scanning window detection. Our trained detector handles the various types of borders shown in the top row. Scanning window detection returns densely triggered detectors, and each triggered detector returns a voting point indicating approximately where the border or lane marking is. We then use our proposed structured Hough voting model to output both the border and the shoulder.

Dataset Collection
Overall 1592 training images:
Concrete Barrier (839 images)
Guard Rail (300 images)
Soft Shoulder (453 images)
Overall 2638 testing images:

Training Patch Alignment
For each sample patch, fix the width-height ratio at 2.
Center each patch on the y-coordinate of the ground truth.
Take 1-3 positive samples from each image, separated along the x-coordinate to cover as much of the image as possible.
Associate each positive sample with 3 negative samples of the same size, randomly selected from the background.
[Figure: positive samples (concrete, natural, steel, lane marker) and negative samples]
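The sampling rules on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `sample_patches`, the callable `border_y_at` (ground-truth border height at a given column), and the rule that a negative patch must sit more than one patch height away from the ground truth are all assumptions made for the example.

```python
import random

def sample_patches(img_w, img_h, border_y_at, patch_w,
                   n_pos=3, n_neg_per_pos=3, seed=0):
    """Return (positives, negatives) as lists of (x0, y0, w, h) boxes.

    border_y_at is a hypothetical callable: x -> ground-truth border y.
    """
    rng = random.Random(seed)
    patch_h = patch_w // 2  # width-height ratio fixed at 2
    positives, negatives = [], []
    for k in range(n_pos):
        # Spread positive patches evenly along x to cover the image.
        cx = int((k + 0.5) * img_w / n_pos)
        x0 = min(max(cx - patch_w // 2, 0), img_w - patch_w)
        # Center the patch vertically on the ground-truth border.
        cy = border_y_at(x0 + patch_w // 2)
        y0 = int(min(max(cy - patch_h // 2, 0), img_h - patch_h))
        positives.append((x0, y0, patch_w, patch_h))
        # Draw same-size negatives at random background locations.
        found, attempts = 0, 0
        while found < n_neg_per_pos and attempts < 1000:
            attempts += 1
            nx = rng.randint(0, img_w - patch_w)
            ny = rng.randint(0, img_h - patch_h)
            gt_y = border_y_at(nx + patch_w // 2)
            # Assumed background test: patch center far from the border.
            if abs(ny + patch_h // 2 - gt_y) > patch_h:
                negatives.append((nx, ny, patch_w, patch_h))
                found += 1
    return positives, negatives
```

For a 640x480 image with a flat border at y = 240, `sample_patches(640, 480, lambda x: 240, 64)` gives three 64x32 positives centered on the border and nine background negatives of the same size.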

Feature Extraction
[Figure: patches are processed by a filter bank and by HOG, giving a concatenated filter bank feature and a concatenated HOG feature. Panels: patches that are discriminative to HOG; patches that are discriminative to filter banks]

Classification & Detection
Extract features from all training patches (based on previous page)
Perform Fisher discriminant analysis
Train an RBF kernel SVM
Scanning window detection (deliberately having a lot of positive firing)
[Figure: detections on guard rail, soft shoulder, concrete barrier, and lane marking]
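The scanning window step can be sketched as follows. The trained classifier is replaced here by a hypothetical `score_fn` callable, and using the window center as the voting point is an assumption; the slides only say that each triggered detector returns a voting point near the border or lane marking. A low threshold reproduces the deliberately dense firing.

```python
def scan_windows(img_w, img_h, win_w, stride, score_fn, threshold=0.0):
    """Slide a 2:1 window over the image and collect voting points.

    score_fn(x0, y0, w, h) is a hypothetical stand-in for the trained
    classifier. Returns a list of (x, y, weight) voting points, one
    per window whose score exceeds the threshold.
    """
    win_h = win_w // 2  # same 2:1 ratio as the training patches
    votes = []
    for y0 in range(0, img_h - win_h + 1, stride):
        for x0 in range(0, img_w - win_w + 1, stride):
            score = score_fn(x0, y0, win_w, win_h)
            if score > threshold:
                # Assumed voting point: the center of the fired window.
                votes.append((x0 + win_w / 2, y0 + win_h / 2, score))
    return votes
```

The classifier score doubles as the vote weight, so more confident windows contribute more in the subsequent Hough voting.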

Hough Voting

Structured Hough Voting: Intuitions
Basic philosophy: a model that assumes voting results are correlated rather than independent
Inter-frame structural info on hypotheses (temporal smoothness)
Intra-frame structural info (geometric relationship)
Multiple candidate hypotheses generation (proposals with diversity):
  Constrained Hough voting on detected voting points (detection + tracking)
  Arbitrary Hough voting on detected voting points (detection)
  Constrained Hough voting on image gradients (pure tracking)

Purpose of Candidate 1
Deals with most of the frames, where hypotheses from consecutive frames are strongly correlated.

Purpose of Candidate 2
Automatically corrects the result by searching for “much better” voting configurations. (This is the power of detection; it avoids errors from tracking.)

Purpose of Candidate 3
In the worst case, where Type 1 voters fail, performs tracking by gradients starting from the previous pose configuration.

Modeling under CRF: Background
A Conditional Random Field (CRF) discriminatively defines the joint posterior probability as the product of a set of potentials.
The potentials are functions with the hypotheses Hi as variables. They are modeled so that a larger potential value generally indicates a better hypothesis configuration.
CRF inference seeks the joint hypothesis configuration H that maximizes this posterior.
[Figure: chain of hypotheses H1, H2, …, HN with observations X1, X2, …, XN, annotated with unary and pairwise potentials]

Modeling under CRF: Intuition
What are the hypotheses Hi? For example, image pixel labels (FG/BG, object class, etc.) in a segmentation problem.
In our problem, Hi is the Hough voting hypothesis: Hi = (r, θ).
X is the observation: the voting point coordinates and their weights.
The unary potential corresponds to the exponential of the Hough voting weights: exp(v(Hi)).
The pairwise potential corresponds to the inter-frame smoothness (tracking) constraint.
[Figure: chain of hypotheses H1, H2, …, HN with observations X1, X2, …, XN]
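The transcript does not preserve the slide's equations. As a hedged reconstruction, a standard chain CRF consistent with the description above would look like the following, where the symbols \phi, \psi and Z(X) are notation assumed here and the exact form of the pairwise term is not given in the slides:

```latex
P(H \mid X) = \frac{1}{Z(X)} \prod_{i=1}^{N} \phi(H_i, X_i) \prod_{i=1}^{N-1} \psi(H_i, H_{i+1}),
\qquad \phi(H_i, X_i) = \exp\big(v(H_i)\big)

H^{*} = \arg\max_{H} P(H \mid X)
      = \arg\max_{H} \left[ \sum_{i=1}^{N} v(H_i) + \sum_{i=1}^{N-1} \log \psi(H_i, H_{i+1}) \right]
```

Taking the logarithm shows why the unary term reduces to the Hough voting score: maximizing the posterior is the same as maximizing the summed voting weights plus the log smoothness terms.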

No Structural Information
Simplest case: frame-wise independent Hough voting
[Figure: graphical model over border hypotheses Hbd,1, Hbd,2, …, Hbd,N and lane hypotheses Hln,1, Hln,2, …, Hln,N with observations X1, X2, …, XN]
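Frame-wise independent Hough voting over weighted voting points can be sketched as below. The line is parameterized in normal form, r = x cos θ + y sin θ, matching the hypothesis H = (r, θ); the bin sizes and the function name `hough_vote` are assumptions for illustration.

```python
import math

def hough_vote(points, n_theta=180, r_step=2.0):
    """Weighted Hough voting for a single line.

    points: list of (x, y, weight) voting points.
    Returns ((r, theta), score) for the accumulator bin with the
    largest total weight, with the line written in normal form
    r = x*cos(theta) + y*sin(theta).
    """
    acc = {}
    for x, y, w in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            r = x * math.cos(theta) + y * math.sin(theta)
            key = (round(r / r_step), k)  # quantize r into bins
            acc[key] = acc.get(key, 0.0) + w
    (r_bin, k), score = max(acc.items(), key=lambda kv: kv[1])
    return (r_bin * r_step, math.pi * k / n_theta), score
```

Because every frame is processed on its own, a handful of strong false detections can pull the winning bin away from the true border, which is what the structural terms in the following slides are meant to prevent.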

Adding Inter-frame Structural Info.
Adding temporal smoothness: Hough voting constrained by neighboring frames
[Figure: graphical model over border hypotheses Hbd,1, Hbd,2, …, Hbd,N and lane hypotheses Hln,1, Hln,2, …, Hln,N with observations X1, X2, …, XN]
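One simple way to constrain Hough voting by the neighboring frame is to search only a small (r, θ) window around the previous hypothesis. The sketch below takes that approach as an assumption; the slides do not specify how the constraint is implemented, and the window sizes, step sizes, and inlier tolerance are illustrative values.

```python
import math

def constrained_hough_vote(points, prev, dr=10.0, dtheta=math.radians(5),
                           r_step=2.0, theta_step=math.radians(1), tol=1.5):
    """Hough voting restricted to a window around the previous hypothesis.

    points: list of (x, y, weight); prev: (r, theta) from frame t-1.
    A point supports a line if its distance to the line is at most tol.
    Returns ((r, theta), score) for the best line inside the window.
    """
    r0, t0 = prev
    n_r = int(dr / r_step)
    n_t = int(round(dtheta / theta_step))
    best, best_score = prev, -1.0
    for i in range(-n_r, n_r + 1):
        for j in range(-n_t, n_t + 1):
            r, t = r0 + i * r_step, t0 + j * theta_step
            c, s = math.cos(t), math.sin(t)
            score = sum(w for x, y, w in points
                        if abs(x * c + y * s - r) <= tol)
            if score > best_score:
                best, best_score = (r, t), score
    return best, best_score
```

A strong distractor line far from the previous hypothesis is never evaluated, so it cannot win, which is the tracking behaviour this slide describes. The cost is that a genuinely better hypothesis outside the window is also missed.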

Adding Intra-frame Structural Info.
Adding geometric constraint: Hough voting constrained by both neighboring frames and intra-frame hypotheses
[Figure: graphical model over border hypotheses Hbd,1, Hbd,2, …, Hbd,N and lane hypotheses Hln,1, Hln,2, …, Hln,N with observations X1, X2, …, XN]

The Structured Hough Voting Model
[Figure labels: Candidate Hypotheses Generation Unit; Coupled Structure Potential; Mode Selection Potential]

The Structured Hough Voting Model

Candidate Hypotheses Generation Unit

Mode Selection Potential
A decision tree guides the mode selection. Mode selection essentially forces the output to be one of the candidate hypotheses, but allows the output to disagree with the decision tree prediction at the cost of a penalty.
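A minimal sketch of this selection rule, assuming each candidate carries a scalar voting score and the decision tree outputs a predicted mode. The additive penalty and the name `select_mode` are assumptions; the slides do not give the exact form of the potential.

```python
def select_mode(candidate_scores, predicted_mode, penalty=1.0):
    """Pick one candidate mode, penalizing disagreement with the tree.

    candidate_scores: dict mapping mode id -> voting score.
    predicted_mode: mode predicted by the (hypothetical) decision tree.
    The output is always one of the candidates; choosing a mode other
    than the predicted one costs a fixed penalty.
    """
    def total(mode):
        disagreement = 0.0 if mode == predicted_mode else penalty
        return candidate_scores[mode] - disagreement

    return max(candidate_scores, key=total)
```

With scores `{1: 5.0, 2: 5.5, 3: 1.0}` and a tree prediction of mode 1, the small advantage of mode 2 does not cover the penalty, so mode 1 is kept; with scores `{1: 5.0, 2: 9.0, 3: 1.0}` the evidence for mode 2 overrides the prediction.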

Coupled Structure Potential
The coupled structure potential captures the two most important relations between a border hypothesis and a lane hypothesis: parallelism and distance.
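One way such a potential could be written is as a product of two Gaussian-shaped terms, one on the angle difference and one on the deviation from an expected offset. The Gaussian form, the parameters `expected_gap`, `sigma_theta` and `sigma_r`, and the function name are all assumptions for illustration; the slides name only the two relations.

```python
import math

def coupled_potential(border, lane, expected_gap,
                      sigma_theta=math.radians(3), sigma_r=10.0):
    """Score the geometric agreement of a border and a lane hypothesis.

    border, lane: (r, theta) line hypotheses in normal form.
    Returns a value in (0, 1]: 1 when the lines are parallel and
    separated by expected_gap, smaller as either relation is violated.
    """
    (r_b, t_b), (r_l, t_l) = border, lane
    d_theta = t_b - t_l                      # parallelism term
    d_gap = abs(r_b - r_l) - expected_gap    # distance term
    return math.exp(-0.5 * (d_theta / sigma_theta) ** 2
                    - 0.5 * (d_gap / sigma_r) ** 2)
```

Following the CRF convention stated earlier, a larger value indicates a better joint configuration, so a border hypothesis that crosses the lane marking or sits implausibly close to it is scored down.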

Inference
Running full inference each time a new frame arrives is computationally infeasible.
Relaxation: initialize with the inferred state variable configuration of the previous t-1 frames and infer the current state variables, updating incrementally.
Inference procedure at t = 1:
1. Perform Hough voting for both border and lane marking
2. Perturb hypotheses if the geometric relationship is violated (optional)
Inference procedure at t > 1:
1. Generate the 3 candidate hypotheses for both border and lane marking
2. Use the decision tree to help select the best candidate
3. Perturb candidate hypotheses if the geometric relationship is violated (optional)
4. Re-select the best candidate
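The incremental procedure can be outlined as a loop that keeps only the previous frame's result. In this sketch every callable (`first_frame_vote`, `generate_candidates`, `predict_mode`, `perturb`) is a hypothetical placeholder for a component described in the slides, a single hypothesis per frame stands in for the border and lane pair, and the additive penalty is an assumption.

```python
def incremental_inference(frames, first_frame_vote, generate_candidates,
                          predict_mode, penalty=1.0, perturb=None):
    """Infer one hypothesis per frame, reusing the previous result.

    generate_candidates(obs, prev) -> dict mode -> (hypothesis, score)
    predict_mode(candidates) -> mode suggested by the decision tree
    perturb(hypothesis, score) -> (hypothesis, score), optional fix-up
    """
    def pick(cands, predicted):
        return max(cands, key=lambda m: cands[m][1]
                   - (0.0 if m == predicted else penalty))

    hypotheses = []
    for t, obs in enumerate(frames):
        if t == 0:
            # First frame: plain Hough voting, no history available.
            hyp = first_frame_vote(obs)
        else:
            cands = generate_candidates(obs, hypotheses[-1])
            predicted = predict_mode(cands)
            best = pick(cands, predicted)
            if perturb is not None:
                # Optional: repair geometric violations, then re-select.
                cands = {m: perturb(h, s) for m, (h, s) in cands.items()}
                best = pick(cands, predicted)
            hyp = cands[best][0]
        hypotheses.append(hyp)
    return hypotheses
```

Each frame costs one candidate generation and one selection, independent of how many frames came before, which is what makes the relaxation suitable for online use.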

Experiments: Adding Coupled Structure

Experiments: Qualitative Results
Ground truth and baseline methods:
Ground truth
Independent Hough voting in each frame using the fired detector voting points
Hough voting using the triggered detector voting points, constrained by the previous frame
Adding gradient tracking to Baseline 2
Kalman filter
Proposed method

Experiments: Quantitative Results

Highway Entrance Detection and Lane State Tracking

Summary
Proposed the Structured Hough Voting Model
The proposed model can be theoretically formulated under a CRF
Fast real-time feature extraction and online inference
Achieves robust performance under challenging scenarios and with low quality inputs from a production camera

Thank You! Q & A