Traffic Sign Recognition Using Discriminative Local Features
Andrzej Ruta, Yongmin Li, Xiaohui Liu
School of Information Systems, Computing and Mathematics, Brunel University, West London

Agenda
Problem description
Image representation and feature selection
  Colour discretisation
  Colour Distance Transform (CDT)
  Local regions and local dissimilarity
  Discriminative region selection algorithm
Traffic sign recognition
  System outline
  Temporal classification
Results
Conclusions

Problem description
Input: real-time video stream from a car-mounted, front-looking camera
Output: appropriate visual information or audio signal produced, or action taken, upon detection and recognition of a sign
Points to consider:
  A priori knowledge about the model signs
  Robustness to noise, varying illumination, uneven motion, etc.
  Real-time performance requirement
  High cost of false negatives/false positives

Colour discretisation
Why bother?
  Worthwhile whenever the objects of interest contain only a sparse set of colours
  Helps avoid ambiguities
  Reduces the computational burden
Scenario 1 – clean template images available
  Merely a change of the physical image representation
  Proper thresholding in Hue-Saturation-Value space
Scenario 2 – real images available
  A Gaussian mixture model for each distinct colour
  Supervised training using EM
  On-line model-driven classification
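
As a rough illustration of Scenario 1, the sketch below discretises a template image into a handful of sign colours by thresholding in HSV space. The function name and all threshold values are illustrative assumptions, not the ones used by the authors.

```python
import cv2
import numpy as np

# Illustrative HSV thresholds only; the ranges used in the paper are not given here.
# OpenCV hue lies in [0, 179], saturation and value in [0, 255].
def discretise_colours(bgr_image):
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)

    labels = np.zeros(h.shape, dtype=np.uint8)          # 0 = "other"
    achromatic = s < 60
    labels[achromatic & (v > 170)] = 1                  # white
    labels[achromatic & (v < 80)] = 2                   # black
    chromatic = ~achromatic
    labels[chromatic & ((h < 10) | (h > 170))] = 3      # red
    labels[chromatic & (h > 100) & (h < 130)] = 4       # blue
    labels[chromatic & (h > 20) & (h < 35)] = 5         # yellow
    return labels
```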

Colour discretisation – example

Colour Distance Transform (CDT)
The idea:
  Input: a discretised colour image; output: a map of distances to the nearest pixel of a given colour
  Pixels of the colour of interest are treated as feature pixels, all other pixels as non-feature pixels
  Different distance metrics are possible, e.g. the (3,4) chamfer metric
Sample output: original image, black CDT, white CDT, red CDT
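
A minimal sketch of the (3,4) chamfer distance transform for one colour, assuming the discretised label image from the previous step as input; the function and variable names are illustrative, not taken from the authors' code.

```python
import numpy as np

def chamfer_cdt(labels, colour, inf=10**6):
    """Two-pass (3,4) chamfer distance transform: distance from every pixel
    to the nearest pixel carrying the given discrete colour label."""
    rows, cols = labels.shape
    d = np.where(labels == colour, 0, inf).astype(np.int64)

    # Forward pass (top-left to bottom-right).
    for y in range(rows):
        for x in range(cols):
            if y > 0:
                d[y, x] = min(d[y, x], d[y - 1, x] + 3)
                if x > 0:
                    d[y, x] = min(d[y, x], d[y - 1, x - 1] + 4)
                if x < cols - 1:
                    d[y, x] = min(d[y, x], d[y - 1, x + 1] + 4)
            if x > 0:
                d[y, x] = min(d[y, x], d[y, x - 1] + 3)

    # Backward pass (bottom-right to top-left).
    for y in range(rows - 1, -1, -1):
        for x in range(cols - 1, -1, -1):
            if y < rows - 1:
                d[y, x] = min(d[y, x], d[y + 1, x] + 3)
                if x > 0:
                    d[y, x] = min(d[y, x], d[y + 1, x - 1] + 4)
                if x < cols - 1:
                    d[y, x] = min(d[y, x], d[y + 1, x + 1] + 4)
            if x < cols - 1:
                d[y, x] = min(d[y, x], d[y, x + 1] + 3)

    return d
```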

Local regions and local dissimilarity
  Local image dissimilarity within a region r_k
  Average image dissimilarity over the region set
  Weighted average image dissimilarity over the region set
(a plausible form of these quantities is sketched below)
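
The formulas referred to above were slide images and are not in the transcript; the LaTeX below is only a plausible reconstruction, consistent with the slide's wording and with the CDT-based matching introduced earlier. The exact normalisation used by the authors may differ.

```latex
% Local dissimilarity between observed image I and template T within region r_k,
% accumulated from the template's colour distance transforms d_CDT:
\[
  d_{r_k}(I, T) \;=\; \frac{1}{|r_k|} \sum_{p \in r_k} d_{\mathrm{CDT}}\bigl(I(p), T\bigr)
\]
% Average dissimilarity over the region set R = {r_1, ..., r_K}:
\[
  D(I, T) \;=\; \frac{1}{K} \sum_{k=1}^{K} d_{r_k}(I, T)
\]
% Weighted average with region relevance weights w_k:
\[
  D_w(I, T) \;=\; \frac{\sum_{k=1}^{K} w_k \, d_{r_k}(I, T)}{\sum_{k=1}^{K} w_k}
\]
```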

Discriminative region selection algorithm
Assuming a category of targeted object classes and an unknown image, determine the class of the image by maximising the posterior, parameterised by:
  an indexing variable determining the set of regions to be used
  a vector of region relevance weights
To learn the best model parameters, an objective function given on the slide is maximised (a generic form is sketched below).
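
The posterior and the objective themselves were slide images; what follows is only a generic MAP-style formulation matching the slide's description, with the class index c, region subset index s and relevance weights w as assumed symbols. The authors' exact objective may differ.

```latex
% Classify an unknown image I within a category of classes c = 1..C,
% given a region subset indexed by s and relevance weights w:
\[
  c^{*} \;=\; \arg\max_{c} \; P\bigl(c \mid I;\, s, w\bigr)
\]
% Generic training objective: choose (s, w) maximising the posterior of the
% correct class over the training templates {(I_n, c_n)}:
\[
  (s^{*}, w^{*}) \;=\; \arg\max_{s, w} \; \sum_{n} \log P\bigl(c_n \mid I_n;\, s, w\bigr)
\]
```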

Discriminative region selection algorithm – cont.
(figure: the target-class template compared against the templates of the other classes in the category)
For each template being compared to the target-class template, a dissimilarity map is computed; the procedure stops when the criterion shown on the slide is met.

Discriminative region selection algorithm – cont.
Determining the weights by merging the region ranks (a sketch of one possible merging rule is given below).
Sample output: (figure on slide)
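
A hedged sketch of rank merging, under the assumption that each pairwise comparison from the previous step yields a per-region discriminative score and that the final relevance weight of a region grows with its merged rank; the names and the exact merging rule are illustrative, not the authors'.

```python
import numpy as np

def merge_region_ranks(pairwise_scores):
    """pairwise_scores: array of shape (num_comparisons, num_regions), where a
    higher score means the region separates the target class better from that
    particular 'other' class. Returns normalised region relevance weights."""
    # Rank regions within each pairwise comparison (0 = least discriminative).
    ranks = np.argsort(np.argsort(pairwise_scores, axis=1), axis=1)

    # Merge ranks across comparisons and normalise to weights summing to 1.
    merged = ranks.sum(axis=0).astype(float)
    return merged / merged.sum()
```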

Traffic sign recognition – system outline

Traffic sign recognition – system outline
Image preprocessing
  Intended to highlight characteristic colours and edges
  Involves colour region clustering to determine RoIs of a certain size
  Relevant colours are enhanced within each RoI to extract colour edges
Detection
  Loy & Barnes's [2005] equiangular polygon detector
  Colour edge images used as input
Tracking
  Used only for search region reduction
  Kalman filter with strong motion assumptions (a sketch follows this list)
Sign representation
  A separate discriminative region model trained for each sign
  Dissimilarity threshold individually tuned for each sign category
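
A minimal constant-velocity Kalman filter sketch for predicting the next search region of a tracked sign candidate, assuming the state holds the RoI centre and scale together with their velocities. The class name, state layout and noise levels are illustrative assumptions, not the system's actual parameters.

```python
import numpy as np

class RoiKalmanTracker:
    """Constant-velocity Kalman filter over the state
    [x, y, s, vx, vy, vs]: RoI centre (x, y), scale s, and their velocities."""

    def __init__(self, x, y, s, dt=1.0):
        self.state = np.array([x, y, s, 0.0, 0.0, 0.0])
        self.P = np.eye(6) * 10.0                      # initial uncertainty
        self.F = np.eye(6)                             # constant-velocity model
        self.F[0, 3] = self.F[1, 4] = self.F[2, 5] = dt
        self.H = np.zeros((3, 6))                      # only (x, y, s) is observed
        self.H[0, 0] = self.H[1, 1] = self.H[2, 2] = 1.0
        self.Q = np.eye(6) * 0.1                       # process noise (assumed)
        self.R = np.eye(3) * 2.0                       # measurement noise (assumed)

    def predict(self):
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:3]                          # predicted search region

    def update(self, measurement):
        z = np.asarray(measurement, dtype=float)       # detected (x, y, s)
        innovation = z - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ innovation
        self.P = (np.eye(6) - K @ self.H) @ self.P
```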

Temporal classification
Single frame
  Maximum-likelihood approach
  Maximisation of the likelihood is equivalent to the minimisation of the distance over the class index i
  The regions and weights are those learned in the training stage
Video sequence
  Integration of consecutive frame observations instead of individual per-frame classifications
  The classifier's decision at time t is determined from the accumulated observations (one plausible rule is sketched below)
  Observation relevance depends on the candidate's age (and thus its size)
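
The decision rule on the slide was an image; the expression below is only a plausible reconstruction, consistent with the temporally weighted cumulative distances (discount factor b, written β here) reported in the results, where d_τ(i) denotes the frame-τ distance of the tracked candidate from template i.

```latex
% Temporally weighted cumulative distance up to frame t; beta < 1 discounts
% older observations, when the candidate sign was smaller in the image:
\[
  D_t(i) \;=\; \sum_{\tau = 1}^{t} \beta^{\,t-\tau} \, d_\tau(i),
  \qquad
  i^{*}_t \;=\; \arg\min_{i} \, D_t(i)
\]
```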

Results
Test data: real-life video recorded from a moving car; 88 clips, 144 signs; urban, countryside and motorway scenes.

              td     RC (55)   BC (25)   YT (42)   BS (13)   Overall (135)
  detected    -      86.4%     100.0%    96.3%     94.4%     95.8%
  recognised  0.9    90.6%     73.6%     91.2%     85.5%
              0.7    88.7%     70.6%     87.7%
              0.5    89.5%     96.9%     79.2%     58.3%     80.4%
              best   93.5%

Results

Results
Classification of signs over time. The ratio of the cumulative distance from the best matching template (upper sign next to each chart) to the cumulative distance from the second best matching template (lower sign next to each chart) is marked with a solid red line. The same ratio computed from temporally weighted cumulative distances is marked with dashed lines: green (b = 0.8) and blue (b = 0.6).

Conclusions
Current solution
  Decent but not sufficient for real-life applications
  Large gamut of signs recognised
  Main contribution: road sign representation through discriminative local regions, the CDT-based distance metric, and the region selection method
Further work
  Detection – robustness to noise, occlusions, reduced parametrisation
  Tracking – probabilistic temporal evolution of pixel/region "interestingness"
  Representation – capturing correlations between local regions, alternative definitions of visual saliency
  Classification – relaxing the temporal independence assumption

Thank you