Recognition of Traffic Lights in Live Video Streams on Mobile Devices


Recognition of Traffic Lights in Live Video Streams on Mobile Devices. Jan Roters, Xiaoyi Jiang, Kai Rothaus. IEEE Transactions on CSVT, 2011.

Outline: Introduction; Problems; System Architecture; Identification; Classification; Video Analysis; Time-Based Verification; Experiment Results; Evaluations; Conclusion.

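Introduction: People with visual disabilities are limited in mobility. Earlier approaches oriented pedestrians using zebra crossings at intersections, or used a portable PC with a digital camera and a pair of stereo earphones. This work presents a system for mobile devices that helps blind people cross roads.
Notes: Guide dogs are hard to train and very expensive, so many people have tried other ways to guide visually impaired pedestrians. Early work used a camera to detect the direction of zebra crossings and tell the user which way to walk at an intersection. Another approach put a portable PC in a backpack, captured the street scene with a digital camera, analyzed it, and told the user through a pair of earphones whether the light was red or green. Traffic lights rarely emit sound, and it is hard to convey the current signal through touch. Because mobile phones are now very common and their cameras and computing power keep improving, the authors of this paper want to build similar software directly on the phone, which makes it a well-suited platform.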

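Problems: Program usage; real-world conditions; camera resolution; different appearances.
Notes: Implementing this system is difficult. Visually impaired users do not know where the traffic light is, so how can they point the phone camera at it? The real world is complex, camera resolution differs from phone to phone, and traffic lights look different in every country.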

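Problems: The scale of traffic lights; many traffic lights; occlusion; illumination; rotation.
Notes: Road widths differ. On a wide road, the traffic light on the far side becomes very small and hard to make out. Passing vehicles occlude the light across the road. Lighting conditions and the viewing angle also vary.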

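Pedestrian Lights in Germany: Installation; shape; color arrangement; circuitry; background.
Notes: To simplify the problem, the authors restrict the system to German traffic lights, so here is what they look like. Installation: pedestrian lights stand upright on the opposite side of the road. Shape: the housing is rectangular and comes in three variants, with two, three, or four lamps. Color arrangement: there is no yellow light, only red (do not walk) and green (walk), and the red lamp is always above the green one. Circuitry: red and green are never lit at the same time unless the light is broken. Background: the housing is black, as it is here.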

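Mobile Device & Databases: Nokia N95 with a 330 MHz ARM processor, 18 MB RAM, and 320×240 resolution. Two publicly available databases. Ground-truth segmentation was made manually.
Notes: The N95 is very popular among visually impaired users because it offers a lot of software designed for them, such as screen readers, mobile reading, and shopping assistants.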

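System Architecture: (1) Localization; (2) Classification; (3) Video Analysis; (4) Time-Based Verification.
Notes: The upper path of the diagram works on a single image: localization first finds every region that might contain a traffic light, then classification determines where the real traffic light is. The lower path is video analysis, which uses information from the previous frame and motion estimation to predict where the traffic light should be. Finally, time-based verification compares the traffic lights found by the two paths and decides whether the detection is correct.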

1. Localization
Notes: To reduce the number of candidates that classification has to examine, localization applies several filters that discard unsuitable regions.

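Red and Green Color Filter (1/3): Analyze the data.
Notes: The ground truth is first marked by hand, and the color distribution inside the marked red and green lights is then analyzed. For red lights the distribution has three components: (1) gray, (2) black, (3) red. A Gaussian mixture model represents the three components. Only the red component is the actual signal, so only that distribution is kept.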

Red and Green Color Filter (2/3): Design the filter rules (example: red traffic light). The Gaussian distribution of the red cluster is defined by its mean color μ = (0.48, 0.06, 0.07) and has three eigenvectors v_1, v_2, and v_3. A color c = (r, g, b) is a red traffic light color when three threshold conditions hold, with th_red,1 = 0.20, th_red,2 = 0.25, and th_red,3 = 0.07.
Notes: Next, the mean of the red cluster is computed and the three conditions are used as the filter. v_1, v_2, and v_3 are the eigenvectors for the r, g, and b directions. The red intensity must exceed a threshold, v_2 is limited along the gray diagonal, and v_3 is confined to the low-intensity range, that is, the black part.
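The slide gives the mean color and the three thresholds but not the inequalities or the eigenvector values, so the following is only a minimal sketch of how such a rule could be applied. It assumes a simple box test on the projections of c - μ onto the eigenvectors, and it uses the RGB axes as hypothetical stand-ins for v_1, v_2, and v_3. The function name and both assumptions are illustrative, not the authors' rule.

```python
# Mean color and thresholds of the red cluster, as stated on the slide.
MU_RED = (0.48, 0.06, 0.07)
TH_RED = (0.20, 0.25, 0.07)

# ASSUMPTION: the eigenvectors are not listed in the slides; the RGB axes
# are used here purely as hypothetical placeholders.
V_RED = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_red_light_color(c, mu=MU_RED, vecs=V_RED, th=TH_RED):
    """Return True if color c = (r, g, b) in [0, 1] passes the box test.

    ASSUMPTION: the test |(c - mu) . v_i| <= th_i for every i is an
    illustrative choice; the slides do not preserve the real conditions.
    """
    diff = tuple(ci - mi for ci, mi in zip(c, mu))
    return all(abs(dot(diff, v)) <= t for v, t in zip(vecs, th))
```

Under these assumptions a saturated red such as (0.6, 0.1, 0.1) passes, while a mid gray such as (0.5, 0.5, 0.5) is rejected because its green and blue components lie too far from the cluster mean.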

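Red and Green Color Filter (3/3): Optimize parameters. 10^4 different parameter settings for each color. Use 300 images to train. Measure the quality of each setting by TP, FP, and FN: Recall = TP / (TP + FN), Precision = TP / (TP + FP).
Notes: Red and green together have 8 parameters to train (3 thresholds and 1 mean μ each). Each setting is run on the 300 images and its recall and precision are computed. For red lights, missing the light is more dangerous than a false detection: after a false detection the user at worst skips this green phase and waits for the next one, but after a miss the user might simply walk onto the road. The best precision is therefore chosen subject to recall >= 75% (recall = 76%, precision = 89.5%). For green lights, a false detection is worse than a miss (FP is less acceptable than FN), because reporting a red light as green is unforgivable. The best recall is therefore chosen subject to precision >= 98.5% (precision = 98.5%, recall = 85.0%).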
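The selection rule described above (best precision under a recall floor for red, best recall under a precision floor for green) can be sketched as follows. The `Result` type, the function names, and any counts passed in are hypothetical; only the two formulas and the two constraints (75% and 98.5%) come from the slide.

```python
from typing import NamedTuple, Optional, Sequence


class Result(NamedTuple):
    """TP/FP/FN counts measured for one parameter setting (hypothetical type)."""
    name: str
    tp: int
    fp: int
    fn: int


def recall(r: Result) -> float:
    """Recall = TP / (TP + FN); 0.0 when there are no positives."""
    return r.tp / (r.tp + r.fn) if (r.tp + r.fn) else 0.0


def precision(r: Result) -> float:
    """Precision = TP / (TP + FP); 0.0 when nothing was detected."""
    return r.tp / (r.tp + r.fp) if (r.tp + r.fp) else 0.0


def best_for_red(results: Sequence[Result], min_recall: float = 0.75) -> Optional[Result]:
    """Red: among settings with enough recall, pick the most precise one."""
    ok = [r for r in results if recall(r) >= min_recall]
    return max(ok, key=precision) if ok else None


def best_for_green(results: Sequence[Result], min_precision: float = 0.985) -> Optional[Result]:
    """Green: among settings with enough precision, pick the highest recall."""
    ok = [r for r in results if precision(r) >= min_precision]
    return max(ok, key=recall) if ok else None
```

The asymmetry is the design choice worth noting: the constraint is placed on whichever error is more dangerous for that color, and the other metric is then maximized.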

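Size/Circuitry Filter: Assume the traffic light is 4 to 24 meters away. Given the fixed camera focal length and the possible aspect ratios, filter out regions that are too small or too large. A vertical neighbor should not have a different color.
Notes: From these assumptions the width of a traffic light works out to roughly 2.5 to 15 pixels, and a height range follows in the same way. These two ranges are used to discard regions whose area is too far off.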
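The width range can be reproduced with a pinhole-camera sketch, in which the projected width is inversely proportional to distance. The slide gives neither the focal length nor the physical lamp width, so the constant below is only inferred from the stated figures (15 pixels at 4 m, which also yields 2.5 pixels at 24 m); the names are illustrative.

```python
NEAR_M = 4.0    # closest assumed distance to the traffic light (from the slide)
FAR_M = 24.0    # farthest assumed distance (from the slide)

# ASSUMPTION: focal length (in pixels) times physical width (in meters) is
# inferred from the slide's figures: 15 px at 4 m gives 60 px*m.
FOCAL_TIMES_WIDTH = 15.0 * NEAR_M


def projected_width_px(distance_m: float) -> float:
    """Pinhole model: width in pixels = focal * real width / distance."""
    return FOCAL_TIMES_WIDTH / distance_m


def width_plausible(width_px: float) -> bool:
    """Keep a region only if its width fits the 4 m to 24 m distance range."""
    return projected_width_px(FAR_M) <= width_px <= projected_width_px(NEAR_M)
```

The same check applies to the height once a height constant is chosen; the slide does not state one, so it is left out here.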

Background Color Filter: Inspect the region under a red light candidate or above a green light candidate. If there are no dark pixels within the search region, reject the candidate. (Figure: search regions below a red candidate and above a green candidate.)
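A minimal sketch of this check follows, assuming a grayscale image stored as a list of rows with intensities in [0, 1]. The darkness threshold, the default height of the search region, and the function name are hypothetical, since the slide does not give them.

```python
def has_dark_background(gray, box, color, region_h=None, dark_th=0.2):
    """Check for dark housing pixels next to a lamp candidate.

    gray: 2D list of intensities in [0, 1].
    box: (top, left, height, width) of the candidate region.
    color: "red" searches below the candidate, "green" searches above it.
    ASSUMPTION: region_h defaults to the candidate height and dark_th to 0.2;
    both values are illustrative.
    """
    top, left, h, w = box
    region_h = h if region_h is None else region_h
    if color == "red":
        rows = range(top + h, min(top + h + region_h, len(gray)))
    elif color == "green":
        rows = range(max(top - region_h, 0), top)
    else:
        raise ValueError("color must be 'red' or 'green'")
    cols = range(left, min(left + w, len(gray[0])))
    # The candidate is kept as soon as one dark pixel is found.
    return any(gray[y][x] < dark_th for y in rows for x in cols)
```

A candidate for which this returns False would be rejected, which matches the rule that a lamp without any dark housing next to it is unlikely to be a traffic light.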

Validation of Localization: The localization results are validated with 201 images.

         Optimal               Validation
         Recall   Precision    Recall   Precision
Red      76%      89.5%        71.8%    87%
Green    85%      98.5%        83.3%    92.6%

The error is very high (Error = 33.7%). Why?

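2. Classification: The TLC is the broadest. The TLC has the smallest distance to the top of the image. No other traffic light has a similar height to the TLC.
Notes: The preceding identification step found many regions that might contain a traffic light. Classification picks, from these candidates, the region that actually contains the traffic light. The classification filter mainly relies on these three conditions to find it.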

Performance of Classification (red and green lights): recall 86.3%; precision 97.4% and 98.1%.

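3. Video Analysis (1/2): Temporary occlusion; falsified colors; contradictory scene; repeating results.
Notes: Why add video analysis? Sometimes a vehicle blocks the light, but in a video the vehicle soon moves away, so the light can be detected again in the following frames. Sometimes the camera's automatic adjustment distorts the colors or brightness of the picture; with video, moving the camera slightly gives it a chance to readjust. Sometimes several traffic lights appear, and changing the angle may make the scene clearer. A correct result usually repeats for some time, which further confirms that it is right.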

3. Video Analysis (2/2): Find the motion vector between two frames. Use the KLT tracker to track feature points. Search only in a small area around the crucial traffic light candidate (30 pixels in each direction). Correlate the features by using SAD. (Figure: crucial traffic light, candidate region, feature points, and search region in frames t_{i-1} and t_i.)
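The SAD correlation step can be illustrated with a brute-force search over the window around a feature point. This is only a sketch of the matching step: it does not implement the KLT tracker, and the patch size, the function names, and the list-of-rows image format are assumptions. Only the 30-pixel search radius and the use of SAD come from the slide.

```python
def sad(prev, curr, py, px, cy, cx, half):
    """Sum of absolute differences between two (2*half+1)^2 patches."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(prev[py + dy][px + dx] - curr[cy + dy][cx + dx])
    return total


def match_feature(prev, curr, point, half=1, radius=30):
    """Find the displacement (dy, dx) of a feature point between two frames.

    prev, curr: grayscale frames as 2D lists of equal size.
    point: (row, col) in prev, at least `half` pixels away from the border.
    radius: search range in each direction (30 pixels on the slide).
    ASSUMPTION: half=1 (a 3x3 patch) is an illustrative choice.
    """
    py, px = point
    h, w = len(curr), len(curr[0])
    best = None
    for cy in range(max(half, py - radius), min(h - half, py + radius + 1)):
        for cx in range(max(half, px - radius), min(w - half, px + radius + 1)):
            score = sad(prev, curr, py, px, cy, cx, half)
            if best is None or score < best[0]:
                best = (score, cy - py, cx - px)
    return best[1], best[2]
```

Restricting the search to a small window keeps the cost bounded by the window size rather than the frame size, which matters on a device with limited computing power.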

4. Time-Based Verification: Reduce false positive detections by comparing the two kinds of results. Use a state queue with four scenarios: (1) identification and video analysis are both successful and the locations match; (2) identification and video analysis are both successful but the locations differ; (3) video analysis succeeds but identification fails; (4) video analysis fails but identification succeeds.

Experiment Results: SQ_size = 10 and SQ_min = 5. Compute at least 5 frames per second. At least SQ_min consecutive correct detections with the same color are required.
Notes: A switch of state completes within at most SQ_min frames.
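One way to read the state queue is as a bounded history of per-frame results in which a color is reported only after SQ_min consecutive identical detections. The sketch below follows that reading; the class and method names are hypothetical, and the slides do not say how the four verification scenarios feed the queue, so each frame's outcome is simply passed in as "red", "green", or None.

```python
from collections import deque


class StateQueue:
    """Bounded queue of per-frame detection results (illustrative sketch)."""

    def __init__(self, size=10, min_run=5):
        # size corresponds to SQ_size and min_run to SQ_min on the slide.
        self.states = deque(maxlen=size)
        self.min_run = min_run

    def push(self, color):
        """Record one frame's result: "red", "green", or None if unverified."""
        self.states.append(color)

    def confirmed(self):
        """Return a color only if the last min_run results all agree on it."""
        if len(self.states) < self.min_run:
            return None
        recent = list(self.states)[-self.min_run:]
        first = recent[0]
        if first is not None and all(s == first for s in recent):
            return first
        return None
```

With at least 5 frames per second and SQ_min = 5, a stable detection would be confirmed after about one second of agreeing frames under this reading.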

Experiment Results: Fig. 14.
Notes: Repeated confirmation through the state queue improves accuracy and prevents false positives.

Evaluations: Reliability. Prevent false positive green light detections.

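Evaluations: Interactivity. Temporal analysis reduces the interactivity. Feedback is normally given within 2 seconds.
Notes: Apart from sequence 01 and sequence 04, which behave unusually, the longest gap in the other sequences is no more than 38 frames. In sequence 01 no feedback is found even after waiting 40 to 80 seconds. That is a long delay, but at least the user is never given a wrong signal, which is a compromise worth accepting.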

Conclusion: The system can also be helpful for driver assistance systems. Computational power on mobile devices is limited. The verification ideas can be improved.