1 Incremental Detection of Text on Road Signs from Video Wen Wu Joint work with Xilin Chen and Jie Yang.

Presentation transcript:

1 Incremental Detection of Text on Road Signs from Video Wen Wu Joint work with Xilin Chen and Jie Yang

2 Acquire Text, Process Text [Diagram labels: Corpus, Web, Visual, Speech; Language (Text); NLP, Translation, IR/IE, Multimedia, Speech]

3 Text helps to understand images

4 Why are we interested in text on signs? Signs are everywhere in daily life: shop names, billboards, street names, etc.; Like other information devices, road signs are placed to convey information to people for different purposes; Text may be the most flexible way to express dynamic information. Why not make computers understand this text and use it to assist people?

5 Too many signs cause problems

6 It happened in Pittsburgh too!

7 Task Automatically detect text on road signs from video.

8 Related work

9 What lets us detect a sign?

10 What do you think?

11 Vertical plane property of signs

12 Divide-and-Conquer Strategy Decompose the original task into two sub-tasks: localization of road signs and detection of text; Propose an algorithm for each sub-task and integrate the two by mapping corresponding feature points; Use features not only from individual 2D images but also from the temporal dependency between them.
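The integration step on slide 12, mapping corresponding feature points, can be illustrated with a minimal Python sketch. The slides do not say how the mapping is computed, so everything here is an assumption: the function names `estimate_scale_and_shift` and `map_box` are hypothetical, and the motion model (one isotropic scale plus a translation, fitted from matched points) is a simplification chosen for illustration, not the authors' method.

```python
import math


def estimate_scale_and_shift(prev_pts, curr_pts):
    """Fit curr = scale * prev + (tx, ty) from matched feature points.

    Assumption: a sign approached by a forward-moving camera mostly grows
    and shifts in the image, so one scale and one translation are enough.
    """
    if len(prev_pts) != len(curr_pts) or len(prev_pts) < 2:
        raise ValueError("need at least two matched point pairs")
    n = len(prev_pts)
    # Centroids of the matched points in each frame.
    pcx = sum(x for x, _ in prev_pts) / n
    pcy = sum(y for _, y in prev_pts) / n
    ccx = sum(x for x, _ in curr_pts) / n
    ccy = sum(y for _, y in curr_pts) / n
    # Scale is the ratio of the point spreads around the centroids.
    prev_spread = sum(math.hypot(x - pcx, y - pcy) for x, y in prev_pts)
    curr_spread = sum(math.hypot(x - ccx, y - ccy) for x, y in curr_pts)
    if prev_spread == 0:
        raise ValueError("previous points are all identical")
    scale = curr_spread / prev_spread
    # Translation moves the scaled previous centroid onto the current one.
    return scale, ccx - scale * pcx, ccy - scale * pcy


def map_box(box, scale, tx, ty):
    """Carry a box (x0, y0, x1, y1) from the previous frame to the current one."""
    x0, y0, x1, y1 = box
    return (scale * x0 + tx, scale * y0 + ty,
            scale * x1 + tx, scale * y1 + ty)


# Four matched points that double in size and shift by (5, 3).
prev_pts = [(0, 0), (10, 0), (0, 10), (10, 10)]
curr_pts = [(5, 3), (25, 3), (5, 23), (25, 23)]
scale, tx, ty = estimate_scale_and_shift(prev_pts, curr_pts)
text_box = map_box((2, 2, 8, 6), scale, tx, ty)
```

A text box found once can then be carried forward by this cheap mapping instead of being detected again in every frame. A real system would also need to reject mismatched points, which this sketch does not attempt.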

13 Incremental Detection Framework

14 Why incremental? Computational requirements –Detection is a computationally expensive step; –In contrast, mapping corresponding points is a cheap step; Video resolution –Detection requires only low resolution; –OCR requires high resolution. [Timeline: Localize, Detect, Recognize over time]
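The cost argument on slide 14 can be sketched in Python as a scheduling loop: run the expensive detector only occasionally and carry its results through the frames in between with the cheap mapping step. This is an illustration under stated assumptions, not the authors' implementation. The function name `process_video`, the callables `detect` and `track`, and the fixed `redetect_every` interval are all hypothetical.

```python
def process_video(frames, detect, track, redetect_every=5):
    """Run `detect` sparingly and `track` on the frames in between.

    detect(frame) -> list of boxes             (assumed expensive)
    track(prev_frame, frame, boxes) -> boxes   (assumed cheap)
    Returns the per-frame boxes and the number of detector calls.
    """
    results = []
    boxes = []
    prev_frame = None
    detector_calls = 0
    for i, frame in enumerate(frames):
        # Re-detect on a fixed schedule, or whenever there is nothing to track.
        if not boxes or i % redetect_every == 0:
            boxes = detect(frame)
            detector_calls += 1
        else:
            boxes = track(prev_frame, frame, boxes)
        prev_frame = frame
        results.append(boxes)
    return results, detector_calls


# Stand-in callables so the sketch runs without any vision library.
def fake_detect(frame):
    return [(frame, frame, frame + 1, frame + 1)]


def fake_track(prev_frame, frame, boxes):
    return boxes  # pretend the sign did not move


results, calls = process_video(list(range(10)), fake_detect, fake_track)
```

With ten frames and `redetect_every=5`, the detector runs on frames 0 and 5 only, and the other eight frames are handled by the cheap step. The resolution point on the slide is not modeled here; it would correspond to running `detect` on a downsampled frame and reserving full resolution for OCR.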

15 System Implementation Prototype built on a PC with Intel Pentium 4 GHz and 1 GB of memory, running Windows XP; Data: 1) Captured by a DV camera mounted on a minivan. 2) Video frame size is 640×480. 3) The database includes about 3 hours of video captured under different conditions, i.e., in the morning, in the afternoon, and at dusk.

16 A Demo

17 Sequences of the Demo

18 Incremental vs. Non-incremental Another demo

19 Summary of Evaluation 22 video sequences with different driving situations; Vehicle's speed varies from 20 to 55 MPH; Testing data contain ~90 road signs and > 300 words.

Table 1. Sign localization performance. # of signs: …; Hit rate: …%; False hits: 17.9%.

Table 2. Text detection performance. Non-incremental: hit rate 80.2%, false hits 85.6%, speed 2-6 fps. Incremental: hit rate 88.9%, false hits 9.2%, speed 8-16 fps.
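The slides do not define "hit rate" or "false hits", so the following Python sketch shows only one plausible way such figures could be computed. It assumes that a detection counts as a hit when it overlaps an unmatched ground-truth box with intersection-over-union of at least 0.5, that hit rate is hits divided by ground-truth boxes, and that the false-hit rate is unmatched detections divided by all detections. The names `iou` and `score_detections` and the threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def score_detections(detections, ground_truth, iou_threshold=0.5):
    """Return (hit_rate, false_hit_rate) under the assumed definitions."""
    matched = set()
    hits = 0
    for det in detections:
        best_idx, best_iou = None, iou_threshold
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box can be hit only once
            overlap = iou(det, gt)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None:
            matched.add(best_idx)
            hits += 1
    false_hits = len(detections) - hits
    hit_rate = hits / len(ground_truth) if ground_truth else 0.0
    false_hit_rate = false_hits / len(detections) if detections else 0.0
    return hit_rate, false_hit_rate


ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(1, 1, 10, 10), (50, 50, 60, 60)]
hit_rate, false_hit_rate = score_detections(detections, ground_truth)
```

In this toy case one of two ground-truth boxes is found and one of two detections is spurious, so both rates come out at 0.5. The percentages in Tables 1 and 2 may rest on different definitions.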

20 Contributions Proposed a unified framework for automatically detecting text on road signs from video, based on the natural characteristics of the task; Exploited features for text detection not only from individual 2D images but also from the temporal dependency in video; Made a connection between understanding visual information and understanding language (text).

21 Conclusions & Future Work Automatic detection of text on road signs could be very useful in various applications; Experiments have shown that the new framework could significantly improve the robustness and efficiency of any existing text detection algorithm; Future work: Apply various language methods to text detected in video, e.g., translation, IR, etc.

22 Questions? Thank You