Igor Rosenberg. Summer internship: Creating a building detector. June 16th to September 15th at Dublin City University, Ireland. Supervisor: Alan Smeaton.

Presentation transcript:


2 Environment
- DCU: Dublin City University
- CDVP: Centre for Digital Video Processing (25 people)
- My lab: 1 professor, 3 postdocs, 5 PhD students

3 My activities
- little modules
- building detector
- visiting Ireland

4 Fischlar: video enhancement
Adding content to a video so that it can be used as search data, for example separating shots, extracting stories from news video, or finding text in the video.

5 Adding information to the MPEG7 descriptor
[Diagram: one MPEG1 video and its XML descriptor, enriched with manual annotations (e.g. "This is the shot where the sun sets.") and audio information]
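A minimal sketch of what attaching a manual annotation to one shot of an XML descriptor could look like. The element and attribute names (`Video`, `Shot`, `Annotation`, `source`) are hypothetical and simplified for illustration; they are not taken from the MPEG-7 schema or from the Fischlar descriptor, which the slides do not show.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified descriptor: element names are illustrative only,
# not the real MPEG-7 schema and not the Fischlar format.
DESCRIPTOR = """<Video>
  <Shot id="1" start="0.0" end="12.4"/>
  <Shot id="2" start="12.4" end="30.1"/>
</Video>"""


def annotate_shot(root, shot_id, text):
    """Attach a free-text annotation to the shot with the given id.

    Returns True if the shot was found, False otherwise.
    """
    for shot in root.iter("Shot"):
        if shot.get("id") == shot_id:
            note = ET.SubElement(shot, "Annotation", {"source": "manual"})
            note.text = text
            return True
    return False


root = ET.fromstring(DESCRIPTOR)
found = annotate_shot(root, "2", "This is the shot where the sun sets.")
print(ET.tostring(root, encoding="unicode"))
```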

6 Just to get back into coding
Values for the XML descriptor:
- Width & height of keyframes (read the extracted images)
- ASR (read the time stamps)
- Frame rate (read mpeg1)
Creation of thumbnails from the keyframes (changing size of images)

7 Closed captions
ASR (time stamps): 0.00 this, 0.10 is, 0.15 the, 0.22 time, 0.35 of, 0.41 red, 0.44 and, 0.50 Sean
CC (precise): This is the time of redemption
[Diagram: ASR and CC between two shot boundaries, linked to the XML descriptor; label "ENS ‘99"]

8 Closed captions: matching
ASR: w1 … w4 … w7 …
CC: … w1 … w4 … w7 …
Rules:
- order is preserved: x > y => match(x) > match(y)
- maximum number of matches
- loose word comparison: “tree” = match(“Trees”)

9 Closed captions: dynamic programming
M(Ua, Vb) = f( M(U, V), M(Ua, V), M(U, Vb) )
Example:
- M(U, V): “Time is up,” / “Time is cut”
- M(Ua, Vb): “Time is up, man!” / “Time is cut up”
- M(Ua, V): “Time is up, man!” / “Time is cut”
- M(U, Vb): “Time is up,” / “Time is cut up”
- “Man!” / “cut”: X
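The matching rules and the recurrence on the closed-caption slides resemble a longest-common-subsequence alignment, so here is a sketch under that assumption. The slides do not define f, so the choice below (add one when the last words match, otherwise take the better of the two shorter alignments) is an assumption, as is the loose comparison rule in `same_word`. The input data is the ASR/CC example from slide 7.

```python
def same_word(a, b):
    """Loose comparison (assumed rule): ignore case, punctuation, trailing 's'."""
    def norm(w):
        w = "".join(ch for ch in w.lower() if ch.isalnum())
        return w[:-1] if len(w) > 3 and w.endswith("s") else w
    return norm(a) == norm(b)


def align(asr, cc):
    """Return index pairs (i, j) matching asr[i] to cc[j], order preserved."""
    n, m = len(asr), len(cc)
    # M[i][j] = maximum number of matches between asr[:i] and cc[:j]
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if same_word(asr[i - 1], cc[j - 1]):
                M[i][j] = M[i - 1][j - 1] + 1
            else:
                M[i][j] = max(M[i - 1][j], M[i][j - 1])
    # Walk back through the table to recover which words were matched.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if same_word(asr[i - 1], cc[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif M[i - 1][j] >= M[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


# ASR words with time stamps and closed-caption words, as on slide 7.
asr = [(0.00, "this"), (0.10, "is"), (0.15, "the"), (0.22, "time"),
       (0.35, "of"), (0.41, "red"), (0.44, "and"), (0.50, "Sean")]
cc = "This is the time of redemption".split()

pairs = align([word for _, word in asr], cc)
# Transfer the ASR time stamp to each matched closed-caption word.
cc_times = {cc[j]: asr[i][0] for i, j in pairs}
print(cc_times)
```

In this example "redemption" stays unmatched, so it gets no time stamp; a real tool would presumably have to interpolate between its neighbours.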

10 Alignment
ASR is not aligned to the video time (slight offset, about ±30 sec).
Example: ASR 05:30.2 word, VIDEO 05:48.5 word
The ASR delay file is man-made, hence errors.
BUT TREC changed the guidelines: work thrown in the bin.

11 Research

12 Building detector
- Given an image, say whether a building can be seen
- Literature: 40% precision
- Use for TREC: one of the features to detect (landscape/cityscape?)

13 Ideas
- Region segmentation
- Dominant color
- Texture homogeneity
- Edge histogram
- Support Vector Machine to aggregate results
Extract possible building regions before anything else

14 Then evaluate each region
Values could describe:
- dominant color
- texture homogeneity
- measure of how straight the lines are…
Values = …

15 Finally sum up these values
Have to decide on a strategy:
- mean?
- highest score?
- values + importance in the image?
- Support Vector Machine (once trained, decides without heuristics)
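To make the candidate strategies concrete, here is a sketch of the three heuristic ones. The region representation (a `score` in [0, 1] and an `area` in pixels), the use of area as "importance in the image", and the 0.5 threshold are all assumptions for illustration; the slides do not say how regions or scores were encoded.

```python
def mean_score(regions):
    """Average of the per-region scores."""
    return sum(r["score"] for r in regions) / len(regions)


def highest_score(regions):
    """Score of the single most building-like region."""
    return max(r["score"] for r in regions)


def area_weighted_score(regions):
    """Scores weighted by region area (assumed proxy for importance)."""
    total = sum(r["area"] for r in regions)
    return sum(r["score"] * r["area"] for r in regions) / total


def has_building(regions, strategy, threshold=0.5):
    """Boolean decision: True if the aggregated score reaches the threshold."""
    return bool(regions) and strategy(regions) >= threshold


# Toy input: one small, building-like region and one large, unlikely one.
regions = [{"score": 0.9, "area": 100}, {"score": 0.2, "area": 900}]
for strategy in (mean_score, highest_score, area_weighted_score):
    print(strategy.__name__, has_building(regions, strategy))
```

On this toy input the strategies disagree: the mean and the highest score say "building", the area-weighted score does not, which is why the choice of strategy matters.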

16 What my utility does
- Extracts regions from the image
- Examines the regions with different tools
- Sums up the results
- Returns a boolean

17 Tools not used
- Canny edge detector: Sobel is enough
- Fast Fourier Transform: time’s up
- Line kernel: doesn’t work
- Hough transform: works only in simple cases
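Since the slide says Sobel was enough, here is a small pure-Python sketch of the Sobel operator feeding a two-bin edge histogram. It is an illustration only: the grayscale list-of-lists image, the threshold of 100, and the two-bin histogram are assumptions, not the utility's actual code.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel(image):
    """Return (gx, gy) gradient grids; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    gx = [[0] * w for _ in range(h)]
    gy = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx[y][x] += SOBEL_X[dy][dx] * p
                    gy[y][x] += SOBEL_Y[dy][dx] * p
    return gx, gy


def edge_histogram(image, threshold=100):
    """Count strong edge pixels, split into vertical and horizontal edges."""
    gx, gy = sobel(image)
    hist = {"vertical": 0, "horizontal": 0}
    for row_x, row_y in zip(gx, gy):
        for a, b in zip(row_x, row_y):
            if abs(a) + abs(b) < threshold:
                continue  # too weak to count as an edge
            # A strong horizontal intensity change (gx) means a vertical edge.
            hist["vertical" if abs(a) >= abs(b) else "horizontal"] += 1
    return hist


# Toy image: dark left half, bright right half, i.e. one vertical edge.
step = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
print(edge_histogram(step))  # {'vertical': 6, 'horizontal': 0}
```

A region dominated by vertical and horizontal edges is one plausible cue for man-made structures, in the spirit of the "how straight the lines are" measure.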

18 What should be added
- better regional weighting
- better overall measure
- SVM

19 Results
- Number of images tested: 268
- Precision: 29.7%
- Recall: 7.46%
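For reference, precision and recall under their standard definitions can be computed as in the sketch below. The slides do not give the underlying counts, so the labels here are toy values chosen only to show the computation; they are not the experiment's data.

```python
def precision_recall(predicted, actual):
    """Standard precision and recall for boolean predictions.

    precision = TP / (TP + FP): how many reported buildings are real.
    recall    = TP / (TP + FN): how many real buildings are reported.
    """
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels (hypothetical): detector output vs. ground truth for 5 images.
predicted = [True, True, False, False, False]
actual = [True, False, True, True, False]
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A low recall with a higher precision, as in the reported results, would mean the detector rarely says "building" and misses most of the images that contain one.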

20 This research experience was cool…
- I met lots of people
- Learnt a lot about programming
- Practiced English

21 That’s it folks!
- Get yourselves a good supervisor
- Don’t go out on the second last week
- Don’t go to Ireland (filthy weather)
- Start early
Thank you!

22 Structure of Físchlár
[Diagram: Cocoon and TomCat, each labelled "configure" and "process"]