LYU 0102: XML for Interoperable Digital Video Library


Introduction
- In recent years, the use of multimedia information has increased rapidly
- New approach: DIGITAL VIDEO LIBRARY
- Automated video and audio indexing
- Navigation, visualization
- Search and retrieval
- Video segmentation and summarization

Video Information
- Integration of speech, language, and image processing
- Text processing
- Audio processing
- Image processing
- Video processing

Digital Video Library System Overview

Techniques to segment data

Techniques we may apply
- VOCR (video OCR)
- Scene changes
- Text processing
- Face detection
- Storage as XML

Techniques to be discussed
- VOCR
- Scene changes
- Storage and editing with XML

Video OCR for Digital News

Detection of Text Region
- A video news program comprises a huge number of frames
- Roughly detect the text region first
- Increases processing speed
- Reduces processing cost

Detection of Text Region
- A typical text region can be characterized as a horizontal rectangular structure
- With clustered sharp edges
- With regions of high contrast against the background
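The characteristics above suggest a simple rough detector: count sharp horizontal intensity jumps in each row and keep the bands of rows where such edges cluster. The sketch below is only an illustration under stated assumptions. The function names, the grayscale list-of-rows input, and both thresholds (`edge_threshold`, `min_edges`) are hypothetical and do not come from the slides.

```python
def edge_count_per_row(gray, edge_threshold):
    """Count neighbouring-pixel jumps of at least edge_threshold in each row."""
    return [
        sum(1 for a, b in zip(row, row[1:]) if abs(a - b) >= edge_threshold)
        for row in gray
    ]


def candidate_text_rows(gray, edge_threshold, min_edges):
    """Return (top, bottom) row bands, bottom exclusive, where sharp edges cluster.

    Assumption: a row belongs to a text band when it holds at least
    min_edges sharp edges. Both thresholds are illustrative parameters.
    """
    counts = edge_count_per_row(gray, edge_threshold)
    bands = []
    start = None
    for y, count in enumerate(counts):
        if count >= min_edges and start is None:
            start = y                      # a band of edge-rich rows begins
        elif count < min_edges and start is not None:
            bands.append((start, y))       # the band ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(counts)))
    return bands
```

A full detector would also need to bound each band horizontally to obtain the rectangle; this sketch only narrows the search to candidate rows.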

Image Enhancement
- Sub-pixel interpolation:
  - To magnify the text area
  - To increase the resolution of the caption
- Multi-frame integration:
  - Non-caption areas move between frames, while the caption stays relatively stable
  - To reduce the variability of the background
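Multi-frame integration can be sketched as a per-pixel combination of several frames that show the same caption. The slides do not say which combination is used; the example below assumes a per-pixel minimum, which suits bright captions on a changing background because stable bright pixels survive while the background settles to its darkest value. The function name and the grayscale list-of-rows input are hypothetical.

```python
def integrate_frames(frames):
    """Combine aligned grayscale frames by taking the per-pixel minimum.

    Assumptions: all frames have the same size, the caption is brighter
    than the background, and the caption does not move between frames.
    """
    return [
        [min(values) for values in zip(*rows)]   # same (x, y) across all frames
        for rows in zip(*frames)                 # same row index across all frames
    ]
```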

Character Segmentation
- Vertical projection profile
- Character segmentation
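A vertical projection profile counts the text pixels in each column of the binarized caption; gaps in the profile mark the boundaries between characters. The following is a minimal sketch, assuming a binary image stored as a list of rows (1 for text, 0 for background) and assuming characters are separated by completely empty columns. The function names are hypothetical.

```python
def vertical_projection(binary_image):
    """Return the number of text pixels (value 1) in each column."""
    return [sum(column) for column in zip(*binary_image)]


def segment_characters(binary_image):
    """Split a text line into (left, right) column ranges, right exclusive.

    Assumption: neighbouring characters are separated by at least one
    column that contains no text pixels.
    """
    profile = vertical_projection(binary_image)
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                       # first column of a character
        elif count == 0 and start is not None:
            segments.append((start, x))     # an empty column closes the character
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments
```

Touching or overlapping characters would not be separated by this rule and would need an additional criterion, such as splitting at local minima of the profile.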

Character Recognition
- Binarize the character image with a threshold
- Filter the binary image with a morphological filter
- Filter the character image with a connected component filter
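Two of these steps, thresholding and connected component filtering, can be sketched in a few lines. This is an illustration only: it assumes bright text on a dark background, 4-connectivity, and a hypothetical `min_size` rule that discards small components as noise. The morphological filter is omitted, and none of the names or parameters come from the slides.

```python
from collections import deque


def binarize(gray, threshold):
    """Map pixels at or above threshold to 1 (text) and the rest to 0."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray]


def remove_small_components(binary, min_size):
    """Clear 4-connected components that contain fewer than min_size pixels."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    result = [row[:] for row in binary]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) < min_size:
                for cy, cx in component:
                    result[cy][cx] = 0      # treat the small blob as noise
    return result
```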

Post-Processing
- Further improve the recognition rate
  1. Use dictionary words to refine the recognized characters
  2. Integrate the recognition results of multiple frames
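Both post-processing ideas can be illustrated with the Python standard library. The sketch assumes that integrating results across frames means a majority vote over the strings recognized in each frame, and that dictionary refinement means replacing a word with its closest dictionary entry. Both are assumptions, as are the function names and the `cutoff` value; the slides do not specify the actual rules.

```python
import difflib
from collections import Counter


def vote_across_frames(readings):
    """Return the most frequent recognition result among the frames."""
    return Counter(readings).most_common(1)[0][0]


def refine_with_dictionary(word, dictionary, cutoff=0.6):
    """Replace word with its closest dictionary entry, if one is similar enough.

    Similarity is difflib's sequence ratio; the word is returned unchanged
    when no entry reaches the (illustrative) cutoff.
    """
    matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word
```

For example, `refine_with_dictionary(vote_across_frames(readings), dictionary)` applies the vote first and the dictionary lookup second; the reverse order is equally plausible.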

Scene change
- Detection technique
- Effective method for segmenting a video sequence into significant components

Existing Methods
- Image difference method
- Histogram difference method
- Histogram difference method using the DC coefficient image
Our Method
- Histogram difference method with a dynamic threshold

Scene change
- Grab a frame from the video every 0.05 second
- Each grabbed frame is a 24-bit image, with 8 bits for each color (red R, green G, blue B)
- For each pixel, keep the 2 most significant bits of each color
- Classify the pixels into 64 different classes (2 bits x 3 colors = 6 bits)
- Build a color histogram

Scene change
- Compare the histogram with that of the previous frame
- For each column of the histogram, calculate the difference
- Sum all the differences
- If (total difference) > threshold => scene change
- Use the first frame as the key frame
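The histogram comparison can be sketched as follows. Frames are assumed to be lists of (R, G, B) tuples with 8 bits per color, and the per-column difference is assumed to be an absolute difference. The slides mention a dynamic threshold but do not define it, so this sketch takes a fixed `threshold` parameter instead. All function names are hypothetical.

```python
def color_class(r, g, b):
    """Map a 24-bit pixel to one of 64 classes using the top 2 bits per color."""
    return ((r >> 6) << 4) | ((g >> 6) << 2) | (b >> 6)


def color_histogram(pixels):
    """Build the 64-column color histogram of one frame."""
    histogram = [0] * 64
    for r, g, b in pixels:
        histogram[color_class(r, g, b)] += 1
    return histogram


def histogram_difference(current, previous):
    """Sum the absolute per-column differences (assumed difference measure)."""
    return sum(abs(a - b) for a, b in zip(current, previous))


def detect_scene_changes(frames, threshold):
    """Return the indices of key frames, one per detected scene.

    The first frame of the video, and the first frame after each detected
    change, is taken as the key frame of its scene. The fixed threshold
    stands in for the dynamic threshold, which is not specified.
    """
    key_frames = []
    previous = None
    for index, frame in enumerate(frames):
        histogram = color_histogram(frame)
        if previous is None or histogram_difference(histogram, previous) > threshold:
            key_frames.append(index)
        previous = histogram
    return key_frames
```

Because every pixel falls into exactly one class, the total difference between two frames of N pixels ranges from 0 to 2N, which gives a natural scale for choosing the threshold.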

XML
- Extensible Markup Language
- Lets you create your own mark-up language for describing the contents
- Looks like a big database

Advantages of using XML
- Platform and system independent
- Create your own tags
- Adopts Unicode
- Universal format
- Easy to search

Design schema
- Start by choosing a vocabulary
- The vocabulary consists of words and phrases that can describe the extracted video information content and can therefore be used as tag names
- Show the relationships between vocabulary entries

XML Parser
- A parser is an interface between an XML document and the application program
- Document Object Model (DOM)

How to present XML
- The tree model becomes very similar to an XML schema
- The document is represented as nodes that show element/attribute names or the text content, and their relative places within the XML
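As an illustration of the DOM approach, the sketch below parses a small description of a video with Python's `xml.dom.minidom` and walks the resulting tree of element, attribute, and text nodes. The tag and attribute names (`video`, `scene`, `keyframe`, `caption`, `start`, `end`, `file`) and the sample values are hypothetical and are not the vocabulary of the actual project.

```python
from xml.dom import minidom

# Hypothetical vocabulary: one <scene> per detected scene change, with its
# key frame image and the caption text read by video OCR.
SAMPLE = """<video title="Evening News">
  <scene start="0.00" end="12.35">
    <keyframe file="scene001.jpg"/>
    <caption>Hong Kong</caption>
  </scene>
  <scene start="12.35" end="30.10">
    <keyframe file="scene002.jpg"/>
    <caption>Weather report</caption>
  </scene>
</video>"""


def outline(node, depth=0):
    """Return indented lines showing element names, attributes, and text."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        attributes = " ".join(
            f'{name}="{value}"' for name, value in sorted(node.attributes.items())
        )
        label = node.tagName + (" " + attributes if attributes else "")
        lines.append("  " * depth + label)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + "text: " + node.data.strip())
    return lines


def captions(document):
    """Collect the text content of every <caption> element."""
    return [element.firstChild.data
            for element in document.getElementsByTagName("caption")]


document = minidom.parseString(SAMPLE)
```

Calling `print("\n".join(outline(document.documentElement)))` prints the tree with one node per line, and `captions(document)` shows how a simple search over the stored information might look.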

OUR TOOL

COMING
- EXTRACT SECONDARY INFORMATION

THE END