2002.09.17 - SLIDE 1 IS 202 – FALL 2002 Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2002

SLIDE 1: IS 202 – FALL 2002
Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS
Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2002
SIMS 202: Information Organization and Retrieval
Lecture 07: Multimedia Information

SLIDE 2: Last Time
Review
– Dublin Core
– Other Metadata Systems
Controlled Vocabularies
Name Authority Files
– Choice of Names
– Form of Names
Other Types of Controlled Vocabularies
Faceted vs. Hierarchic Organization of Vocabularies

SLIDE 3: Hierarchical Classification
Each category is successively broken down into smaller and smaller subdivisions
No item occurs in more than one subdivision
Each level is divided out by a “character of division” (also known as a feature)
– Example: Distinguish “Literature” based on:
– Language
– Genre
– Time Period
Slide author: Marti Hearst

SLIDE 4: Hierarchical Classification
[Figure: classification tree. Literature branches by language (Spanish, French, English), then by genre (Drama, Poetry, Prose), then by century (16th, 17th, 18th, 19th, ...).]
Slide author: Marti Hearst

SLIDE 5: Labeled Categories for Hierarchical Classification
LITERATURE
– 100 English Literature
110 English Prose
– English Prose 16th Century
– English Prose 17th Century
– English Prose 18th Century
– English Poetry
– 121 English Poetry 16th Century
– 122 English Poetry 17th Century
– English Drama
– 130 English Drama 16th Century
– …
– 200 French Literature
Slide author: Marti Hearst

SLIDE 6: Faceted Classification
Create a separate, free-standing list for each characteristic or division (feature)
Combine features to create a classification
Slide author: Marti Hearst

SLIDE 7: Faceted Classification
A Language
– a English
– b French
– c Spanish
B Genre
– a Prose
– b Poetry
– c Drama
C Period
– a 16th Century
– b 17th Century
– c 18th Century
– d 19th Century
Aa English Literature
AaBa English Prose
AaBaCa English Prose 16th Century
AbBbCd French Poetry 19th Century
BcCd Drama 19th Century
Slide author: Marti Hearst
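The facet notation on this slide can be read mechanically: each pair of letters names a facet and a value within it. The following is a minimal sketch of that reading, not part of the original lecture. The facet tables are copied from the slide; the function names `decode` and `label` and the two-characters-per-facet parsing rule are assumptions made for illustration.

```python
# Facet tables copied from the slide. The parsing rule (one uppercase facet
# letter followed by one lowercase value letter) is an illustrative assumption.
FACETS = {
    "A": ("Language", {"a": "English", "b": "French", "c": "Spanish"}),
    "B": ("Genre", {"a": "Prose", "b": "Poetry", "c": "Drama"}),
    "C": ("Period", {"a": "16th Century", "b": "17th Century",
                     "c": "18th Century", "d": "19th Century"}),
}


def decode(code: str) -> dict:
    """Split a code such as 'AaBaCa' into facet/value pairs and look each one up."""
    if len(code) % 2 != 0:
        raise ValueError(f"malformed code: {code!r}")
    result = {}
    for i in range(0, len(code), 2):
        facet, value = code[i], code[i + 1]
        name, values = FACETS[facet]
        result[name] = values[value]
    return result


def label(code: str) -> str:
    """Join the decoded values in the order they appear in the code."""
    return " ".join(decode(code).values())


print(label("AaBaCa"))
print(decode("AbBbCd"))
```

Because every facet is an independent list, a code may skip facets entirely (a genre and a period with no language, for example), which a strict hierarchy cannot express without a dedicated branch.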

SLIDE 8: Today’s Lecture Goals
Overview of major concepts, issues, and challenges for multimedia information
Introduction to some of my research areas in digital media at SIMS
– Not a survey of existing systems
– Not an in-depth discussion of algorithms for multimedia indexing and retrieval
For more breadth and depth, talk to me and take “IS 246: Multimedia Information” next semester

SLIDE 9: Lecture 07: Multimedia Information
Problem Setting
Representing Media
Current Approaches
New Solutions
Methodological Considerations
Future Work

SLIDE 10: Lecture 07: Multimedia Information
Problem Setting
Representing Media
Current Approaches
New Solutions
Methodological Considerations
Future Work

SLIDE 11: Marc Davis Research
Creating technology and applications that will enable daily media consumers to become daily media producers
Research and teaching in the theory, design, and development of digital media systems for creating and using media metadata to automate media production and reuse

SLIDE 12: Global Media Network
Digital media produced anywhere by anyone, accessible to anyone anywhere
Today’s media users become tomorrow’s media producers
Not 500 channels, but 500,000,000 multimedia Web sites

SLIDE 13: What is the Problem?
Today people cannot easily create, find, edit, share, and reuse media
Computers don’t understand media content
– Media is opaque and data rich
– We lack structured representations
Without content representation (metadata), manipulating digital media will remain like word processing with bitmaps

SLIDE 14: Technology Goals
Goals
– Increase access to media content
– Decrease effort in media handling and reuse
– Improve usefulness of media content
Technology
– Create metadata about media content
– Use metadata to manipulate media

SLIDE 15: Types of Multimedia Data
1D
– Audio (speech, music, sound effects, etc.)
– MIDI
2D
– Photographs
– Graphics
3D
– Video (2D + Time)
– Animation (2D + Time)
– Computer graphic models
4D
– Computer graphic model animation (3D + Time)

SLIDE 16: Chang: Content-Based Media Analysis
“Traditional views of content-based technologies focus on search and retrieval – which is important but relatively narrow.”
“[…] emphasizing the end-to-end content chain and the many issues evolving around it. What’s the best way to integrate manual and automatic solutions in different parts of the chain?”

SLIDE 17: Media Production Chain
[Figure: chain of four stages: pre-production, production, post-production, distribution.]

SLIDE 18: Chang: Content-Based Media Analysis
Areas of research
– Reverse engineering of the media capturing and editing processes
– Extracting and matching objects
– Meaning decoding and automatic annotation
– Analysis and retrieval with user feedback
– Generating time-compressed skims

SLIDE 19: Chang: Content-Based Media Analysis
Impact criteria
– Generating metadata not available from production
– Providing metadata that humans aren’t good at generating
– Focusing on content with large volume and low individual value
– Adopting well-defined tasks and performance metrics

SLIDE 20: Lecture 07: Multimedia Information
Problem Setting
Representing Media
Current Approaches
New Solutions
Methodological Considerations
Future Work

SLIDE 21: Representing Video
Streams vs. Clips
Video syntax and semantics
Ontological issues in video representation

SLIDE 22: Video is Temporal

SLIDE 23: Streams vs. Clips

SLIDE 24: Stream-Based Representation
Makes annotation pay off
– The richer the annotation, the more numerous the possible segmentations of the video stream
Clips
– Change from being fixed segmentations of the video stream to being the results of retrieval queries based on annotations of the video stream
Annotations
– Create representations which make clips, not representations of clips
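One way to picture "clips as the results of retrieval queries" is to store annotations as time intervals on a single stream and compute clips on demand. The sketch below is an assumption-laden illustration, not the representation used in the lecture: the `Annotation` class, the `clips_for` function, and the descriptor strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    descriptor: str  # hypothetical descriptor vocabulary, e.g. "dog"
    start: float     # seconds from the start of the stream
    end: float


def clips_for(annotations, wanted):
    """Return (start, end) intervals during which every descriptor in `wanted` is active."""
    # Between two adjacent annotation boundaries the set of active descriptors is constant.
    points = sorted({a.start for a in annotations} | {a.end for a in annotations})
    clips = []
    for lo, hi in zip(points, points[1:]):
        active = {a.descriptor for a in annotations if a.start <= lo and a.end >= hi}
        if wanted <= active:
            if clips and clips[-1][1] == lo:
                # Extend the previous clip when the two intervals touch.
                clips[-1] = (clips[-1][0], hi)
            else:
                clips.append((lo, hi))
    return clips


stream = [
    Annotation("dog", 0, 10),
    Annotation("Steve", 4, 12),
    Annotation("biting", 6, 8),
]
print(clips_for(stream, {"dog", "Steve"}))
print(clips_for(stream, {"dog", "Steve", "biting"}))
```

The same three annotations yield different clips for different queries, which is the sense in which richer annotation multiplies the possible segmentations of one stream.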

SLIDE 25: Video Syntax and Semantics
The Kuleshov Effect
Video has a dual semantics
– Sequence-independent invariant semantics of shots
– Sequence-dependent variable semantics of shots

SLIDE 26: Ontological Issues for Video
Video plays with rules for identity and continuity
– Space
– Time
– Character
– Action

SLIDE 27: Space and Time: Actual vs. Inferable
Actual Recorded Space and Time
– GPS
– Studio space and time
Inferable Space and Time
– Establishing shots
– Cues and clues

SLIDE 28: Lecture 07: Multimedia Information
Problem Setting
Representing Media
Current Approaches
New Solutions
Methodological Considerations
Future Work

SLIDE 29: The Search for Solutions
Current approaches to creating metadata don’t work
– Signal-based analysis
– Keywords
– Natural language
Need standardized metadata framework
– Designed for video and rich media data
– Human and machine readable and writable
– Standardized and scalable
– Integrated into media capture, archiving, editing, distribution, and reuse

SLIDE 30: The Semantic Gap
“[…] the semantic gap between the rich meaning that users want when they query and browse media and the shallowness of the content descriptions that we can actually compute is weakening today’s automatic content-annotation systems.”
– Dorai and Venkatesh, “Computational Media Aesthetics: Finding Meaning Beautiful”

SLIDE 31: Signal-Based Parsing
Practical problem
– Parsing unstructured, unknown video is very, very hard
Theoretical problem
– Mismatch between percepts and concepts

SLIDE 32: Perceptual/Conceptual Issue
Similar Percepts / Dissimilar Concepts
[Figure labels: Clown Nose; Red Sun]

SLIDE 33: Perceptual/Conceptual Issue
Dissimilar Percepts / Similar Concepts
[Figure labels: Car; John Dillinger’s; Timothy McVeigh’s]

SLIDE 34: Signal-Based Parsing
Effective and useful automatic parsing
– Video: scene break detection, camera motion analysis, facial recognition, feature tracking, low-level visual similarity
– Audio: pause detection, audio pattern matching, simple speech recognition
Approaches to automated parsing
– At the point of capture, integrate the recording device, the environment, and agents in the environment into an interactive system
– After capture, use “human-in-the-loop” algorithms to leverage human and machine intelligence

SLIDE 35: Keywords vs. Semantic Descriptors
dog, biting, Steve

SLIDE 36: Keywords vs. Semantic Descriptors
dog, biting, Steve

SLIDE 37: Why Keywords Don’t Work
Are not a semantic representation
Do not describe relations between descriptors
Do not describe temporal structure
Do not converge
Do not scale

SLIDE 38: Natural Language vs. Visual Language
Jack, an adult male police officer, while walking to the left, starts waving with his left arm, and then has a puzzled look on his face as he turns his head to the right; he then drops his facial expression and stops turning his head, immediately looks up, and then stops looking up after he stops waving but before he stops walking.

SLIDE 39: Natural Language vs. Visual Language
Jack, an adult male police officer, while walking to the left, starts waving with his left arm, and then has a puzzled look on his face as he turns his head to the right; he then drops his facial expression and stops turning his head, immediately looks up, and then stops looking up after he stops waving but before he stops walking.

SLIDE 40: Notation for Time-Based Media: Music

SLIDE 41: Visual Language Advantages
A language designed as an accurate and readable representation of time-based media
– For video, especially important for actions, expressions, and spatial relations
Enables Gestalt view and quick recognition of descriptors due to designed visual similarities
Supports global use of annotations

SLIDE 42: Retrieving Video
Query:
– Retrieve a video segment of “a hammer hitting a nail into a piece of wood”
Sample results:
– Video of a hammer hitting a nail into a piece of wood
– Video of a hammer, a nail, and a piece of wood
– Video of a nail hitting a hammer, and a piece of wood
– Video of a sledgehammer hitting a spike into a railroad tie
– Video of a rock hitting a nail into a piece of wood
– Video of a hammer swinging
– Video of a nail in a piece of wood

SLIDE 43: Types of Video Similarity
Low-level numeric features
– Color
– Motion
– Blobs
Semantic
– Similarity of descriptors
Relational
– Similarity of relations among descriptors in compound descriptors
Temporal
– Similarity of temporal relations among descriptors and compound descriptors

SLIDE 44: Retrieval Examples to Think With
“Video of a hammer, a nail, and a piece of wood”
– Exact semantic and temporal similarity, but no relational similarity
“Video of a nail hitting a hammer, and a piece of wood”
– Exact semantic and temporal similarity, but incorrect relational similarity
“Video of a sledgehammer hitting a spike into a railroad tie”
– Approximate semantic similarity of the subject and objects of the action and exact semantic similarity of the action; and exact temporal and relational similarity
“Video of a hammer swinging” cut to “Video of a nail in a piece of wood”
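The distinction between semantic and relational similarity in these examples can be made concrete with a toy scoring function. This is a rough sketch under stated assumptions, not a method from the lecture: descriptions are hand-coded as a set of object descriptors plus a set of (subject, relation, object) triples, similarity is plain set overlap (Jaccard), and the names `jaccard`, `similarity`, and `QUERY` are hypothetical. Temporal similarity is not modeled, and matching is exact, so the "approximate" similarity of a sledgehammer to a hammer would need a thesaurus or ontology that this sketch does not include.

```python
def jaccard(a: set, b: set) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(query: dict, candidate: dict) -> dict:
    """Score descriptor overlap and relation overlap separately."""
    return {
        "semantic": jaccard(query["objects"], candidate["objects"]),
        "relational": jaccard(query["relations"], candidate["relations"]),
    }


# "a hammer hitting a nail into a piece of wood", coded by hand for illustration
QUERY = {
    "objects": {"hammer", "nail", "wood"},
    "relations": {("hammer", "hitting", "nail"), ("nail", "into", "wood")},
}

# "a hammer, a nail, and a piece of wood": same objects, no relations
objects_only = {"objects": {"hammer", "nail", "wood"}, "relations": set()}

# "a nail hitting a hammer, and a piece of wood": same objects, reversed relation
reversed_roles = {
    "objects": {"hammer", "nail", "wood"},
    "relations": {("nail", "hitting", "hammer")},
}

print(similarity(QUERY, objects_only))
print(similarity(QUERY, reversed_roles))
```

Both candidates score as a perfect match on objects and a complete miss on relations, which is exactly the case a keyword-only index cannot tell apart from the intended result.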

SLIDE 45: What is Retrieval For?
Redefine retrieval task as part of a larger user goal
– Using a recipe
– Getting to a location
– Making a video greeting
Smoliar: Rethinking information organization and retrieval
– Context
– Form
– Content

SLIDE 46: Lecture 07: Multimedia Information
Problem Setting
Representing Media
Current Approaches
New Solutions
Methodological Considerations
Future Work

SLIDE 47: New Solutions for Creating Metadata
After Capture
During Capture

SLIDE 48: Evolution of Media Production
Customized production
– Skilled creation of one media product
Mass production
– Automatic replication of one media product
Mass customization
– Skilled creation of adaptive media templates
– Automatic production of customized media

SLIDE 49: Editing Paradigm Has Not Changed

SLIDE 50: Central Idea: Movies as Programs
Movies change from being static data to programs
Shots are inputs to a program that computes new media based on content representation and functional dependency (US Patents 6,243,087 & 5,969,716)
[Figure labels: Parser; Producer; Media; Content Representation; Content Representation]
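The idea that a movie is a program whose inputs are shots can be illustrated with a very small template filler. This is only a sketch of the general idea and makes no claim about the method described in the cited patents: the `Shot` class, the `produce` function, the slot-matching rule, and all descriptor names are assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Shot:
    name: str
    descriptors: set = field(default_factory=set)  # hypothetical content representation


def produce(template, shots):
    """Fill each template slot with the first unused shot whose descriptors cover the slot."""
    used, movie = set(), []
    for slot in template:
        for shot in shots:
            if shot.name not in used and slot <= shot.descriptors:
                used.add(shot.name)
                movie.append(shot.name)
                break
        else:
            # No remaining shot satisfies this slot.
            raise LookupError(f"no shot satisfies slot {sorted(slot)}")
    return movie


# The template is fixed; the output depends on which shots are supplied.
template = [{"establishing"}, {"person"}, {"person"}]
shots = [
    Shot("s1", {"person", "wave"}),
    Shot("s2", {"establishing", "city"}),
    Shot("s3", {"person", "smile"}),
]
print(produce(template, shots))
```

Supplying a different set of annotated shots to the same template yields a different movie, which is one reading of "adaptive media templates" producing customized media.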

SLIDE 51: Automatic Video and Audio Editing
Automatically edit the output movie based on content representation of dialogue and sound
Example of editing based on dialogue
Example of synchronizing video to music

SLIDE 52: Automatic Audio-Video Synchronization
[Figure labels: Raw Celery Chopping Video; U2 “Numb” Audio; Unsynched Numb Celery Music Video; Synched Numb Celery Music Video]

SLIDE 53: Lecture 07: Multimedia Information
Problem Setting
Representing Media
Current Approaches
New Solutions
Methodological Considerations
Future Work

SLIDE 54: Computational Media
More intimately integrate two great 20th century inventions

SLIDE 55: Non-Technical Challenges
Standardization of media metadata (MPEG-7)
Broadband infrastructure and deployment
Intellectual property and economic models for sharing and reuse of media assets

SLIDE 56: Next Time
Metadata for Motion Pictures: Media Streams (MED)
Readings for next time (in Protected)
– “Media Streams: An Iconic Visual Language for Video Representation” (M. Davis)
– “Garage Cinema and the Future of Media Technology” (M. Davis)

SLIDE 57: Homework (!)
Do Readings
Assignment 3: Photo Metadata Design
– Due by Thursday, September 19