2004.11.18 - SLIDE 1 IS 202 – FALL 2004
SIMS 202: Information Organization and Retrieval
Lecture 23: Multimedia Information
Prof. Ray Larson & Prof. Marc Davis
UC Berkeley SIMS
Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2004
http://www.sims.berkeley.edu/academics/courses/is202/f04/

SLIDE 2: Today’s Agenda
Problem Setting
New Solutions
– Media Streams
– Active Capture
– Adaptive Media
Discussion Questions
Action Items for Next Time

SLIDE 4: Global Media Network
Digital media produced anywhere by anyone accessible to anyone anywhere
Today’s media users become tomorrow’s media producers
Not 500 channels, but 500,000,000 multimedia Web sources

SLIDE 5: Media Asset Management and Reuse
Media Asset Management
– Corporate: media companies, media archives, training, sales, catalogs, etc.
– Government: military, surveillance, law enforcement, etc.
– Academia: libraries, research, instruction, etc.
– Consumer: home video and photos, fan reuse of popular content, etc.

SLIDE 6: Applications of Analysis and Retrieval
Professional and educational applications
– Automated authoring of Web content
– Searching and browsing large video archives
– Easy access to educational materials
– Indexing and archiving multimedia presentations
– Indexing and archiving multimedia collaborative sessions
Consumer domain applications
– Video overview and access
– Video content filtering
– Enhanced access to broadcast video

SLIDE 7: The Media Opportunity
Vastly more media will be produced
Without ways to manage it (metadata creation and use) we lose the advantages of digital media
Most current approaches are insufficient and perhaps misguided
Great opportunity for innovation and invention
Need interdisciplinary approaches to the problem

SLIDE 8: What is the Problem?
Today people cannot easily find, edit, share, and reuse media
Computers don’t understand media content
– Media is opaque and data rich
– We lack structured representations
Without content representation (metadata), manipulating digital media will remain like word processing with bitmaps

SLIDE 9: Signal-to-Symbol Problems
Semantic Gap
– Gap between low-level signal analysis and high-level semantic descriptions
– “Vertical off-white rectangular blob on blue background” does not equal “Campanile at UC Berkeley”

SLIDE 10: Signal-to-Symbol Problems
Sensory Gap
– Gap between how an object appears and what it is
– Different images of the same object can appear dissimilar
– Images of different objects can appear similar

SLIDE 11: Computer Vision and Context
You go out drinking with your friends
You get drunk
Really drunk
You get hit over the head and pass out
You are flown to a city in a country you’ve never been to with a language you don’t understand and an alphabet you can’t read
You wake up face down in a gutter with a terrible hangover
You have no idea where you are or how you got there
This is what it’s like to be most computer vision systems: they have no context and no memory
Context and memory are what enable us to understand what we see

SLIDE 12: Disabling Assumptions
1. Media capture and media analysis are separated in time and space
– Therefore removed from their context of creation and the users who created them
2. Contextual metadata about the capture and use of media are not available to media analysis
– Therefore all analysis of media content must be focused on the media signal alone
3. Multimedia content analysis must be fully automatic
– Therefore missing out on the possibility of “human-in-the-loop” approaches to algorithm design and network effects of groups of users

SLIDE 13: Enabling Assumptions
1. Integrate media capture and analysis at the point of capture and throughout the media lifecycle
2. Leverage contextual metadata (spatial, temporal, social, etc.) about the capture and use of media content
3. Design systems that incorporate human beings as functional components and aggregate user behavior
– Human-in-the-loop algorithms
– Network effects of the aggregation and analysis of human activity and media use

SLIDE 14: Traditional Media Production Chain vs. Metadata-Centric Production Chain
[Diagram. Stage labels: PRE-PRODUCTION, PRODUCTION, POST-PRODUCTION, DISTRIBUTION. Additional label: METADATA]

SLIDE 15: Automated Media Production Process
[Diagram. Numbered components: 1 Active Capture, 2 Adaptive Media Engine, 3 Automatic Editing, 4 Personalized/Customized Delivery. Other labels: Asset Retrieval and Reuse; Annotation and Retrieval; Annotation of Media Assets; Reusable Online Asset Database; Web Integration and Streaming Media Services; Flash Generator; MMS; XHTML; Print/Physical Media]

SLIDE 16: Chang: Content-Based Media Analysis
“Traditional views of content-based technologies focus on search and retrieval – which is important but relatively narrow.”
“[…] emphasizing the end-to-end content chain and the many issues evolving around it. What’s the best way to integrate manual and automatic solutions in different parts of the chain?”

SLIDE 17: Chang: Content-Based Media Technology
Practical impact criteria for evaluating multimedia research directions
– Generating metadata not available from production
– Providing metadata that humans aren’t good at generating
– Focusing on content with large volume and low individual value
– Adopting well-defined tasks and performance metrics

SLIDE 18: Chang: Content-Based Media Technology
Areas of research
– Reverse engineering of the media capturing and editing processes
– Extracting and matching objects
– Meaning decoding and automatic annotation
– Analysis and retrieval with user feedback
– Generating time-compressed skims
– Efficient indexing for large databases
– Content adaptation for accessing multimedia over heterogeneous devices
– Standards for specifying content description languages and schemes, like MPEG-7

SLIDE 19: Computational Media Aesthetics
“[…] the algorithmic study of a variety of image and aural elements in media (based on their use in film grammar). It is also the computational analysis of the principles that have emerged underlying their manipulation in the creative art of clarifying, intensifying, and interpreting an event for an audience.”
“Our research systematically uses film grammar to inspire and underpin an automated process of analyzing, characterizing, and structuring professionally produced videos.”

SLIDE 20: CMA Challenges
Can we dynamically detect successful aesthetic principles with accuracy and consistency using computational analysis?
Can we build new postproduction tools based on this analysis for rapid, cost-efficient, and effective moviemaking and consistent evaluation?
How can we use these successful audio–visual strategies for improved training and education in mass communication?
How do we raise the quality of media annotation and improve the usability of content-based video search and retrieval systems?

SLIDE 22: Garage Cinema Research
Research and develop technology and applications that will enable daily media consumers to become daily media producers
Theory, design, and development of digital media systems that
– Create descriptions of media content and structure (metadata)
– Use metadata to automate media production and reuse

SLIDE 23: Research Projects
Media Streams
– A framework for creating metadata throughout the media production cycle to enable media reuse
Active Capture
– Automates direction and cinematography using real-time audio-video analysis in an interactive control loop to create reusable media assets
Adaptive Media
– Uses adaptive media templates and automatic editing functions to mass customize and personalize media
Mobile Media Metadata
– Leverages the spatio-temporal context and social community of media capture to automate metadata creation for mobile media
Social Uses of Personal Media
– Analysis of social uses of media to predict future uses and shape the design of next-generation personal media devices and applications

SLIDE 26: Media Metadata: Media Streams

SLIDE 27: Media Streams Features
Key features
– Stream-based representation (better segmentation)
– Semantic indexing (what things are similar to)
– Relational indexing (who is doing what to whom)
– Temporal indexing (when things happen)
– Iconic interface (designed visual language)
– Universal annotation (standardized markup schema)
Key benefits
– More accurate annotation and retrieval
– Global usability and standardization
– Reuse of rich media according to content and structure
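
To make the stream-based and temporal indexing ideas concrete, here is a minimal Python sketch. It is not the Media Streams implementation; the `Annotation` class, the stream names, and the labels are all hypothetical. It only illustrates one plausible reading of the slide: annotations live on independent streams as time intervals, and retrieval asks where two descriptions hold at the same time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    stream: str   # hypothetical stream name, e.g. "character" or "action"
    label: str    # hypothetical descriptor attached to the interval
    start: float  # seconds from the start of the video
    end: float    # seconds, exclusive


def overlap(a, b):
    """Return the (start, end) span shared by two annotations, or None."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start < end else None


def co_occurrences(annotations, label_a, label_b):
    """Find every time span in which both labels are active at once."""
    first = [a for a in annotations if a.label == label_a]
    second = [a for a in annotations if a.label == label_b]
    spans = [overlap(a, b) for a in first for b in second]
    return sorted(s for s in spans if s is not None)


# Illustrative data only: each stream is segmented independently.
annotations = [
    Annotation("character", "adult male", 0.0, 12.0),
    Annotation("action", "walking", 3.0, 8.0),
    Annotation("action", "waving", 10.0, 15.0),
    Annotation("location", "street", 0.0, 20.0),
]

print(co_occurrences(annotations, "adult male", "walking"))  # [(3.0, 8.0)]
```

Because each stream carries its own segment boundaries, no single shot boundary has to be chosen in advance; the segment of interest is computed at query time.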

SLIDE 30: Creating Metadata During Capture
Current Capture Paradigm: Multiple Captures To Get 1 Good Capture
New Capture Paradigm: 1 Good Capture Drives Multiple Uses

SLIDE 31: Active Capture
[Diagram. Active Capture at the intersection of Capture, Interaction, and Processing. Associated fields: Direction/Cinematography, Human Computer Interaction, Computer Vision/Audition]

SLIDE 32: Active Capture
Active engagement and communication among the capture device, agent(s), and the environment
Re-envision capture as a control system with feedback
Use multiple data sources and communication to simplify the capture scenario
Use HCI to support “human-in-the-loop” algorithms for computer vision and audition

SLIDE 33: Human-In-The-Loop Algorithms
Leverage what humans and computers are respectively good at
– Example: Object recognition and tracking
Leverage interaction with the situated human agent
– Examples:
  Activity recognition (Jump detector with “Simon Says” interaction)
  Object recognition (Car finder with “Treasure Hunt” interaction)
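
The idea of capture as a control system with feedback can be sketched in a few lines of Python. This is an illustrative toy, not the Active Capture system: the threshold detector, the prompt wording, and the simulated height samples are all assumptions. The point is the loop itself, in which the system prompts the person, observes the result, and prompts again until a simple detector is satisfied.

```python
def detect_jump(heights, threshold=0.15):
    """Toy detector: a jump is any sample that rises above the
    first (standing) sample by at least `threshold` metres."""
    baseline = heights[0]
    return max(heights) - baseline >= threshold


def active_capture(takes, max_attempts=3):
    """Prompt, observe, and re-prompt until a usable take is captured.

    `takes` stands in for successive sensor recordings; a real system
    would read them from a camera after each prompt.
    Returns (accepted attempt number or None, list of prompts issued).
    """
    prompts = []
    for attempt, heights in enumerate(takes[:max_attempts], start=1):
        prompts.append("Simon says: jump!" if attempt == 1
                       else "Simon says: jump higher!")
        if detect_jump(heights):
            return attempt, prompts
    return None, prompts


# Simulated head-height samples (metres) for two takes.
takes = [
    [1.00, 1.02, 1.01],  # subject barely moves
    [1.00, 1.25, 1.05],  # clear jump
]
accepted, prompts = active_capture(takes)
print(accepted)  # 2
print(prompts)   # ['Simon says: jump!', 'Simon says: jump higher!']
```

The interaction does part of the recognition work: because the person has been asked to jump, the detector only has to confirm a jump rather than classify arbitrary activity.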

SLIDE 34: Active Capture Setup

SLIDE 35: Active Capture

SLIDE 36: Active Capture: Reusable Shots

SLIDE 39: Marc Davis in T2 Trailer

SLIDE 40: Evolution of Media Production
Customized production
– Skilled creation of one media product
Mass production
– Automatic replication of one media product
Mass customization
– Skilled creation of adaptive media templates
– Automatic production of customized media

SLIDE 41: Editing Paradigm Has Not Changed

SLIDE 42: Computational Media
More intimately integrate two great 20th century inventions

SLIDE 43: Central Idea: Movies as Programs
Movies change from being static data to programs
Shots are inputs to a program that computes new media based on content representation and functional dependency (US Patents 6,243,087 & 5,969,716)
[Diagram. Labels: Media, Parser, Content Representation, Producer]
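
A small Python sketch may help show what "movies as programs" could mean in practice. It is an illustration under stated assumptions, not the patented method: the `Shot` fields, the clip names, and the selection rule (take the shortest matching shot) are hypothetical. What it shows is that the movie is a function whose inputs are annotated shots, so different inputs yield different output movies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    clip: str        # hypothetical file name
    subject: str     # who appears in the shot
    action: str      # what the subject does
    duration: float  # seconds


def movie_program(shots, subject, storyline):
    """Compute an edit list from content representation.

    For each action in `storyline`, pick the shortest available shot
    of `subject` performing that action. The choice rule is arbitrary;
    the point is that the edit depends on metadata, not on fixed clips.
    """
    edit = []
    for action in storyline:
        candidates = [s for s in shots
                      if s.subject == subject and s.action == action]
        if not candidates:
            raise LookupError(f"no shot of {subject!r} performing {action!r}")
        edit.append(min(candidates, key=lambda s: s.duration))
    return edit


shots = [
    Shot("ann_scream_1.mov", "ann", "scream", 2.5),
    Shot("ann_scream_2.mov", "ann", "scream", 1.8),
    Shot("ann_turn.mov", "ann", "turn head", 1.2),
    Shot("bob_scream.mov", "bob", "scream", 2.0),
    Shot("bob_turn.mov", "bob", "turn head", 0.9),
]
storyline = ["turn head", "scream"]

print([s.clip for s in movie_program(shots, "ann", storyline)])
print([s.clip for s in movie_program(shots, "bob", storyline)])
```

The same program produces a different movie for each subject, which is the behavior a static edit decision list cannot offer.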

SLIDE 44: Automatic Video and Audio Editing
Automatically edit the output movie based on content representation of dialogue and sound
Example of editing based on dialogue
Example of synchronizing video to music

SLIDE 45: 1-Shot/2-Shot/Cutaway; L-Cutting

SLIDE 46: Automatic Audio-Video Synchronization
[Video examples: Raw Celery Chopping Video; U2 “Numb” Audio; Unsynched Numb Celery Music Video; Synched Numb Celery Music Video]
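
One simple way to synchronize visual events to music is to retime the video between events so that each event lands on a beat. The Python sketch below shows that approach; it is an assumption made for illustration, since the slides do not describe the algorithm actually used. The event times, the tempo, and the function names are hypothetical.

```python
def beat_times(bpm, count, offset=0.0):
    """Return `count` evenly spaced beat times (seconds) for a tempo."""
    return [offset + i * 60.0 / bpm for i in range(count)]


def retime_segments(events, beats):
    """Return one playback-speed factor per video segment.

    Segment i runs from events[i] to events[i + 1]. Playing it at the
    returned speed makes it last exactly as long as the gap between
    beats[i] and beats[i + 1], so event i coincides with beat i.
    A factor above 1 speeds the video up; below 1 slows it down.
    """
    if len(events) > len(beats):
        raise ValueError("need at least one beat per event")
    speeds = []
    for i in range(len(events) - 1):
        video_len = events[i + 1] - events[i]
        music_len = beats[i + 1] - beats[i]
        speeds.append(video_len / music_len)
    return speeds


# Hypothetical knife-hit times (seconds) detected in the raw video.
chops = [0.0, 0.4, 1.0, 1.25]
# A steady 120 beats per minute gives one beat every 0.5 seconds.
beats = beat_times(bpm=120, count=4)

speeds = retime_segments(chops, beats)
print(speeds)
```

In a real system the event and beat times would come from audio and video analysis, which is where content representation enters.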

SLIDE 47: Adaptive Media Design Space
[Diagram. Axes: Content and Structure, each either Author-Generated or Not Author-Generated. Regions labeled: Traditional Movie Making; Compilation Movie Making; Historical Documentary Movie Making]

SLIDE 48: Adaptive Media Design Space
[Diagram. Axes: Content and Structure, each either Author-Generated or Not Author-Generated. Regions labeled: Traditional Movie Making; Compilation Movie Making; Historical Documentary Movie Making; Video MadLibs (structure is determined); Video Lego (structure is constrained)]

SLIDE 49: The Blank Page Approach

SLIDE 50: Captain Zoom IV MadLib™

SLIDE 51: Constructing With Lego™ Blocks

SLIDE 52: Video MadLibs and Video Lego
Video MadLibs
– Adaptive media template with open slots
– Structure is fixed
– Content can be varied
Video Lego
– Reusable media components that know how to fit together
– Structure is constrained
– Content can be varied
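
The difference between a fixed structure and a constrained one can be sketched in Python. The slot names, clip names, and the entry/exit state model below are hypothetical and chosen only for illustration. They show a template whose slot order never changes (Video MadLibs) next to components that may be ordered freely as long as adjacent pieces fit (Video Lego).

```python
def fill_madlib(template, choices):
    """Video MadLibs: the slot order is fixed by the template;
    only the content placed in each slot varies."""
    missing = [slot for slot in template if slot not in choices]
    if missing:
        raise KeyError(f"unfilled slots: {missing}")
    return [choices[slot] for slot in template]


def fits(sequence, components):
    """Video Lego: any ordering is allowed as long as each component's
    exit state matches the entry state of the component that follows."""
    pairs = zip(sequence, sequence[1:])
    return all(components[a]["exit"] == components[b]["entry"]
               for a, b in pairs)


# A fixed three-slot template with user-supplied content.
template = ["establishing", "hero_reaction", "closing"]
choices = {
    "establishing": "city_skyline.mov",
    "hero_reaction": "ann_scream_2.mov",
    "closing": "title_card.mov",
}
print(fill_madlib(template, choices))

# Components that carry their own connection constraints.
components = {
    "walk_in": {"entry": "outside", "exit": "inside"},
    "sit_down": {"entry": "inside", "exit": "seated"},
    "stand_up": {"entry": "seated", "exit": "inside"},
}
print(fits(["walk_in", "sit_down", "stand_up"], components))  # True
print(fits(["walk_in", "stand_up"], components))              # False
```

In the first case the author decides the structure once; in the second the structure emerges from the components, within the limits their constraints allow.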

SLIDE 54: Discussion Questions (Chang)
Jen King on “The Holy Grail of Content-Based Media Analysis”
– Chang mentions three projects his lab has been working on:
  Live sports video filtering
  Medical video indexing and summarizing
  Computational parsing and skimming of films
– What types of consumer-focused applications would benefit from content-based media analysis?

SLIDE 55: Discussion Questions (Chang)
Jen King on “The Holy Grail of Content-Based Media Analysis”
– One of the impact criteria Chang mentions is “focusing on content with large volume and low individual value,” such as home/family videos. What value is there to be gained by annotating millions of hours of weddings, graduations, and birthday parties?

SLIDE 56: Discussion Questions (CMA)
Tim Dennis on “Computational Media Aesthetics”
– Dorai and Venkatesh propose a "computational media aesthetics" and its potential use of film grammar to create future tools that will allow mass adoption of "successful techniques" of media production. Is it possible to use film grammar (pacing, tempo, lighting) to identify "successful aesthetic" principles?

SLIDE 57: Discussion Questions (CMA)
Tim Dennis on “Computational Media Aesthetics”
– Dorai and Venkatesh describe creating a framework for computationally determined elements based on "basic devices" of film grammar (shot, motion, recording distances, and practices) and use these primitive features to build up higher-order semantics based on production knowledge and film grammar. Will the notion of production knowledge and film grammar be something that is culturally situated? Will there be a different film grammar for each film production milieu, e.g., Bollywood, Hollywood, etc.?

SLIDE 58: Discussion Questions (Davis)
Andrew Iskandar on “Editing Out Video Editing”
– The article mentions briefly the concept of “Video Lego” where a “set of reusable media components” will “know how to fit together.” This shifts the idea of computational video creation from merely switching around paradigmatic media elements in a fixed syntagmatic structure (Video Mad Libs) to creation of syntagmatic structure. What issues and challenges do you foresee in this idea? Will ‘Video Lego’ ever ‘know’ enough to make video media productions?

SLIDE 59: Discussion Questions (Davis)
Andrew Iskandar on “Editing Out Video Editing”
– Active Capture is used to create different paradigmatic media elements outside of a particular context (e.g., screaming, turning of head, etc.). This creates the building blocks for computational media production. Does the fact that these media elements are created outside of a particular context present any problems and challenges? How can they be resolved?

SLIDE 60: Discussion Questions (Davis)
Andrew Iskandar on “Editing Out Video Editing”
– This concept of pulling content out of context so that video media elements can be reused has further applications. We’ve seen it in document engineering through the XML class. To what other areas, in media creation or elsewhere, can this idea of pulling content out of context to promote reusability apply? Graphic art? Music? Poetry? Why is it more challenging in certain fields than others?

SLIDE 62: Assignment 7
Metadata Consolidation
– Excel Phase
– RDF Phase
Taxonomy Items
– Ontology
MMMBase Items
– Facet Syntax
AnnotationBase Items
– Exposed in UI
Relations
– Semantics beyond subclasses

SLIDE 63: Assignment 7
Protégé
– Workshop Monday at 1:00 pm

SLIDE 64: Next Time
Metadata for Motion Pictures: Media Streams and MPEG-7
Readings for next time
– “Media Streams: An Iconic Visual Language for Video Representation” (Davis) Jennifer
– “MPEG-7 (Part 1)” (Martinez, Koenen, Pereira) JingHua
– “MPEG-7 (Part 2)” (Martinez, Koenen, Pereira) Sarita