DIVA - University of Fribourg - Switzerland Seminar presentation, jan. 2005 Lawrence Michel, MSc Student Portable Meeting Recorder.

Presentation transcript:

DIVA - University of Fribourg - Switzerland. Seminar presentation, Jan. 2005. Lawrence Michel, MSc student.
Portable Meeting Recorder: a multimodal meeting recorder solution designed by Ricoh (Dar-Shyang Lee, Berna Erol, Jamey Graham, Jonathan J. Hull, Norihiko Murata)

Concept 1/3
Intended goal: a methodology for full multimodal (A/V and metadata) recording and browsing of a meeting, under the strong constraints of minimal hardware intrusion, portability, and maximal data extraction capability.

Concept 2/3
The Portable Meeting Recorder system: hardware specifications
► "Minimally intrusive" A/V capture component: 4 microphones and one 360° video camera
► Meeting Recorder interface: touchscreen browsing
► A common PC for processing data

Concept 3/3
The Portable Meeting Recorder system: overview of the recording process
1° The computer records A/V
2° The computer processes the data
3° The computer creates metadata (XML), which is the input for the browser
4° The computer consolidates the data in a database

Computer Metadata processing 2/
STEP 1 - Recording data
[Diagram labels: Audio (4), Video]

Computer Metadata processing 3/
STEP 2 - Processing data
[Diagram labels: inputs Audio (4) and Video; processing blocks Sound localization and MPEG-2 compression; intermediate outputs Sound directions and MPEG-2 video (panoramic); analysis blocks View selection, Face extraction, Location recognition, Motion analysis, Audio activity]

Computer Metadata processing 4/
STEP 3 & 4 - Processing metadata and storing
[Diagram labels: analysis blocks View selection, Face extraction, Location recognition, Motion analysis, Audio activity; outputs Metadata (XML), MPEG-2 video (panoramic), Audio (manual transcription); destination Storage]
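To make the metadata step more concrete, here is a minimal sketch of how analysis results could be written out as XML for a browser to read. The slides do not give the actual schema, so every element and attribute name below (`meeting`, `video`, `timeline`, `event`, and so on) is a hypothetical placeholder, as are the sample values.

```python
import xml.etree.ElementTree as ET


def build_metadata(meeting_id, video_file, events):
    """Serialize analysis results for one meeting as an XML string.

    All element and attribute names are hypothetical; the real
    schema used by the recorder is not described in the slides.
    """
    root = ET.Element("meeting", id=meeting_id)
    ET.SubElement(root, "video", src=video_file)
    timeline = ET.SubElement(root, "timeline")
    for ev in events:
        ET.SubElement(
            timeline,
            "event",
            type=ev["type"],
            start=f"{ev['start']:.1f}",  # seconds from start of recording
            end=f"{ev['end']:.1f}",
            value=str(ev["value"]),
        )
    return ET.tostring(root, encoding="unicode")


# Made-up sample events, one per analysis module.
events = [
    {"type": "sound_direction", "start": 0.0, "end": 12.5, "value": 90},
    {"type": "visual_activity", "start": 12.5, "end": 14.0, "value": "high"},
]
xml_text = build_metadata("demo-meeting", "panoramic.mpg", events)
print(xml_text)
```

Keeping every analysis result as a timed event on one timeline is only one possible layout; it is chosen here because a browser can then align all modalities on a common time axis.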

Sound Localization
An interesting algorithm: 360° sound localization using 4 microphones.
[Figure: elevation computing and azimuth computing, with angles α and β]
► The method relies on the phase properties of the 4 input signals: it computes the differences between them and estimates the matching angle.
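The slides do not detail the Ricoh algorithm, so the following is only a sketch of a closely related textbook technique: time-difference-of-arrival, the time-domain counterpart of a phase difference. It assumes a far-field source and a hypothetical layout with one microphone pair on the x axis and one on the y axis, both with the same spacing. The function names and the layout are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def estimate_delay(a, b, max_lag):
    """Return the lag (in samples) by which signal b trails signal a,
    found by brute-force cross-correlation over [-max_lag, max_lag]."""
    best_lag, best_score = 0, float("-inf")
    n = len(a)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += a[i] * b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def direction_from_delays(delay_x, delay_y, spacing):
    """Convert the delays (seconds) of two orthogonal microphone pairs
    into (azimuth, elevation) in degrees, assuming a far-field source.

    delay_x is the arrival time at the -x microphone minus the arrival
    time at the +x microphone; delay_y is defined the same way.
    """
    ux = SPEED_OF_SOUND * delay_x / spacing
    uy = SPEED_OF_SOUND * delay_y / spacing
    azimuth = math.degrees(math.atan2(uy, ux)) % 360.0
    horizontal = min(1.0, math.hypot(ux, uy))  # clamp against noise
    elevation = math.degrees(math.acos(horizontal))
    return azimuth, elevation


# Delay estimation on a synthetic pulse: b is a copy of a, 3 samples late.
a = [0.0] * 20
a[5], a[6] = 1.0, 0.5
b = [0.0] * 3 + a[:-3]
lag = estimate_delay(a, b, max_lag=5)

# Angle recovery from delays generated for a known direction.
spacing = 0.2  # metres between the two microphones of a pair (assumed)
az_true, el_true = math.radians(60.0), math.radians(30.0)
dx = spacing * math.cos(el_true) * math.cos(az_true) / SPEED_OF_SOUND
dy = spacing * math.cos(el_true) * math.sin(az_true) / SPEED_OF_SOUND
azimuth, elevation = direction_from_delays(dx, dy, spacing)
print(lag, round(azimuth, 3), round(elevation, 3))
```

In practice the lag in samples is divided by the sampling rate to obtain the delay in seconds, which is one reason accuracy depends on the sampling rate.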

Sound Localization 2/
Properties
► The method runs in real time during meeting recording (30-40% CPU load on a 933 MHz PC).
► It extracts a maximum of data while requiring a minimum of hardware (which took some serious brainwork to design!).
► Accuracy depends heavily on several factors, such as room characteristics (e.g. reflective surfaces that cause strong signal reverberation), signal amplitude, speech overlap, particular angles, etc.
► Hardware dependency: accuracy depends strongly on the signal sampling rate, the sensitivity of the input devices, etc.
► These data are mainly needed for the view selection and face extraction processes.

Meeting Location Recognition 1/
Another interesting method: recognizing the meeting location with adaptive background modeling.
The process is as follows:
1° Analyze the frame by comparing its histogram with the template
2° Apply foreground extraction
3° Set the resulting background image as the newest template
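The three steps above can be sketched as follows, treating an image as a flat list of 8-bit grayscale values. Histogram intersection as the similarity measure, the fixed difference threshold for foreground extraction, and all function names are assumptions made for illustration; the slides do not say which measures the original system uses.

```python
def histogram(pixels, bins=8, max_value=256):
    """Normalized grayscale histogram of a flat list of pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(x, y) for x, y in zip(h1, h2))


def match_location(frame, templates):
    """Step 1: pick the template whose histogram is closest to the frame's."""
    h = histogram(frame)
    scores = {name: intersection(h, histogram(t)) for name, t in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def extract_background(frame, template, threshold=30):
    """Steps 2 and 3: pixels far from the template count as foreground and
    are replaced by the template value; the result is the new template."""
    return [f if abs(f - t) <= threshold else t for f, t in zip(frame, template)]


# Two made-up room templates and a frame of room_a with one foreground pixel (90).
templates = {
    "room_a": [10, 12, 200, 198, 10, 11, 205, 199],
    "room_b": [120, 118, 125, 130, 122, 119, 121, 127],
}
frame = [12, 14, 90, 198, 11, 13, 204, 200]

location, score = match_location(frame, templates)
templates[location] = extract_background(frame, templates[location])
print(location, score, templates[location])
```

Updating the template after every frame lets it follow slow changes such as lighting drift, which is the "adaptive" part of the modeling.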

Searching and Browsing with Visual and Audio Content
How are the audio files, video files and XML metadata exploited efficiently?

Searching and Browsing with Visual and Audio Content 1/
Introduction
Searching and browsing audiovisual information is a time-consuming task. In its current state of development, the Audio and Video Recorder cannot transcribe audio files automatically. Instead, searching and browsing within a meeting document is based on visual and audio content activity.

Searching and Browsing with Visual and Audio Content 2/
Visual activity analysis
In most meeting sequences there is very little motion most of the time. High-motion segments therefore tend to correspond to significant events.
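One simple way to find such high-motion segments is frame differencing, sketched below on frames given as flat lists of grayscale values. This is an illustrative stand-in and not necessarily the motion analysis used by the recorder; the threshold and function names are assumptions.

```python
def motion_scores(frames):
    """Mean absolute pixel difference between each pair of consecutive frames.

    scores[i] measures the change between frames[i] and frames[i + 1].
    """
    scores = []
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(c - p) for p, c in zip(prev, cur))
        scores.append(diff / len(cur))
    return scores


def high_motion_segments(scores, threshold):
    """Return (first, last) index pairs of runs where the score exceeds threshold."""
    segments = []
    start = None
    for i, s in enumerate(scores):
        if s > threshold and start is None:
            start = i
        elif s <= threshold and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(scores) - 1))
    return segments


# Five tiny made-up frames: something moves in the middle frame only.
frames = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 80, 90, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
scores = motion_scores(frames)
segments = high_motion_segments(scores, threshold=5.0)
print(scores, segments)
```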

Searching and Browsing with Visual and Audio Content 3/
Audio activity analysis
The system, which relies heavily on audio analysis, lets us navigate through the document in various ways, such as speaker segmentation using audio data.
► Efficiency drops when the audio-based tracking data are poor (because of speech overlap, hardware limitations, bad angle positioning, ...).
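As a rough illustration of speaker segmentation from audio data, the sketch below groups consecutive sound-direction estimates (in degrees, one per audio frame) into segments, on the assumption that each speaker stays at roughly one angle. The tolerance, the minimum segment length, and the function names are hypothetical; the slides do not describe the actual segmentation rule.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def segment_by_direction(directions, tolerance=20.0, min_frames=2):
    """Split a sequence of sound directions into speaker segments.

    A segment continues while each new direction stays within `tolerance`
    degrees of the segment's first direction. Segments shorter than
    `min_frames` are dropped as probable localization noise.
    Returns a list of (first_index, last_index, reference_direction).
    """
    segments = []
    start = 0
    for i in range(1, len(directions) + 1):
        ended = (
            i == len(directions)
            or angular_distance(directions[i], directions[start]) > tolerance
        )
        if ended:
            if i - start >= min_frames:
                segments.append((start, i - 1, directions[start]))
            start = i
    return segments


# Made-up direction track: one speaker near 10°, another near 180°,
# with a single outlier at 90° caused by a bad estimate.
directions = [10, 12, 8, 355, 180, 182, 179, 90, 181, 183]
turns = segment_by_direction(directions)
print(turns)
```

The outlier at index 7 splits the second speaker's turn in two, which shows how poor tracking data directly degrades the segmentation.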

Searching and Browsing with Visual and Audio Content 4/
[Image: screenshot of the Meeting Browser using the Muvie Client]

Searching and Browsing with Visual and Audio Content 5/
[Figure labels: Time, Speaker transitions, Visual activity, Audio activity, Key frames, Transcription]

Thank you