Carnegie Mellon. Carnegie Mellon Multimedia Michael Christel Alex Hauptmann Rong Jin (TA)


1 Carnegie Mellon

2 Carnegie Mellon Multimedia Michael Christel Alex Hauptmann Rong Jin (TA) http://www.cs.cmu.edu/~alex/mmCourse

3 Carnegie Mellon How to get in touch with us Mike Christel christel@cs.cmu.edu http://www.cs.cmu.edu/~christel (412)268-7799 or x8-7799 WeH5212 Alex Hauptmann alex@cs.cmu.edu http://www.cs.cmu.edu/~alex (412)268-1448 or x8-1448 WeH5124 – Office Hours by Appointment

4 Carnegie Mellon Teaching Assistant Rong Jin jin+@andrew.cmu.edu Office WeH5316 Office hours by appointment (412)268-4050 or x8-4050

5 Carnegie Mellon Course Outline, Part 1 of 3 More details at www.cs.cmu.edu/~alex/mmCourse October 22: Intro to Multimedia; October 25: Multimedia Enabling Technologies, Macromedia Flash Intro and Demo; October 29: Sound Processing, Speech Recognition; November 1: Digital Video Creation and Transmission; November 5: Speech Synthesis

6 Carnegie Mellon Course Outline, Part 2 of 3 More details at www.cs.cmu.edu/~alex/mmCourse November 8: Image Processing; November 12: Digital Music and Music Processing; November 15: Multimedia Internet Protocols, SMIL; November 19: Synthetic Interviews: A Multimedia Company (Experiences from the Field); November 22: Programming for Interactive Multimedia (CGI Scripts/ASP)

7 Carnegie Mellon Course Outline, Part 3 of 3 More details at www.cs.cmu.edu/~alex/mmCourse November 29: Content Analysis and Coding of Digital Audio and Video, Multimedia Storage and Retrieval Management; December 3: Video Retrieval Evaluation and Testing Multimedia Interface Design, Digital Libraries; December 6: Visual Design, Multimedia Interface Design Guidelines, Multimedia use in the future (Experience on Demand); December 10: Multimedia as Entertainment Technology, Virtual Reality

8 Carnegie Mellon

9 Carnegie Mellon Homeworks See http://www.cs.cmu.edu/~alex/mmCourse 9 Homeworks planned, 10 points each One hard homework will be worth 20 points No final, no midterm Publish homeworks on your web page - email us URL Space?

10 Carnegie Mellon Today: Intro to Multimedia Apple Knowledge Navigator Vision 1988

11 Audio, Images, Information Retrieval, Storage Systems, Networking, Psychology, HCI, Data Compression, Natural Language Processing, Multimedia, CPU Power, Video

12 Carnegie Mellon Definition of Multimedia Multi (Latin multus - numerous) Media, medium (Latin medius, medium: middle, center, intermediary; Latin mediat: intermediary, means) Multiple types of information captured, stored, manipulated, transmitted, and presented. Specifically: Images, Video, Audio (+Speech) and Text

13 Carnegie Mellon Definition of Multimodal Multi (Latin multus - numerous) Modal (Latin modus: manner) Traditionally refers to input/output formats: Input: sounds, speech (mike), gestures (camera, tablet), eye-gaze (camera), mouse, keyboard Output: sounds, speech, video, pictures, animations, text

14 Carnegie Mellon Perceived Information Physical Variables Sound is a waveform An image is a waveform light is electromagnetic radiation with different intensity in spatial coordinates color corresponds to wavelength

15 Carnegie Mellon History of Multimedia I Analog signals to sensors E.g. vinyl records Fidelity is faithfulness to the original Digital representation (‘60s) Sampling Quantizing Coding codec, modem, (A/D and D/A)

16 Carnegie Mellon Hardware Advances CPU Bus Network I/O Keyboard, Mouse Disk Mike + A/D Board Camera + A/D Board Speakers (+ D/A Board) Display

17 Carnegie Mellon History of Multimedia II Analog controls only Special hardware (Displays, Scanners, FFTs) Integrated hardware components Further Integration Other devices

18 Carnegie Mellon History of Multimedia III Limiting Factors: Storage Limits CPU Speeds I/O Speeds Network Bandwidth

19 Carnegie Mellon Why Digital? Universal storage, transmission format CD, internet Precision (Range of values, number of bits, floating point) Lossless transmission/storage BUT: a finite sampling rate can distort information; size requirements may be ‘large’ compared to analog
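To make the size requirement concrete, here is a minimal sketch of the arithmetic for uncompressed (PCM) audio. The function name `pcm_bytes` is hypothetical, and the CD parameters (44,100 samples per second, 16 bits per sample, 2 channels) are standard CD-audio values rather than figures taken from the slide.

```python
def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size in bytes of uncompressed PCM audio (hypothetical helper)."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


# One minute of CD-quality stereo audio.
cd_minute = pcm_bytes(44_100, 16, 2, 60)
print(cd_minute)  # 10584000 bytes, roughly 10 MB per minute
```

Doubling the sampling rate or the number of bits per sample doubles the storage cost, which is the trade-off behind the precision and size bullets above.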

20 Carnegie Mellon Digitization Process Sampling from an analog signal Sampling Errors relate to signal frequencies Quantization Errors
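The two error sources on this slide can be illustrated with a small sketch. It assumes a pure sine wave as the analog signal and a uniform quantizer over the range [-1, 1]; the function names `sample` and `quantize` are hypothetical. The first part shows a sampling error (aliasing): a 7 Hz tone sampled at only 10 Hz produces exactly the samples of an inverted 3 Hz tone, because 7 Hz lies above half the sampling rate. The second part shows that the quantization error of a uniform quantizer is bounded by half a quantization step.

```python
import math


def sample(freq_hz, rate_hz, n):
    """Take n samples of a unit-amplitude sine wave at the given sampling rate."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]


def quantize(x, bits):
    """Uniform quantizer for values in [-1, 1] using 2**bits levels."""
    step = 2.0 / (2 ** bits - 1)
    return round((x + 1.0) / step) * step - 1.0


# Sampling error: 7 Hz sampled at 10 Hz is indistinguishable from an inverted 3 Hz tone.
high = sample(7, 10, 20)
low = sample(3, 10, 20)
aliased = all(abs(h + l) < 1e-9 for h, l in zip(high, low))
print("7 Hz aliases to 3 Hz at a 10 Hz sampling rate:", aliased)

# Quantization error: with 8 bits the error never exceeds half a step.
signal = sample(3, 100, 100)
coded = [quantize(x, 8) for x in signal]
step = 2.0 / (2 ** 8 - 1)
max_err = max(abs(a - b) for a, b in zip(signal, coded))
print("max quantization error <= step / 2:", max_err <= step / 2 + 1e-12)
```

Sampling at more than twice the highest signal frequency avoids the first problem; adding bits per sample shrinks the second.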

21 Carnegie Mellon Text ASCII, Unicode Formatted Text, Rich Text Document Formats: –Structured: TeX, HTML –Page Descriptions: PostScript, PDF

22 Carnegie Mellon Graphics Objects –circles, splines, rectangles, lines Editable –resize, reshape, move, colorize Synthetic

23 Carnegie Mellon Images (Pictures) Fixed digitized representation –bitmap, colors per pixel Editable in limited ways –retouch, cut and paste, remap colors, filter [Photoshop tools] –no ‘model’ of the thing Captured –not just from real life, clip art, screen dump

24 Carnegie Mellon Audio Sounds –hear 15 Hz to 20 kHz –Speech is 50 Hz to 10 kHz Speech Recognition –It is hard to wreck a nice beach –Ice cream / I scream Synthesis –Speech –Music MIDI for 128 instruments, 47 percussion sounds Notes, timing
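MIDI stores notes as numbers rather than as sound. As a rough sketch of how a synthesizer could turn a note number into a pitch, the snippet below assumes the common equal-temperament convention in which note 69 is A4 at 440 Hz; that tuning convention and the function name `midi_to_hz` are assumptions, not something stated on the slide.

```python
def midi_to_hz(note):
    """Frequency of a MIDI note number, assuming A4 (note 69) = 440 Hz
    and twelve equal semitones per octave."""
    return 440.0 * 2 ** ((note - 69) / 12)


print(midi_to_hz(69))            # 440.0 (A4)
print(round(midi_to_hz(60), 2))  # 261.63 (middle C)
```

Every MIDI note in the range 0 to 127 maps to a frequency inside the 15 Hz to 20 kHz hearing range quoted above only partly: the lowest notes fall below it, which is one reason practical instruments use a narrower span.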

25 Carnegie Mellon Speech Recognition Issues Continuous vs Discrete Vocabulary Size Channel (Microphone) Environment (Location of mike and Speaker) Speaker Dependent/Speaker Independent Context (Language Model) Interactivity (Dialog Model)

26 Carnegie Mellon Acoustic Modeling Describes the sounds that make up speech Lexicon Describes which sequences of speech sounds make up valid words Language Model Describes the likelihood of various sequences of words being spoken Speech Recognition Speech Recognition Knowledge Sources
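The language model's role can be sketched with a toy bigram model. Everything here is illustrative: the three-sentence corpus is made up for the example, the function names `train` and `log_prob` are hypothetical, and add-one smoothing is just one simple choice. The point is only that two acoustically similar hypotheses, such as the "wreck a nice beach" example from the Audio slide, can be ranked by how likely their word sequences are.

```python
import math
from collections import Counter

# Tiny made-up training corpus (illustrative only).
corpus = [
    "it is hard to recognize speech",
    "it is easy to recognize speech",
    "we walk on a nice beach",
]


def train(sentences):
    """Count bigrams and their left-context words; collect the vocabulary."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for s in sentences:
        words = ["<s>"] + s.split()
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
        vocab.update(words[1:])
    return contexts, bigrams, vocab


def log_prob(sentence, contexts, bigrams, vocab):
    """Log-probability of a word sequence under an add-one-smoothed bigram model."""
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab)))
    return total


contexts, bigrams, vocab = train(corpus)
likely = log_prob("it is hard to recognize speech", contexts, bigrams, vocab)
unlikely = log_prob("it is hard to wreck a nice beach", contexts, bigrams, vocab)
print("prefers 'recognize speech':", likely > unlikely)
```

In a real recognizer this score would be combined with the acoustic model and lexicon scores rather than used on its own.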

27 Carnegie Mellon Speech Variations Style Variations careful, clear, articulated, formal, casual spontaneous, normal, read, dictated, intimate Voice Quality breathy, creaky, whispery, tense, lax, modal Context sport, professional, interview, free conversation, man-machine dialogue Speaking Rate normal, slow, fast, very fast Stress in noise, with increased vocal effort (Lombard reflex), emotional factors (e.g. angry), under cognitive load

28 Carnegie Mellon Video Frames comprise the video –Frame rate = number of frames per second (the delay between successive frames is 1 / frame rate) –minimal change between frames Sequencing creates the illusion of movement > 16 fps is “smooth” Standards: 29.97 is NTSC, 25 is PAL, 60 is HDTV Interlacing Display scan rate is different –monitor refresh rate –60 - 70 Hz (= 1/s)
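Frame rate and the delay between successive frames are reciprocals of each other. A minimal sketch of that conversion, using the NTSC and PAL rates from the slide (the function name `frame_interval_ms` is hypothetical):

```python
def frame_interval_ms(fps):
    """Delay between successive frames, in milliseconds, for a given frame rate."""
    return 1000.0 / fps


for name, fps in [("NTSC", 29.97), ("PAL", 25.0)]:
    print(f"{name}: {fps} fps -> {frame_interval_ms(fps):.2f} ms per frame")
# NTSC: 29.97 fps -> 33.37 ms per frame
# PAL: 25.0 fps -> 40.00 ms per frame
```

At the 16 fps threshold for "smooth" motion, the same formula gives 62.5 ms between frames.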

29 Carnegie Mellon Captured vs. Synthetic Animation vs Video Graphics vs Pictures Synthesizer vs Recording Storage? Manipulation? Processor Requirements? Fidelity to real world Hybrids are possible

30 Carnegie Mellon Why is Multimedia Important? Our society – captures its experience, – records its accomplishments, – portrays its past, – informs its masses … in pictures, audio and video For many, CNN has become the “publication of record” Multimedia learning leverages “multiple intelligences” (Gardner, 1993) Multimedia Digital libraries are an essential component of – formal, informal, and professional learning – distance education, telemedicine

31 Carnegie Mellon Technology Push vs Market Pull –Home Entertainment –Catalog Ordering –Multimedia Training, Education –Videoconferencing –Professional Video Services –Videomail –Speech Recognition

32 Carnegie Mellon Hype vs. Reality What is feasible, under what circumstances? What is possible? What is impossible? What is unlikely?

33 Carnegie Mellon Multimedia Visions DARPA: Dominate the Battle Space HP “1995” LSI “Flash Point” HP “Synergies”

34 Carnegie Mellon Intro to Multimedia That’s all for today

35 Carnegie Mellon

