Creating a Speech Enabled Avatar from a Single Photograph. Dmitri Bitouk, Shree K. Nayar, Columbia University

Speech Enabled Avatar. Input photograph.

Speech Enabled Avatar. Input photograph → Avatar.

Speech Enabled Avatar. Input photograph → Avatar. Applications: mobile messaging and video conferencing, news reporting and information kiosks, novel user interfaces.

Facial Motion Synthesis Challenges. Mapping phonemes to static mouth shapes produces unrealistic, jerky animations. Co-articulation: facial articulations can be dominated by the preceding as well as the upcoming phonemes. Asynchrony: facial motion may precede the corresponding sound.

Related Work. Avatars from video sequences: Bregler et al. 1997, Ezzat et al. 2002, etc. 2D avatars from photographs: Blanz et al. 2003, CrazyTalk™, MotionPortrait™.

Generic Facial Motion Model (Bitouk 2006). Facial motion parameters. Prototype Surface → Deformed Surface.
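The deformation can be pictured as a set of per-vertex displacement fields weighted by the facial motion parameters. A minimal sketch in Python; the array names and the linear-blend form are illustrative assumptions, not the model from the talk:

```python
import numpy as np

def deform_prototype(vertices, basis, params):
    """Deform the prototype surface with facial motion parameters.

    vertices : (N, 3) prototype mesh vertices
    basis    : (K, N, 3) per-vertex displacement fields, one per parameter
    params   : (K,) facial motion parameters

    The linear blend is an illustrative assumption only.
    """
    displacement = np.tensordot(params, basis, axes=1)  # sums over K -> (N, 3)
    return vertices + displacement
```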

Generic Facial Motion Model

Facial Motion Transfer (Bitouk 2006). Prototype Face → Novel Faces.
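One crude way to picture the transfer step is to resample the prototype's displacement fields at corresponding points on the novel face. A sketch under that assumption (nearest-vertex correspondence; the method in the talk is more principled):

```python
def transfer_motion_basis(prototype_basis, correspondence):
    """Map per-vertex displacement fields from the prototype face onto a
    novel face via a precomputed nearest-vertex index map.

    prototype_basis : (K, Np, 3) numpy array of displacement fields
    correspondence  : (Nn,) int array, novel vertex -> prototype vertex
    """
    return prototype_basis[:, correspondence, :]  # (K, Nn, 3)
```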

Phonemes: /B/, /K/, /AA/, /IY/, etc. With lexical stress: /B/, /K/, /AA0/, /AA1/, /IY0/, /IY1/, etc. Triphones. Hidden Markov Models: states s1, s2 → facial motion parameters.
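As a concrete illustration of the triphone labeling (a sketch, not the authors' tooling), a phoneme sequence with lexical stress markers can be expanded into left-context/phone/right-context labels:

```python
def to_triphones(phones):
    """Expand a phoneme sequence (with lexical stress markers, e.g. 'AA1')
    into left-context/phone/right-context triphone labels.
    Padding the boundaries with 'sil' is one common convention."""
    padded = ["sil"] + list(phones) + ["sil"]
    return [f"{padded[i-1]}-{padded[i]}+{padded[i+1]}"
            for i in range(1, len(padded) - 1)]

# to_triphones(["B", "AA1", "K"]) -> ['sil-B+AA1', 'B-AA1+K', 'AA1-K+sil']
```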

Training Hidden Markov Models. Training set consists of motion capture data. Baum-Welch embedded re-estimation. Cluster triphone states to predict triphones not seen in the training set.
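A minimal sketch of the training step using the third-party hmmlearn package, whose fit() performs Baum-Welch (EM) re-estimation; the state count, covariance type, and iteration count are illustrative assumptions, and the state clustering for unseen triphones is not shown:

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM  # third-party package, not the authors' code

def train_triphone_hmm(sequences, n_states=3):
    """sequences: list of (T_i, D) arrays of facial motion parameters,
    cut from motion capture, all belonging to the same triphone."""
    X = np.concatenate(sequences)
    lengths = [len(s) for s in sequences]
    model = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
    model.fit(X, lengths)  # EM, i.e. Baum-Welch re-estimation
    return model
```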

Facial Motion Synthesis from Text. Pipeline: Text → Text-to-Speech Engine → Speech and Time-labeled Phonemes → Hidden Markov Models → Facial Motion Parameters.
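To make the back half of this pipeline concrete, here is a sketch (not the authors' code) that turns time-labeled phonemes from the TTS engine into a facial motion parameter trajectory by sampling the per-triphone HMMs. It reuses to_triphones and the trained models from the sketches above; the frame rate is an assumed value:

```python
import numpy as np

def phonemes_to_motion(timed_phones, triphone_hmms, frame_rate=30.0):
    """timed_phones  : list of (phone, start_s, end_s) from the TTS engine
    triphone_hmms : dict mapping triphone label -> trained GaussianHMM
    Returns an (n_frames, D) array of facial motion parameters."""
    phones = [p for p, _, _ in timed_phones]
    segments = []
    for tri, (_, start, end) in zip(to_triphones(phones), timed_phones):
        n_frames = max(1, round((end - start) * frame_rate))
        trajectory, _ = triphone_hmms[tri].sample(n_frames)  # hmmlearn API
        segments.append(trajectory)
    return np.concatenate(segments)
```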

Fitting the Prototype Model to an Image. 2D Prototype Face → Photograph.
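Fitting typically starts from sparse landmark correspondences. As one hedged example, a least-squares similarity transform (Umeyama/Procrustes style) aligning 2D prototype landmarks to detected image landmarks could look like this; the full model fit in the talk goes beyond a global similarity:

```python
import numpy as np

def fit_similarity(prototype_pts, image_pts):
    """Least-squares scale/rotation/translation mapping 2D prototype
    landmarks (rows of prototype_pts) onto detected image landmarks."""
    mu_p, mu_q = prototype_pts.mean(axis=0), image_pts.mean(axis=0)
    P, Q = prototype_pts - mu_p, image_pts - mu_q
    U, S, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(U @ Vt))           # guard against reflections
    D = np.diag([1.0, d])
    R = (U @ D @ Vt).T                           # 2x2 rotation
    s = (S * np.diag(D)).sum() / (P ** 2).sum()  # isotropic scale
    t = mu_q - s * R @ mu_p                      # translation
    return s, R, t
```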

Facial Motion Synthesis

Eye Motion Synthesis

Eyeball Texture Synthesis. Eye Image → Synthesized Eyeball Texture.
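A rough sketch of one way to obtain a complete eyeball texture from the small visible region: inpaint the eyelid-occluded pixels with OpenCV. This is only an illustration; the texture synthesis in the talk may work differently.

```python
import cv2

def synthesize_eyeball_texture(eye_image, visible_mask):
    """eye_image    : BGR crop around one eye (numpy array)
    visible_mask : uint8 mask, 255 where the eyeball shows between the lids
    Fills the occluded eyeball region with fast-marching inpainting."""
    occluded = cv2.bitwise_not(visible_mask)
    return cv2.inpaint(eye_image, occluded, 5, cv2.INPAINT_TELEA)
```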

Eye Motion Synthesis Eye Motion Geometry

Eye Motion and Blinking
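Blink timing is commonly scheduled as a random process. A small sketch, assuming exponentially distributed inter-blink intervals (the 4-second mean is an illustrative value, not a figure from the talk):

```python
import numpy as np

def sample_blink_times(duration_s, mean_interval_s=4.0, rng=None):
    """Sample blink onset times as a Poisson process over the clip length."""
    rng = rng or np.random.default_rng()
    times, t = [], 0.0
    while True:
        t += rng.exponential(mean_interval_s)
        if t >= duration_s:
            return times
        times.append(t)
```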

Visual Text-to-Speech Synthesis

Facial Motion Synthesis from Speech. Pipeline: Speech → Speech Recognition → Time-labeled Phonemes → Hidden Markov Models → Facial Motion Parameters.
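The speech-driven path reuses the same synthesis machinery; only the source of the time-labeled phonemes changes. A sketch, assuming a recognizer object with a hypothetical align() method and the phonemes_to_motion helper from the text-driven sketch above:

```python
def motion_from_speech(audio, recognizer, triphone_hmms):
    """recognizer.align() is a hypothetical interface standing in for a
    speech recognizer that returns (phone, start_s, end_s) tuples."""
    timed_phones = recognizer.align(audio)  # assumed API, not a real library call
    return phonemes_to_motion(timed_phones, triphone_hmms)
```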

Facial Motion Synthesis from Speech

3D Avatars. Captured stereo image: Mirror View and Direct View (Gluckman & Nayar, 2001).

3D Avatars. Rectified Images (Mirror View, Direct View) → 3D Model.
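Once the mirror and direct views are rectified, standard two-view stereo yields per-pixel depth. A sketch using OpenCV; the matcher settings and the calibration-derived reprojection matrix Q are assumed inputs:

```python
import cv2

def depth_from_rectified(left, right, Q):
    """left, right : rectified grayscale mirror/direct views
    Q           : 4x4 reprojection matrix from stereo rectification
    Returns a per-pixel array of 3D points."""
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128,
                                    blockSize=5)
    disparity = matcher.compute(left, right).astype("float32") / 16.0
    return cv2.reprojectImageTo3D(disparity, Q)
```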

3D Avatars. Digital projector; point cloud engraved inside a glass cube (Nayar & Anand, 2007).

3D Avatars

Limitations and Future Work. Automatic facial feature detection. Synthesis of rigid head motion. Expressive speech. A web demo of our system will be available in early April.

The End