Real Time Gesture Recognition of Human Hand Wu Hai Atid Shamaie Alistair Sutherland.

Overview: What are gestures? What can gestures be used for? How to find a hand in an image? How to recognise its shape? How to recognise its motion? How to find its position in 3D space?

What is a Gesture? "A movement of a limb or the body as an expression of thought or feeling." (Oxford Concise Dictionary, 1995)

Mood, emotion: Mood and emotion are expressed by body language, facial expressions and tone of voice. Recognising them allows computers to interact with human beings in a more natural way.

Human Computer Interface using Gesture Replace mouse and keyboard Pointing gestures Navigate in a virtual environment Pick up and manipulate virtual objects Interact with a 3D world No physical contact with computer Communicate at a distance

Public Display Screens Information display screens Supermarkets Post Offices, Banks Allows control without having to touch the device

Sign Language: 5000 gestures in the vocabulary. Each gesture consists of a hand shape, a hand motion and a location in 3D space. Facial expressions are important. Full grammar and syntax. Each country has its own sign language: Irish Sign Language is different from British Sign Language or American Sign Language.

A C F

Datagloves

Datagloves provide very accurate measurements of hand-shape, but are cumbersome to wear, expensive, and connected by wires, which restricts freedom of movement.

Datagloves - the future Will get lighter and more flexible Will get cheaper ~ $100 Wireless?

Our vision-based system: Wireless & Flexible. No specialised hardware. Single Camera. Real-time.

Coloured Gloves User must wear coloured gloves Very cheap Easy to put on BUT get dirty Eventually we wish to use natural skin

Colour Segment Noise Removal Scale by Area 32

Demo Gesture Video

Feature Space Each point represents a different image Clusters of points represent different hand-shapes Distance between points depends on how similar the images are

A continuous gesture creates a trajectory in feature space We can project a new image onto the trajectory

Multiple sub-spaces Global space Gesture 1 Gesture 2 Classifying a new unknown image

3D spatial position of hand: Subspaces and trajectories are calculated with the hand at the origin. We know the image co-ordinates and the area of the hand in the original image, so we can calculate depth and xy-position. [Figure: camera with x and y axes]
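
The slide does not give the formula, so the following is only a minimal sketch of one way the calculation could go, assuming a pinhole camera in which the apparent area of the hand falls off with the square of its depth. The names `ref_area`, `ref_depth`, `focal_px`, `cx` and `cy` are hypothetical calibration values, not taken from the presentation.

```python
import math

def hand_position_3d(u, v, area, ref_area, ref_depth, focal_px, cx, cy):
    """Estimate (x, y, depth) of the hand from its image position and area.

    Assumptions (not from the slides): pinhole camera, hand area measured
    as ref_area pixels when the hand is at ref_depth, focal length focal_px
    in pixels, principal point at (cx, cy).
    """
    # Apparent area scales as 1 / depth^2, so depth scales as sqrt(ref_area / area).
    depth = ref_depth * math.sqrt(ref_area / area)
    # Back-project the image co-ordinates to world x and y at that depth.
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return x, y, depth
```

For example, a hand that covers a quarter of its reference area is estimated to be twice as far away as the reference depth.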

ABC Y Yes/No?

Hierarchical Search: We need to search thousands of images. How to do this efficiently? We need to use a “coarse-to-fine” search strategy.

Original image Blurring Factor = 1 Blurring Factor = 2 Blurring Factor = 3

Factor = 3.0 Factor = 2.0 Factor = 1.0 Multi-scale Hierarchy
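
A coarse-to-fine search over such a hierarchy might look like the sketch below. It is illustrative only: images are flat lists of pixel values, block averaging stands in for the blurring shown on the slides, and `keep_fraction` is a hypothetical parameter that controls how many candidates survive each level.

```python
def block_average(vec, factor):
    """Crude stand-in for blurring: average consecutive blocks of pixels."""
    return [sum(vec[i:i + factor]) / len(vec[i:i + factor])
            for i in range(0, len(vec), factor)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def coarse_to_fine_search(query, database, factors=(4, 2, 1), keep_fraction=0.5):
    """Return the index of the database image that best matches the query.

    Candidates are ranked at the coarsest level first; only the best
    fraction is carried forward to the next, finer level.
    """
    candidates = list(range(len(database)))
    for factor in factors:
        q = block_average(query, factor)
        candidates.sort(
            key=lambda i: squared_distance(q, block_average(database[i], factor)))
        if factor != factors[-1]:
            candidates = candidates[:max(1, int(len(candidates) * keep_fraction))]
    return candidates[0]
```

In a real system the coarse versions of the database images would be computed once in advance rather than on every query.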

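Hidden Markov Model (HMM): modelling a time sequence of images for motion recognition. One model per gesture: HMM1 (Hello), HMM2 (Good), HMM3 (Bad), HMM4 (House). For an observed sequence f we compute P(f | HMM1), P(f | HMM2), and so on, and pick the model with the highest probability.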
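
The comparison of P(f | HMM) across models can be sketched with the standard forward algorithm. This is a toy version under stated assumptions: observations are discrete symbols (for example, cluster indices of hand shapes), and each model is a hypothetical tuple `(start, trans, emit)` of probability tables. The slides do not specify the observation model actually used.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed by the forward algorithm."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)

def classify_gesture(obs, models):
    """Return the name of the model that gives obs the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))

# Two toy left-to-right models over two hand-shape symbols (0 and 1).
TRANS = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    "hello": ([1.0, 0.0], TRANS, [[0.9, 0.1], [0.1, 0.9]]),
    "good": ([1.0, 0.0], TRANS, [[0.1, 0.9], [0.9, 0.1]]),
}
```

For long sequences the products underflow, so a practical version would work with scaled or log probabilities.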

Prediction and Tracking: Given previous frames we can predict what will happen next. This speeds up search. Occlusions -

Co-articulation: In fluent dialogue, signs are modified by preceding and following signs. Intermediate forms appear between sign A and sign B.

Future Work: Occlusions (Atid) Grammars in Irish Sign Language. --- Sentence Recognition Body Language.

Face Recognition

A noisy environment

Errors

Model-based Recognition

Pose-tracking

Facial Expressions Anger Fear Disgust Happy Sad Surprise

Human Body Tracking

Face Recognition Summary –Single pose –Multiple pose –Principal components analysis –Model-based recognition –Neural Networks

Single Pose Standard head-and-shoulders view with uniform background Easy to find face within image

Aligning Images Alignment –Faces in the training set must be aligned with each other to remove the effects of translation, scale, rotation etc. –It is easy to find the position of the eyes and mouth and then shift and resize the images so that they are aligned with each other

Nearest Neighbour: Once the images have been aligned you can simply search for the member of the training set which is nearest to the test image. There are a number of measures of distance, including Euclidean distance and cross-correlation.
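
A minimal sketch of nearest-neighbour matching with Euclidean distance, assuming each aligned image has already been flattened into a list of pixel values and the training set is a list of hypothetical `(label, vector)` pairs:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length pixel vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbour(test_image, training_set):
    """Return the label of the training image closest to test_image."""
    label, _ = min(training_set, key=lambda item: euclidean(test_image, item[1]))
    return label
```

Swapping `euclidean` for a cross-correlation score would only require changing the key function (and taking the maximum rather than the minimum).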

Principal Components PCA reduces the number of dimensions and so the memory requirement is much reduced. The search time is also reduced

Two ways to apply PCA (1) We could apply PCA to the whole training set. Then each face is represented by a point in the PC space We could then apply nearest neighbour to these points

Two ways to apply PCA (2) Alternatively we could apply PCA to the set of faces belonging to each person in the training set. Each class (person) is then represented by a different ellipsoid, and Mahalanobis distance can be used to classify a new unknown face. You need a lot of images of each person to do this.
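
Once a class has its own principal components, the squared Mahalanobis distance is the sum of squared coordinates along each component divided by that component's eigenvalue. The sketch below assumes the per-class PCA has already been done elsewhere; the `(mean, eigvecs, eigvals)` tuples are hypothetical inputs, with orthonormal eigenvectors.

```python
def mahalanobis_sq(x, mean, eigvecs, eigvals):
    """Squared Mahalanobis distance from x to a class ellipsoid.

    mean: class mean vector; eigvecs: orthonormal principal directions;
    eigvals: variance of the class along each direction.
    """
    centred = [xi - mi for xi, mi in zip(x, mean)]
    total = 0.0
    for vec, lam in zip(eigvecs, eigvals):
        coord = sum(c * v for c, v in zip(centred, vec))  # projection onto this axis
        total += coord * coord / lam
    return total

def classify_face(x, classes):
    """Return the class whose ellipsoid is nearest in Mahalanobis distance."""
    return min(classes, key=lambda name: mahalanobis_sq(x, *classes[name]))

# Toy 2-D example: class A is wide along x, class B is compact.
CLASSES = {
    "A": ([0.0, 0.0], [(1.0, 0.0), (0.0, 1.0)], [4.0, 1.0]),
    "B": ([5.0, 0.0], [(1.0, 0.0), (0.0, 1.0)], [1.0, 1.0]),
}
```

In the toy data the point (3, 0) is closer to B in Euclidean terms but is assigned to A, because A varies more along that direction.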

Problems with PCA The same person may sometimes appear differently due to –Beards, moustaches –Glasses, –Makeup These have to be represented by different ellipsoids

Problems with PCA Facial expressions –Differing facial expressions: opening and closing the mouth, raised eyebrows, widening the eyes, smiling, frowning etc. These mean that the class is no longer ellipsoidal and must be represented by a manifold

Facial Expressions There are six types of facial expression We could use PCA on the eyes and mouth – so we could have eigeneyes and eigenmouths Anger Fear Disgust Happy Sad Surprise

Multiple Poses Heads must now be aligned in 3D world space Classes now form trajectories in feature space It becomes difficult to recognise faces because the variation due to pose is greater than the variation between people

Model-based Recognition We can fit a model directly to the face image Model consists of a mesh which is matched to facial features such as the eyes, nose, mouth and edges of the face. We use PCA to describe the parameters of the model rather than the pixels.

Model-based Recognition The model copes better with multiple poses and changes in facial expression.

Coarse Classification Fingerprints can be divided into 6 basic classes (some systems use other classes) –Arch –Tented Arch –Whorl –Right loop –Left Loop –Double Loop

Orientation Field The orientation field gives the ridge direction at each point in the image

Identifying Core and Delta The orientation field can be used to identify the core and delta in an image

PCA applied to fingerprints We can align different images so that their cores line up We can then apply PCA to the orientation fields just as we did to face images We can project a new unknown image into the PC space and find the nearest matches in the training set.

Accurate Matching Orientation fields and PCA are not good enough to give an accurate match They can reduce the number of possible candidates We can then apply a more accurate but time-consuming technique to the remaining candidate images

Minutiae Matching Minutiae are fine details of the ridges in the fingerprint image such as –Ridge terminations, –Crossovers –Bifurcations etc. The pattern of minutiae is unique to each individual person

Minutiae Types: Different systems define different types of minutiae. The most common are terminations (ridge endings) and bifurcations (forks).

Binarisation and Thinning Binarisation –Every pixel is set to either 0 or 1 Thinning –Lines are thinned to a width of 1 pixel

Identifying minutiae: Each black pixel in the image is classified using its “crossing number”. The crossing number of pixel p is defined as: cn(p) = half the sum of the absolute differences between adjacent pixels, taken in order around the 8-neighbourhood of p.

Identifying Minutiae: If the crossing number cn is equal to 2 then the pixel is not a minutia but a normal intra-ridge pixel. If the crossing number is not equal to 2 then the pixel is some kind of minutia (for example, 1 for a termination and 3 for a bifurcation).
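
The crossing-number test can be sketched as follows. It assumes a thinned binary image stored as a list of rows in which ridge pixels are 1 and background pixels are 0 (an encoding chosen for this example), and it only looks at interior pixels.

```python
# The 8 neighbours of a pixel, listed in order around it.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def crossing_number(img, r, c):
    """Half the sum of absolute differences between successive neighbours."""
    vals = [img[r + dr][c + dc] for dr, dc in RING]
    return sum(abs(vals[i] - vals[(i + 1) % 8]) for i in range(8)) // 2

def find_minutiae(img):
    """Return (row, col, crossing_number) for ridge pixels whose cn is not 2."""
    found = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            if img[r][c] == 1:
                cn = crossing_number(img, r, c)
                if cn != 2:
                    found.append((r, c, cn))
    return found
```

On a short horizontal ridge segment the two end pixels come out with cn = 1 (terminations) and the pixels between them with cn = 2.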

Removing false-minutiae

Minutiae on a real image The position of the minutiae is marked The direction of the ridge at each minutia is shown as a short line

Matching minutiae between fingerprints We now have to compare minutiae between two different fingerprints to see if they came from the same person Remember the two images may have been rotated, translated or distorted with respect to one another We have to find the combination of rotations, translations and distortions which gives the largest number of matched pairs of minutiae

Two different impressions of the same finger

Matched pairs of minutiae Each pair must match position, direction and type

Hough Transform: Discretise the range of values for translation, rotation and distortion. Set up an accumulator matrix A in which each element represents a different combination of translation, rotation and distortion. For each possible pair of minutiae, calculate the values of translation, rotation and distortion which make them match best, and increase the corresponding element of A by 1. At the end of the process the element of A with the largest value represents the best combination.
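
A simplified version of this voting scheme is sketched below. It handles rotation and translation only and ignores distortion, which is a simplification made for this example. Each minutia is a hypothetical `(x, y, direction)` tuple with direction in radians, and the bin sizes `angle_step` and `shift_step` are arbitrary choices.

```python
import math
from collections import Counter

def hough_align(minutiae_a, minutiae_b,
                angle_step=math.radians(10), shift_step=5.0):
    """Vote for the rotation and translation that map print A onto print B.

    Returns (rotation, shift_x, shift_y, votes) for the fullest accumulator bin.
    """
    n_angles = round(2 * math.pi / angle_step)
    acc = Counter()  # sparse accumulator: (angle bin, x bin, y bin) -> votes
    for xa, ya, da in minutiae_a:
        for xb, yb, db in minutiae_b:
            # The rotation that would make the two ridge directions agree.
            k = round(((db - da) % (2 * math.pi)) / angle_step) % n_angles
            theta = k * angle_step
            # Rotate the point from A, then see what shift lands it on the point from B.
            xr = xa * math.cos(theta) - ya * math.sin(theta)
            yr = xa * math.sin(theta) + ya * math.cos(theta)
            key = (k, round((xb - xr) / shift_step), round((yb - yr) / shift_step))
            acc[key] += 1
    (k, tx, ty), votes = acc.most_common(1)[0]
    return k * angle_step, tx * shift_step, ty * shift_step, votes
```

The number of votes in the winning bin is an estimate of how many minutiae pairs agree on a single alignment, which can then be used as a match score.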

An alternative to minutiae-matching: FingerCodes –1. Centre image on core –2. Divide image into circular zones –3. Pass each zone through a set of 8 Gabor filters (more about this next week) –4. Compare the results using Euclidean distance

Iris Recognition

John Daugman There is only one iris recognition algorithm in use The algorithm was developed mainly by John Daugman, PhD, OBE It is owned by the company Iridian Technologies

Advantages of Iris Recognition Irises do not change with age – unlike faces Irises do not suffer from scratches, abrasions, grease or dirt – unlike fingerprints Irises do not suffer from distortion – unlike fingerprints

Finding the Iris in the Image It is easy to find the circular boundaries of the iris

Masking The boundaries of the eyelids can be found Eyelashes and specularities (reflections) can be found These areas can be masked out

Gabor Wavelets

Gabor Wavelets filter out structures at different scales and orientations For each scale and orientation there is a pair of odd and even wavelets A scalar product is carried out between the wavelet and the image (just as in the Discrete Fourier Transform) The result is a complex number

Phase Demodulation The complex number is converted to 2 bits The modulus is thrown away because it is sensitive to illumination intensity The phase is converted to 2 bits depending on which quadrant it is in
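
As a small illustration of the quadrant coding, the two bits can be taken from the signs of the real and imaginary parts of the filter response. The exact bit assignment per quadrant is an assumption of this sketch; the slide only says that the quadrant determines the bits.

```python
def phase_bits(z):
    """Quantise the phase of a complex filter response to 2 bits.

    The modulus of z is ignored; only the quadrant of its phase matters.
    """
    return (1 if z.real >= 0 else 0, 1 if z.imag >= 0 else 0)
```

Scaling the response, as a change in illumination intensity would, leaves the bits unchanged: `phase_bits(3 + 4j)` and `phase_bits(0.003 + 0.004j)` give the same result.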

IrisCodes: This process is carried out at a number of points throughout the image. The result is 2048 bits which describe each iris uniquely. Two codes can be compared by finding the number of bits that differ between them; this is called the Hamming distance. This is equivalent to computing an XOR between the two codes and counting the 1 bits, which can be done very quickly. To allow for rotation of the iris images the codes can be shifted with respect to each other and the minimum Hamming distance found.
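
The comparison can be sketched with Python integers standing in for the bit strings. This is a simplification: real systems also mask out bits covered by eyelids or reflections, and the circular shift here moves one bit at a time, whereas a real code would be shifted by whole angular positions. `max_shift` is a hypothetical parameter.

```python
def hamming_fraction(code_a, code_b, n_bits):
    """Fraction of bits that differ: XOR the codes, then count the 1 bits."""
    return bin(code_a ^ code_b).count("1") / n_bits

def rotate_left(code, shift, n_bits):
    """Circularly shift an n_bits-wide code; negative shifts rotate right."""
    shift %= n_bits
    mask = (1 << n_bits) - 1
    return ((code << shift) | (code >> (n_bits - shift))) & mask

def best_match_distance(code_a, code_b, n_bits, max_shift=8):
    """Minimum Hamming fraction over a range of relative shifts."""
    return min(
        hamming_fraction(code_a, rotate_left(code_b, s, n_bits), n_bits)
        for s in range(-max_shift, max_shift + 1)
    )
```

With 2048-bit codes the XOR and bit count are a handful of machine-word operations, which is why matching is so fast.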

Hamming Distance

Binomial Distribution: If two codes come from different irises, the bits that differ will be random. The fraction of differing bits will obey a binomial distribution with mean 0.5.
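
As a rough worked version of this claim, assume (as an idealisation not stated on the slide) that a code behaves like N independent bits, each differing with probability p = 0.5. The fractional Hamming distance h then has

$$E[h] = p = 0.5, \qquad \sigma_h = \sqrt{\frac{p(1-p)}{N}}$$

For N = 2048 this gives a standard deviation of about 0.011, so distances far below 0.5 would be extremely unlikely for unrelated irises. In practice neighbouring bits are correlated, so the effective number of independent bits is smaller than 2048 and the spread is wider than this idealised figure.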

Identification: If two codes come from the same iris the differences will no longer be random. The Hamming distance will be lower than would be expected if the differences were random. If the Hamming distance is < 0.33 the chance of the two codes coming from different irises is 1 in 2.9 million. So far it has been tried out on 2.3 million people without a single error.

More Advantages of IrisCodes: IrisCodes are extremely accurate. Matching is very fast compared to fingerprints or faces. Memory requirements are very low – only 2048 bits per iris

Disadvantages of the Iris for Identification Small target (1 cm) to acquire from a distance (1 m) Moving target...within another... on yet another Located behind a curved, wet, reflecting surface Obscured by eyelashes, lenses, reflections Partially occluded by eyelids, often drooping Deforms non-elastically as pupil changes size Illumination should not be visible or bright Some negative (Orwellian) connotations

Fake Iris Attacks

Fake Iris Fourier Spectrum Due to the dot matrix grid the Fourier Spectrum of the fake iris has 4 extra points