A novel depth-based head tracking and facial gesture recognition system
by Dr. Farzin Deravi – EDA, UoK; Dr. Konstantinos Sirlantzis – EDA, UoK; Shivanand Guness – EDA, UoK; Dr. Mohamed Sakel – EKHUFT; Dr. Matthew Pepper – EKHUFT
Overview
– Clinical Background
– Objectives
– Kinect: RGB-D sensor
– Technical Approach
– Evaluation Technique
– Experimentation
– Results
– Conclusion
– Future Work
Clinical Background
Individuals with conditions such as Motor Neuron Disease (MND), Cerebral Palsy (CP) and Multiple Sclerosis (MS):
– can lose the ability to speak;
– may be able to make only small head or facial movements, particularly in the case of MND or MS.
The goal is to capture and interpret the intentions and messages of these patients from their limited movement.
Objectives
– Develop a reliable automatic gesture recognition system:
– tracking head movement;
– detecting facial gestures such as eye blinks and winks.
– Build an adaptive system that adjusts to the user and their condition over time.
– Develop a low-cost assistive device.
Kinect: RGB-D sensor
Kinect: RGB-D sensor (cont.)
– Projects a known pattern of speckles in near-infrared (IR) light, generated by passing IR light through a diffuser and a diffractive element.
– A CMOS IR camera observes the scene.
– The calibration between the projector and the camera has already been carried out and is known.
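Because the projector–camera calibration is known, depth can be recovered by triangulating the shift (disparity) between the observed speckle position and the reference pattern. A minimal sketch of this disparity-to-depth relation is below; the baseline and focal-length values are illustrative assumptions, not the device's actual calibration.

```python
# Depth from structured-light disparity: a minimal sketch.
# A Kinect-style sensor compares the observed speckle position against the
# calibrated reference pattern; the horizontal shift in pixels maps to depth
# by triangulation. Baseline and focal length here are assumed values.

def depth_from_disparity(disparity_px, baseline_m=0.075, focal_px=580.0):
    """Return depth in metres for a given disparity in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return (baseline_m * focal_px) / disparity_px
```

Larger disparities correspond to nearer objects, which is why depth resolution degrades with distance.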
Technical Approach
Technical Approach (cont.)
[Figure: depth map and RGB image, with the area of the object nearest to the sensor highlighted]
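The "area of the object nearest to the sensor" step can be sketched as a simple thresholding of the depth map around its closest valid reading. This is a hedged illustration of the idea, not the presented system's implementation; the tolerance value is an assumption.

```python
import numpy as np

# Sketch: isolate the region of a depth map nearest to the sensor.
# Zero values are treated as invalid (no depth reading), as in Kinect
# depth maps. The tolerance is an illustrative assumption.

def nearest_region_mask(depth_mm, tolerance_mm=100):
    """Boolean mask of pixels within tolerance_mm of the closest valid depth."""
    valid = depth_mm > 0
    if not valid.any():
        return np.zeros_like(depth_mm, dtype=bool)
    nearest = depth_mm[valid].min()
    return valid & (depth_mm <= nearest + tolerance_mm)
```

When the user's head is the closest object to the sensor, such a mask gives a cheap head-candidate region before finer tracking is applied.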
Experimentation Setup
[Figure: Fitts' Test target screen]
Evaluation Technique
– Fitts' test for HCI is used to evaluate the tracking algorithms.
– Originally developed by Paul Fitts in 1954 to model human movement.
– Adapted to HCI by Scott MacKenzie in 1992.
– ISO 9241-9:2000 (Ergonomic requirements for office work with visual display terminals (VDTs) – Part 9: Requirements for non-keyboard input devices) is based on Fitts' Test.
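Fitts' law, in the Shannon formulation used for HCI evaluation, predicts movement time as MT = a + b · ID with ID = log2(D/W + 1). A minimal sketch follows; the intercept a and slope b are device-specific regression constants, and the values used here are purely illustrative assumptions.

```python
import math

# Fitts' law, Shannon formulation (as used in ISO 9241-9 style evaluation):
#   MT = a + b * ID,   ID = log2(D/W + 1)
# D = distance to target, W = target width. a and b are fitted per device;
# the defaults below are illustrative assumptions only.

def index_of_difficulty(distance, width):
    """Index of difficulty in bits for a D/W target condition."""
    return math.log2(distance / width + 1)

def predicted_movement_time(distance, width, a=0.1, b=0.2):
    """Predicted movement time in seconds under the assumed a and b."""
    return a + b * index_of_difficulty(distance, width)
```

Doubling the distance or halving the width raises ID, and hence the predicted time, which is what the varied W/D conditions in the evaluation exercise.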
Fitts' Test – Two Key Parameters
Fitts' Test (cont.)
Effective index of difficulty: IDe = log2(D/We + 1)
– where D is the distance from the home position to the target and We is the effective width of the target.
Effective width: We = 4.133 × SD
– where SD is the standard deviation of the selection coordinates.
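The effective measures on this slide combine into the effective throughput TPe (in bits per second) by dividing IDe by the movement time, which is the metric reported in the results. A small sketch, using MacKenzie's standard 4.133 correction from the slide:

```python
import math
import statistics

# Effective Fitts' measures, following the slide's definitions:
#   We  = 4.133 * SD of the selection coordinates
#   IDe = log2(D/We + 1)
#   TPe = IDe / movement time   (bits per second)

def effective_width(selection_coords):
    """Effective target width from the spread of selection endpoints."""
    return 4.133 * statistics.stdev(selection_coords)

def effective_id(distance, selection_coords):
    """Effective index of difficulty in bits."""
    return math.log2(distance / effective_width(selection_coords) + 1)

def effective_throughput(distance, selection_coords, movement_time_s):
    """TPe in bits per second for one D condition."""
    return effective_id(distance, selection_coords) / movement_time_s
```

Using the observed endpoint spread instead of the nominal width is what lets TPe compare devices fairly: an inaccurate pointer earns a wider We and thus a lower throughput.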
Fitts’ Test (cont)
Fitts' Test Evaluation
[Table: experimental conditions – Width (W), Distance (D), Index of Difficulty (ID)]
Result – Effective Throughput (TPe)

Device                              Dwell             Blink             Eyebrows
Standard Mouse (ms)                 0.84              n/a               n/a
CameraMouse (cm)                    0.48              n/a               n/a
SmartNav (sn)                       0.42              n/a               n/a
Vision head tracker (using webcam)  0.21 (ht-dwell)   0.15 (ht-blink)   0.08 (ht-brows)
RGB-D head tracker (using Kinect)   0.30 (kht-dwell)  0.28 (kht-blink)  0.09 (kht-brows)
Result – Throughput and Effective Throughput
Result – Index of Difficulty
Result – Effective Index of Difficulty
Conclusion
– The RGB-D head tracking system shows improved performance over the vision-based head tracking system:
– TPe of dwell clicking increased by over 40% (from 0.21 to 0.30 bits per second);
– TPe of blink clicking almost doubled (from 0.15 to 0.28 bits per second).
– The eye blink detection algorithm's performance approaches that of the dwell-click switch.
Future Work
– Increase the robustness of the HeadTracker.
– Investigate the impact of the different facial gesture switches on performance.
– Investigate additional facial gesture switches, such as mouth (open-close) and tongue movements.
– Conduct Fitts' Test experimentation with more participants, beginning with healthy volunteers.
THANK YOU Project Website :