Automated Drowsiness Detection For Improved Driving Safety Aytül Erçil November 13, 2008
Outline Problem Background and Description Technological Background Action Unit Detection Drowsiness Prediction
Objectives/Overview Statistical Inference of fatigue Using Machine Learning Techniques
In over accidents in (in Turkey): Injured: 123,985 people. Deceased: 3,215 people. Financial loss: 651,166,236 USD. Driver error has been blamed as the primary cause of approximately 80% of these traffic accidents.
The US National Highway Traffic Safety Administration estimates that, in the US alone, approximately 100,000 crashes each year are caused primarily by driver drowsiness or fatigue.
Growing Interest In Intelligent Vehicles US Department of Transportation Initiative European Transport Policy for 2010: set a target to halve road fatalities by Problem Background
The Drivesafe Project
Current Funding Status: Turkish Development Agency funding of Drive-Safe (August 2005-July 2009); Japanese New Energy and Industrial Technology Development Organization (NEDO) (October December 2008); FP6 SPICE Project at Sabancı University (May October 2008); FP6 AUTOCOM Project at ITU Mekar (May April 2008).
Readiness-to-perform Mathematical models of alertness dynamics Vehicle-based performance technologies (Vehicle Speed, Lateral Position, Pedal Movement) In-vehicle, on-line, operator status monitoring technologies Fatigue Detection and Prediction Technologies
Physiological Signals (heart rate, pulse rate and Electroencephalography (EEG)) Computer Vision Systems (detect and recognize the facial motion and appearance changes occurring during drowsiness) In-vehicle, on-line, operator status monitoring technologies
Computer Vision Systems. Visual behaviors, for example: gaze direction, head movement, yawning. No requirement for physical contact.
Facial Actions Ekman & Friesen, 1978
Background Information- Action Units
Proposed Work Detection Of Driver Fatigue From A Recorded Video Using Facial Appearance Changes The framework will be based on graphical models and machine learning approaches
Proposed Architecture (diagram). Sensing channels: eye tracker, pupil motion, gaze tracker, gaze. Features: single AUs (AU 51, AU 52, AU 61, AU 62), partial face behavior, entire face behavior. States at time n-1 and time n: inattentive, falling asleep, fatigue.
Action Unit Tracking. Previous techniques do not employ a spatially and temporally dependent structure for Action Unit tracking: contextual information is not exploited, and temporal information is not exploited.
Classification Challenges: Which action units, or combinations of action units, are cues for fatigue?
Learning from real examples Posed Drowsiness Actual Drowsiness Different Neural pathways for posed/spontaneous expressions
Initial Experimental Setup. Subjects played a driving video game on a Windows machine using a steering wheel and an open source multi-platform video game. At random times, a wind effect was applied that dragged the car to the right or left, forcing the subject to correct the position of the car.
Head movement measures. Head movement was measured using an accelerometer with 3 degrees of freedom. This three-dimensional accelerometer consists of three one-dimensional accelerometers mounted at right angles, measuring accelerations in the range of -5g to +5g.
The one minute preceding a sleep episode or a crash was identified as a non-alert state. There was a mean of 24 non-alert episodes, with a minimum of 9 and a maximum of 35. Fourteen alert segments for each subject were collected from the first 20 minutes of the driving task.
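The labeling rule above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function names, the label strings, and the use of seconds as the time unit are assumptions; only the 60-second window and the 20-minute alert period come from the slide.

```python
from typing import List, Tuple

NON_ALERT_WINDOW_S = 60.0    # one minute before a crash or sleep episode (from the slide)
ALERT_PERIOD_S = 20 * 60.0   # alert segments come from the first 20 minutes (from the slide)


def non_alert_segments(event_times_s: List[float]) -> List[Tuple[float, float]]:
    """Return (start, end) intervals covering the minute before each event."""
    return [(max(0.0, t - NON_ALERT_WINDOW_S), t) for t in sorted(event_times_s)]


def label_time(t: float, event_times_s: List[float]) -> str:
    """Label a time point; the label names are hypothetical."""
    for start, end in non_alert_segments(event_times_s):
        if start <= t <= end:
            return "non-alert"
    if t < ALERT_PERIOD_S:
        # Eligible to be sampled as one of the alert segments.
        return "alert-candidate"
    return "unlabeled"


events = [1500.0, 30.0]  # hypothetical crash / sleep-episode times in seconds
print(non_alert_segments(events))
print(label_time(1450.0, events), label_time(600.0, events))
```

Clipping the window at zero handles an event that occurs within the first minute of the session.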
(Figure: traces of steering, distance from center, and eye opening over seconds 0 to 20; annotations: eyes closed, overcorrection, crash.)
Histograms for Eye Closure and Eye Brow Up (figure). Panels: Eye Closure (AU45) and Brow Raise (AU2), with area under the ROC.
Processing pipeline (diagram): Facial Action Unit Detection (AU1, AU2, AU4, …, AU46); Feature Selection; Machine Learning / Pattern Recognition (Adaboost, SVM).
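For readers unfamiliar with the measure, the area under the ROC for a single detector output can be computed directly as the probability that a randomly chosen non-alert sample scores higher than a randomly chosen alert sample. The sketch below assumes that rank-based definition; the function name and the toy scores are illustrative, not data from the study.

```python
from typing import Sequence


def area_under_roc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Fraction of (pos, neg) pairs ranked correctly; ties count as half."""
    if not pos or not neg:
        raise ValueError("both classes need at least one sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical eye-closure detector outputs for non-alert and alert samples.
non_alert_scores = [0.9, 0.8, 0.4]
alert_scores = [0.5, 0.3, 0.2]
print(area_under_roc(non_alert_scores, alert_scores))
```

A value of 0.5 means the detector output carries no information about the state; values near 1 or near 0 both indicate a strong (positive or negative) association.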
Drowsiness Prediction. The facial action outputs were passed to a classifier that predicts drowsiness from the automatically detected facial behavior. Two learning-based classifiers, Adaboost and multinomial logistic regression, were compared. Both within-subject prediction and across-subject (subject-independent) prediction of drowsiness were tested.
Classification Task: Multinomial Logistic Regression (MLR). Frame labels: alert, or 60 sec before crash. Input: 31 facial action channels (AU1, AU2, AU4, AU31), with a continuous output for each frame.
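As a rough illustration of framewise MLR, the sketch below trains a softmax (multinomial logistic) classifier by batch gradient descent on toy data. It is an assumption-laden stand-in, not the classifier used in the study: it uses two made-up channels instead of 31, and the learning rate, epoch count, and function names are all hypothetical.

```python
import math
from typing import List, Sequence


def softmax(z: Sequence[float]) -> List[float]:
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def scores(W: List[List[float]], x: Sequence[float]) -> List[float]:
    xb = list(x) + [1.0]  # append a constant 1 for the bias term
    return [sum(w * v for w, v in zip(row, xb)) for row in W]


def train_mlr(X, y, n_classes=2, lr=0.5, epochs=500) -> List[List[float]]:
    """Fit one weight row per class by gradient descent on cross-entropy."""
    n_features = len(X[0])
    W = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, label in zip(X, y):
            p = softmax(scores(W, x))
            xb = list(x) + [1.0]
            for k in range(n_classes):
                err = p[k] - (1.0 if k == label else 0.0)
                for j, v in enumerate(xb):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_features + 1):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W


def predict_proba(W: List[List[float]], x: Sequence[float]) -> List[float]:
    return softmax(scores(W, x))


# Toy frames: [eye closure, brow raise]; label 0 = alert, 1 = non-alert.
X = [[0.1, 0.0], [0.2, 0.1], [0.0, 0.2], [0.9, 0.8], [0.8, 1.0], [1.0, 0.7]]
y = [0, 0, 0, 1, 1, 1]
W = train_mlr(X, y)
print(predict_proba(W, [0.95, 0.9]))
```

With two classes this reduces to ordinary logistic regression; the per-frame class probability is the continuous output that can then be pooled over time.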
Testing: MLR Weighted Temporal Windows
Within-subject drowsiness prediction. For the within-subject prediction, 80% of the alert and non-alert episodes were used for training and the other 20% were reserved for testing. This resulted in a mean of 19 non-alert and 11 alert episodes for training, and 5 non-alert and 3 alert episodes for testing per subject.
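The episode counts quoted above follow from applying an 80/20 split to the per-subject means (24 non-alert episodes and 14 alert segments). A minimal sketch of such an episode-level split is shown below; the shuffling, the seed, and the function name are assumptions, since the slides do not say how episodes were assigned.

```python
import random
from typing import List, Sequence, Tuple


def split_episodes(episodes: Sequence, train_fraction: float = 0.8,
                   seed: int = 0) -> Tuple[List, List]:
    """Shuffle whole episodes and split them into train and test sets."""
    shuffled = list(episodes)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


non_alert = list(range(24))  # mean number of non-alert episodes per subject
alert = list(range(14))      # alert segments per subject

train_na, test_na = split_episodes(non_alert)
train_a, test_a = split_episodes(alert)
print(len(train_na), len(test_na), len(train_a), len(test_a))
```

Splitting by episode rather than by frame keeps frames from the same minute out of both the training and test sets.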
Across-Subject Drowsiness Prediction. Training: 31 actions -> MLR classifier, framewise training. Cross validation: 3 subjects -> training, 1 subject -> testing. Crash prediction: choose the 5 best features by sequential feature selection, then sum the MLR-weighted features over a 12 second time interval. Result: .98 across subjects (area under the ROC).
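The two mechanical pieces of this procedure, holding out one subject at a time and pooling per-frame scores over a time window, can be sketched as follows. The subject identifiers, the trailing-window convention, and the window length in frames are assumptions; the slides give the window only in seconds and do not state the frame rate.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_subject_out(subjects: Sequence[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training subjects, held-out test subject) for each fold."""
    for held_out in subjects:
        yield [s for s in subjects if s != held_out], held_out


def windowed_sum(frame_scores: Sequence[float], window_frames: int) -> List[float]:
    """Sum of the current frame and the preceding window_frames - 1 frames."""
    out: List[float] = []
    running = 0.0
    for i, score in enumerate(frame_scores):
        running += score
        if i >= window_frames:
            running -= frame_scores[i - window_frames]  # drop the frame leaving the window
        out.append(running)
    return out


for train, test in leave_one_subject_out(["s1", "s2", "s3", "s4"]):
    print(train, "->", test)

# Hypothetical per-frame MLR outputs pooled over a 2-frame window.
print(windowed_sum([1.0, 2.0, 3.0, 4.0, 5.0], 2))
```

For a 12 second interval, `window_frames` would be 12 times the video frame rate, which has to be supplied by the user.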
Predictive Performance of Individual Facial Actions. More when critically drowsy: eye closure, brow raise, chin raise, frown, nose wrinkle, jaw sideways.
Less when critically drowsy: smile, squint, nostril compressor, brow lower, jaw drop. (A' > .75)
We observed during this study that many subjects raised their eyebrows in an attempt to keep their eyes open, and the strong association of the AU 2 detector is consistent with that observation. Also of note is that action 26, jaw drop, which occurs during yawning, actually occurred less often in the critical 60 seconds prior to a crash. This is consistent with the prediction that yawning does not tend to occur in the final moments before falling asleep.
Drowsiness detection performance, using an MLR classifier with different feature combinations.
Effect of Temporal Window Length (figure: A' versus window length in seconds; 12 seconds marked with *).
Coupling of Facial Movements (figure: eye openness / eye closure and brow raise over 10 seconds; panels: ALERT, DROWSY; r=0.87).
Coupling of Steering and Head Motion (figure: steering and head acceleration over 60 seconds; panels: ALERT, DROWSY; correlations: r=0.27, r=0.65).
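The coupling values reported as r are correlation coefficients between two time series. A minimal sketch of the Pearson correlation is given below, assuming that is the statistic used; the function name and the sample signals are illustrative only.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally sampled signals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("signals must have the same length (at least 2)")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant signal")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical steering and head-acceleration samples.
steering = [0.0, 0.2, 0.5, 0.3, -0.1]
head_accel = [0.1, 0.3, 0.4, 0.2, 0.0]
print(pearson_r(steering, head_accel))
```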
New associations between facial behavior and drowsiness Brow raise Chin raise More head roll Possibly less yawning just before crash Coupling of behaviors – Head movement and steering – Brow raise and eye opening
Future Work: Extend the graphical model so that it captures the temporal relationships using a discriminative approach.
Future Work: More Data Collection in Simulator Environment Uykucu (Sleepy)
Thank you