Where, Who and What? @AIT Intelligent Affective Interaction
ICANN, Sept. 14, Athens, Greece
Aristodemos Pnevmatikakis, John Soldatos and Fotios Talantzis
Athens Information Technology, Autonomic & Grid Computing
Overview
– CHIL – AIT SmartLab
– Signal Processing for perceptual components: Video Processing, Audio Processing
– Services
– Middleware: easing application assembly
Computers in the Human Interaction Loop (CHIL)
– EU FP6 Integrated Project (IP 506909)
– Coordinators: Universität Karlsruhe (TH), Fraunhofer Institute IITB
– Duration: 36 months
– Total project costs: over 24 M€
– Goal: create environments in which computers serve humans who focus on interacting with other humans, as opposed to having to attend to and being preoccupied with the machines themselves
– Key Research Areas: Perceptual Technologies, Software Infrastructure, Human-Centric Pervasive Services
AIT SmartLab Equipment
– Five fixed cameras (one with a fish-eye lens)
– PTZ camera
– NIST 64-channel microphone array
– 4 inverted-T-shaped clusters of 4 SHURE microphones each
– 4 tabletop microphones
– 6 dual-Xeon 3 GHz PCs with 2 GB RAM
– FireWire cables & repeaters
AIT SmartLab
Perceptual Components
Detection and Identification System
– Detector: Tracker → Head detector → Eye detector
– Recognizer: Face normalizer → Frontal verifier and Face recognizer → Confidence estimator
– Weighted voting over the classifier confidence and the frontality confidence yields the ID
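The voting stage can be sketched as follows; this is our own minimal illustration (function and identity names are hypothetical), not the authors' implementation:

```python
# Minimal sketch of confidence-weighted voting: each frame's identity vote
# is weighted by the product of its classifier confidence and its
# frontality confidence; the identity with the largest total wins.
from collections import defaultdict

def weighted_vote(frame_results):
    """frame_results: iterable of (identity, classifier_conf, frontality_conf)."""
    scores = defaultdict(float)
    for identity, cls_conf, frontal_conf in frame_results:
        scores[identity] += cls_conf * frontal_conf
    return max(scores, key=scores.get)

frames = [("id_A", 0.9, 0.8), ("id_B", 0.4, 0.9), ("id_A", 0.7, 0.6)]
print(weighted_vote(frames))  # id_A (score 1.14 vs 0.36)
```

Weighting by the frontality confidence lets non-frontal frames contribute little, without discarding them outright.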
Unconstrained Video Difficulties
Where and Who are the World Cup Finalists? and European Champions?
Tracking
Tracking – Smart Spaces
Tracking – 3D from Synchronized Cameras
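One common way to obtain 3D tracks from synchronized, calibrated cameras (a generic sketch, not necessarily the method used here) is triangulation: each camera's 2D detection back-projects to a ray, and the 3D position is the least-squares point closest to all rays:

```python
import numpy as np

def triangulate(origins, directions):
    """Least-squares 3D point closest to a set of camera rays.
    origins: (N, 3) camera centres; directions: (N, 3) ray directions."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the ray
        A += P                          # accumulate normal equations
        b += P @ o
    return np.linalg.solve(A, b)

origins = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
target = np.array([1.0, 2.0, 3.0])
directions = target - origins            # rays pointing at the target
print(triangulate(origins, directions))  # ≈ [1. 2. 3.]
```

With noisy detections the rays no longer intersect exactly; the same least-squares solution then gives the point minimizing the summed squared distance to all rays.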
Tracking – Outdoor Surveillance
AIT system ranked 2nd in the VACE / NIST surveillance evaluations
Head Detection
Detection of the head by processing the outline of the foreground region belonging to the body
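As a crude stand-in for the outline processing described above (our own simplification; the `head_frac` parameter is an assumption), the head can be roughly localized as the top fraction of the body silhouette's bounding box:

```python
import numpy as np

def head_box(mask, head_frac=0.2):
    """Rough head localisation in a binary foreground mask: bound the
    pixels in the top `head_frac` of the silhouette's vertical extent.
    Returns (left, top, right, bottom) in pixel coordinates."""
    ys, xs = np.nonzero(mask)
    top, bottom = ys.min(), ys.max()
    head_bottom = top + int(head_frac * (bottom - top))
    in_head = ys <= head_bottom        # foreground pixels in the head band
    return xs[in_head].min(), top, xs[in_head].max(), head_bottom
```

A real outline-based detector would analyse the silhouette contour itself (e.g. curvature around the head-shoulder profile); this box is only the zeroth-order version of that idea.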
Eye Detection
– Vector quantization of colors in the head region
– Detection of candidate eye regions, based on resemblance to skin, brightness, shape and size
– Selection amongst candidates based on face geometry
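The geometric selection step can be illustrated with a toy scorer (our own sketch; the thresholds and candidate representation are assumptions): among candidate eye centres, pick the near-horizontal pair whose separation best matches the expected interocular distance:

```python
import itertools
import math

def pick_eyes(candidates, expected_sep, max_tilt_deg=20.0):
    """candidates: list of (x, y) candidate eye centres.
    Returns the pair closest to the expected separation whose
    inter-eye axis is within max_tilt_deg of horizontal, else None."""
    best, best_err = None, float("inf")
    for (x1, y1), (x2, y2) in itertools.combinations(candidates, 2):
        sep = math.hypot(x2 - x1, y2 - y1)
        tilt = abs(math.degrees(math.atan2(y2 - y1, x2 - x1)))
        tilt = min(tilt, 180.0 - tilt)  # direction-independent tilt
        if tilt > max_tilt_deg:
            continue
        err = abs(sep - expected_sep)
        if err < best_err:
            best, best_err = ((x1, y1), (x2, y2)), err
    return best
```

The expected separation would in practice be derived from the detected head size rather than fixed.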
Face Recognition from Video
Effect of Eye Misalignment: LDA
Effect of Eye Misalignment
Classifier Fusion
Classifier fusion addresses the fact that different classifiers are optimal for different recognition impairments (illumination variations, pose variations)
Fusion Across Time, Classifiers and Modalities
Face Recognition @ CLEAR2006

                     15 sec training            30 sec training
Testing (sec)      1      5     10     20      1      5     10     20
AIT            50.57  29.68  23.18  20.22  47.31  31.14  26.64  24.72
UKA            46.82  33.58  28.03  23.03  40.13  23.11  20.42  16.29
UPC            79.77  78.59  77.51  76.40  80.42  77.13  74.39  73.03
New AIT        45.35  27.01  17.65  15.73  43.72  17.76  13.49   7.86
Speaker ID @ CLEAR2006

                     15 sec training            30 sec training
Testing (sec)      1      5     10     20      1      5     10     20
AIT            26.92   9.73   7.96   4.49  15.17   2.68   1.73   0.56
CMU            23.65   7.79   7.27   3.93  14.36   2.19   1.38   0.00
LIMSI          51.71  10.95   6.57   3.37  38.83   5.84   2.08   0.00
UPC            24.96  10.71  10.73  11.80  15.99   2.92   3.81   2.81
AIT IS2006     25.69   5.60   4.50   2.25  15.01   2.19   2.42   0.00
Audiovisual ID @ CLEAR2006

                     15 sec training            30 sec training
Testing (sec)      1      5     10     20      1      5     10     20
AIT            23.65   6.81   6.57   2.81  13.70   2.19   1.73   0.56
UIUC primary   17.61   2.68   1.73   0.56  13.21   2.43   1.38   0.56
UIUC contrast  20.55   5.60   3.81   2.25  15.99   3.41   2.42   1.12
UKA / CMU      43.07  29.20  23.88  20.22  35.73  19.71  16.61  12.36
UPC            23.16   8.03   5.88   3.93  13.38   2.92   2.08   1.12
Audiovisual Tracker
– Information-theoretic speaker localization from the microphone array: accurate azimuth, approximate depth, no elevation
– Coarse targeting of the speaker's face using a PTZ camera
– Targeting refined by visual face detection
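The azimuth part can be illustrated for a single microphone pair (a far-field, cross-correlation sketch of our own; the actual system uses an information-theoretic criterion over the full array): the cross-correlation peak gives the inter-microphone delay τ, and sin θ = c·τ/d gives the bearing:

```python
import numpy as np

def azimuth_from_pair(x1, x2, fs, mic_distance, c=343.0):
    """Bearing estimate (degrees) from one microphone pair.
    The cross-correlation peak gives the delay of x1 relative to x2;
    far-field geometry then gives sin(theta) = c * tau / d."""
    corr = np.correlate(x1, x2, mode="full")
    lag = np.argmax(corr) - (len(x2) - 1)  # samples by which x1 lags x2
    tau = lag / fs
    s = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(s))
```

One pair only resolves azimuth up to a front-back ambiguity and gives neither depth nor elevation, which is why the array and the PTZ camera refine the estimate.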
Services
Memory Jog
– Context-aware, human-centric assistant for meetings, lectures and presentations
– Proactive and reactive assistance, and information retrieval
Features / functionalities:
– Sophisticated situation modeling / tracking
– Essentially non-obtrusive operation
– Intelligent meeting-recording functionality
– GUI also runs on a PDA
– Full compliance with the CHIL architecture
– Integration of actuating devices (targeted audio, projectors)
Context as a Network of Situations

Transition   Elements & Components
NIL → S1     Table Watcher (people in table area), SAD
S1 → S2      White-Board Watcher (presenter in speaker area), Face ID, Speaker ID
S2 → S3      Speaker ID (speaker ID ≠ presenter ID), Speaker Tracking
S3 → S2      Face Detection (presenter in speaker area), Face ID, Speaker ID
S2 → S4      White-Board Watcher (no face in speaker area for N seconds), Table Watcher (all participants at the meeting table)
S4 → S5      Table Watcher (nobody in table area)
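The transition table is essentially a finite-state machine; a minimal sketch (the state and event names are our hypothetical stand-ins for the watchers' outputs) could look like:

```python
# States are meeting situations; an event with no transition from the
# current state is ignored. Event names are hypothetical stand-ins for
# the perceptual-component outputs listed in the table.
TRANSITIONS = {
    ("NIL", "people_at_table"): "S1",            # Table Watcher + SAD
    ("S1", "presenter_at_whiteboard"): "S2",     # White-Board Watcher
    ("S2", "other_speaker_detected"): "S3",      # Speaker ID mismatch
    ("S3", "presenter_at_whiteboard"): "S2",     # Face Detection
    ("S2", "presenter_left_all_at_table"): "S4",
    ("S4", "table_empty"): "S5",
}

def run(events, state="NIL"):
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state
```

For example, `run(["people_at_table", "presenter_at_whiteboard"])` ends in situation "S2" (presentation in progress).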
What Happened While I was Away?
Middleware
Virtualized Sensor Access
CHIL-Compliant Perceptual Components
– Several sites develop site-, room- and configuration-specific Perceptual Components for CHIL
– Provide common abstractions at the input and output of each PC (black box)
– Facilitate component exchange across sites & vendors
– Standardization commenced for Body Trackers and continues with Face ID components
Architecture for Body Tracker Exchange
– Sensor abstraction: transparent connection to sensor output
– Common control API (CHILiX)
– Services complying with the current API for information retrieval
– Support for non-CHIL-compliant Body Trackers
Thank you! Questions?