Building an Aware Home. Irfan Essa, Aware Home Research Initiative, GVU Center / College of Computing, Georgia Institute of Technology.
© Irfan Essa and Georgia Institute of Technology. Research Goal: How can your house help if it is aware (of your whereabouts, activities, needs, intentions, etc.)?
Important Goals: Ubiquity: sensing and output technology that is transparent to everyday activities. Passive, anywhere, anytime input/output. Provide the ability to sense, interact, display information, and communicate without increasing the burden or load on users. Be aware of residents by sensing them: who, what, where, why? (W4). The interface should be noninvasive, unobtrusive, perceptual, ubiquitous, and natural.
Sense, Measure, Monitor? Issues of location: Where are people? Identity: Where are which people? What about new people? Local action: “Sitting/Getting up”, “Climbing stairs”, “Washing dishes”, “Reading book”, etc. Extended action: “Eating a meal”, “Preparing a meal”. Really extended action: “Change of mobility”, “Eating well”.
Good hard perceptual problems: From a perception standpoint, sensing in the Aware Home demands the solution to several classes of fundamental problems: sensing user state; understanding user activity; noticing variation over longer time scales (“trending”, “routines”). … a really good set for Computer Vision researchers. … but vision may not be (is not) enough. … a sensor fusion, sensor interpretation problem.
So what form of sensing? Typical and doable, but: “Grandma fell down and didn’t get up”. Why not: because if that is all you want to do, there are better, cheaper, more reliable ways (though the failure modes need to be designed well). Tracking is STILL HARD! Many other sensors can be pervasive but … If you have a vision infrastructure and basic primitive capabilities, every new task is not a re-engineering job. It can help in focusing on some important event (context).
Practical Indoor Sensing: RFID instrumentation. Floor mats. Below-knee tags. Room-level positioning. Can other sensing build on top of this?
Vision infrastructure: 20+ fixed cameras (analog and digital IEEE 1394). 16+ PIII PCs (2 cameras per PC). 8 pan-tilt-zoom cameras. Stereo and other special-purpose cameras.
Vision-based tracking methods: Background segmentation / modeling. Color histograms / segmentation. Template / appearance modeling and matching. Motion integration. Calibration, perspective modeling. Sensor fusion (between cameras and other sensors). Learning methods. Client-server architecture for distributed processing.
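As an illustration of the first item, background segmentation / modeling, here is a minimal sketch of a running-average background model with thresholded differencing. It is not taken from the Aware Home system; the function names, the blending rate `alpha`, and the `threshold` value are assumptions chosen for the example, and frames are plain nested lists of grayscale values.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the running-average background model.

    alpha is a hypothetical learning rate: larger values adapt faster.
    """
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


# Toy 2x3 grayscale frames: a bright object appears in the middle column.
background = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
frame = [[10, 200, 10], [10, 210, 10]]

mask = foreground_mask(background, frame)
background = update_background(background, frame)
```

The mask is computed before the model is updated, so a newly arrived object is flagged as foreground first and only gradually absorbed into the background.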
Tracking from ceiling sensors: A person is tracked and his activities are reported on the map.
Tracking from Above
Room mapping: 2D descriptions. Overlapping cameras.
The Gesture Pendant: Wear sensors looking outwards (1st vs. 2nd vs. 3rd person perspective). Simplified home control. Biometrics, biomedical, etc. Starner et al.
Eye/Pupil Tracking
Audio Sensors: Speech recognition. Augment interaction. Tracking / identification. Affect determination (anger, stress, sadness, happiness). Noise cancellation. Acoustic modeling.
Auditory Localization I: Phased array microphones. Localize a speaker and move a pan-tilt-zoom camera to their face. Microphone system. Vision can help with face tracking. Sensor fusion.
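To make the "localize a speaker and move a pan-tilt-zoom camera" step concrete, the sketch below converts an estimated speaker position into pan and tilt angles. It assumes, purely for illustration, that the camera and the speaker share one room coordinate frame in metres with z pointing up and that pan is measured from the x axis; `pan_tilt_to_target` is a hypothetical helper, not part of any camera API.

```python
import math


def pan_tilt_to_target(camera_pos, target_pos):
    """Return (pan, tilt) in degrees that point a camera at a target.

    Both positions are (x, y, z) tuples in the same room frame (assumed).
    """
    dx = target_pos[0] - camera_pos[0]
    dy = target_pos[1] - camera_pos[1]
    dz = target_pos[2] - camera_pos[2]
    pan = math.degrees(math.atan2(dy, dx))
    # Tilt is the elevation above (positive) or below (negative) horizontal.
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return pan, tilt


# Ceiling-mounted camera at 2.5 m, speaker's face at 1.5 m.
pan, tilt = pan_tilt_to_target((0.0, 0.0, 2.5), (2.0, 2.0, 1.5))
```

In a fused setup, the audio estimate would only give a coarse target; a face tracker could then refine the angles once the face is in view.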
Auditory Localization II: Adaptive array processing. Determine the Time Delay of Arrival (TDoA) to locate the source. 59-microphone array. Interaction with NIST (Vince Stanford).
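The TDoA idea can be sketched for a single pair of microphones: find the lag that maximises the cross-correlation of the two signals, then convert that delay to a bearing under a far-field assumption. This is a simplified stand-in, not the adaptive array processing used with the 59-microphone array; the function names, sample rate, and microphone spacing are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for room-temperature air


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples by which sig_b trails sig_a.

    The lag is chosen by brute-force cross-correlation over [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_from_delay(lag_samples, sample_rate, mic_spacing):
    """Far-field angle of arrival in radians, measured from broadside."""
    delay = lag_samples / sample_rate
    ratio = SPEED_OF_SOUND * delay / mic_spacing
    return math.asin(max(-1.0, min(1.0, ratio)))  # clamp against noise


# Toy signals: the same click reaches microphone B three samples later.
sig_a = [0.0] * 32
sig_b = [0.0] * 32
sig_a[10] = 1.0
sig_b[13] = 1.0

lag = estimate_delay(sig_a, sig_b, max_lag=5)
angle = bearing_from_delay(lag, sample_rate=16000, mic_spacing=0.2)
```

A positive lag means the sound reached microphone A first, so the source lies on A's side of the pair. With more microphones, several pairwise delays can be combined to estimate a position rather than a single bearing.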
Video-based Tracking Cameras
System Architecture [diagram; labels: Camera 1 (Fixed), Camera 2 (Fixed), Camera 3 (PTZ), Camera 4 (PTZ), Video, Calibrated Video, Color Tracking, Motion Tracking, Face Tracking, Face Recog., Beam Former, Auditory Localization, More Sensors, Locations, Room Manager]
Occupancy Grid
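The slide gives only the title, so the following is a generic sketch of what an occupancy grid is rather than a description of the system shown: the floor plan is divided into square cells and each location report increments the count of the cell it falls in. The room dimensions, cell size, and function names are assumptions for illustration.

```python
import math


def make_grid(width_m, depth_m, cell_m):
    """Create a rows x cols grid of zero counts covering the floor plan."""
    cols = math.ceil(width_m / cell_m)
    rows = math.ceil(depth_m / cell_m)
    return [[0] * cols for _ in range(rows)]


def mark_occupied(grid, x_m, y_m, cell_m):
    """Increment the cell containing (x_m, y_m); return its (row, col) or None."""
    row = math.floor(y_m / cell_m)
    col = math.floor(x_m / cell_m)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        grid[row][col] += 1
        return (row, col)
    return None  # report falls outside the mapped room


# Hypothetical 4 m x 3 m room with 0.5 m cells.
grid = make_grid(4.0, 3.0, 0.5)
cell = mark_occupied(grid, 1.2, 2.6, 0.5)
outside = mark_occupied(grid, 9.0, 1.0, 0.5)
```

Because every sensor only needs to report a floor position, reports from different cameras or other sensors can accumulate in the same grid.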
Combining Sensors: Map of the room, showing sensors and 2 residents in the room. Visual tracking of a resident. Visual identification of a resident. Paper appears in PUI 2001.
Multi-modal tracking
What Was I Cooking? Mynatt, Abowd, et al.
Video
Behavior Analysis: Detection (after ~10 trials per person)
Behavior     Accuracy
Low-Risk     92%
High-Risk    76%
Novice       100%
Expert       90%
Routine Activities: Routine activities share a set of component tasks. Identify a subset of tasks and measure the demand for the performance of such tasks. Model and predict successful and independent performance of an activity (Clark, Czaja, & Weber, 1990; Connell & Sanford, 1997; Sanford, Story, & Ringholz, 1998).
Routine Household Activities: Activities of Daily Living (ADLs) [dressing, bathing, etc.]. Instrumental Activities of Daily Living (IADLs) [house cleaning, laundry, cooking]. Enhanced Activities of Daily Living (EADLs). ADLs, IADLs, and EADLs can potentially be aided by Aware Environments.
Face of the House! P.S. Did some facial expression recognition earlier, ask Sandy.
Finally, the Context. We need it. How do we represent what: really the heart of the question. What is context? It helps define the target vocabulary of sensing and perception, and the input information for decision making. NEED experts. Software engineering: inflow, synchronization, storage, access, delivery (e.g. Context Toolkit, Abowd et al.).
Ethical Issues: These visions concern some people (as they should!). For example, with automated capture: who controls and distributes capture? What about the silent and intimidated minority? Educate and confront. Policy.
More! We are interested in building useful (important) “Living Laboratories” (and learning how to build them too). We will build, test, evaluate, and rebuild. “This Aware House.” See: Others: Gregory Abowd, Beth Mynatt, Wendy Rogers, Aaron Bobick, Thad Starner, many UG, MS, PhD students and Research Scientists.
“Blob” management between clients
All the same person?
Location Awareness of a resident is crucial! Claims of reliable location sensing are somewhat exaggerated. Vision can help (so can audio), but we need something reliable (24/7). Room-level accuracy is a major requirement.
Natural tasks for vision: Location refinement. Non-location-determined activity: “Couch potatoes”. Basic activity. Contextual triggers: “Preparing to leave the house”. Lots of potential features. Statistical characterization: “Slower going up the stairs this week than last”.
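The statistical characterization example, "slower going up the stairs this week than last", can be sketched as a comparison of two weeks of measured climb times. The durations below are made-up illustrative values, and the test (difference in means scaled by last week's standard deviation, flagged above a hypothetical threshold of 2) is one simple choice among many, not a method named in the slides.

```python
from statistics import mean, stdev


def weekly_change(last_week, this_week):
    """Return (difference in mean duration, difference in units of last week's spread)."""
    diff = mean(this_week) - mean(last_week)
    spread = stdev(last_week)
    z = diff / spread if spread > 0 else 0.0
    return diff, z


def is_slower(last_week, this_week, z_threshold=2.0):
    """Flag a slowdown only when it clearly exceeds last week's variability."""
    diff, z = weekly_change(last_week, this_week)
    return diff > 0 and z > z_threshold


# Illustrative stair-climb durations in seconds (fabricated for the example).
last_week = [9.8, 10.1, 10.0, 9.9, 10.2]
this_week = [11.0, 11.4, 10.9, 11.2, 11.5]

diff, z = weekly_change(last_week, this_week)
slower = is_slower(last_week, this_week)
```

Scaling by the baseline spread keeps ordinary day-to-day variation from triggering a flag, which matters when the goal is to notice a longer-term trend.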
A Video Study: 1. A video-based naturalistic protocol was developed to record and study routine activities (Sanford et al. 1997, 2000). Video of 28 participants (50-80) in their homes. 2. Analyze existing video and code specific task-related problems. Determine what codings can be represented and modeled computationally.
Keeping track of blobs: Overhead cameras. These are not plan-view cameras, but require mapping (calibrate if desired). A messy house is not a lab: much less control. Integrate according to 2.1D location (really foot location in plan view) by simple learning. The (dreaded) N-to-M problem: temporal integration on appearance; probabilistic assignment; finite look-ahead and look-behind.
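The N-to-M problem (N known tracks, M observed blobs) can be illustrated with a deliberately simplified matcher: score every track/blob pair by color-histogram intersection, accept pairs greedily from the best score down, and blend the matched blob into the track's appearance model. This greedy scheme stands in for the probabilistic assignment with look-ahead mentioned above and is not the system's actual method; all names, the `min_score` cutoff, and the three-bin histograms are assumptions.

```python
def histogram_intersection(h1, h2):
    """Similarity of two normalised histograms, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def assign_blobs(tracks, blobs, min_score=0.7):
    """Greedily map blob ids to track ids by appearance similarity.

    tracks and blobs are dicts of id -> normalised color histogram.
    Blobs scoring below min_score against every free track stay unassigned.
    """
    pairs = sorted(
        (
            (histogram_intersection(t_hist, b_hist), t_id, b_id)
            for t_id, t_hist in tracks.items()
            for b_id, b_hist in blobs.items()
        ),
        reverse=True,
    )
    assignment, used_tracks = {}, set()
    for score, t_id, b_id in pairs:
        if score < min_score:
            break
        if b_id in assignment or t_id in used_tracks:
            continue
        assignment[b_id] = t_id
        used_tracks.add(t_id)
    return assignment


def update_appearance(track_hist, blob_hist, rate=0.1):
    """Temporal integration: blend the matched blob into the track model."""
    return [(1 - rate) * t + rate * b for t, b in zip(track_hist, blob_hist)]


# Two known residents, three blobs (one is clutter or a new person).
tracks = {"resident_a": [0.7, 0.2, 0.1], "resident_b": [0.1, 0.2, 0.7]}
blobs = {
    "blob_1": [0.15, 0.25, 0.6],
    "blob_2": [0.6, 0.3, 0.1],
    "blob_3": [0.3, 0.4, 0.3],
}

assignment = assign_blobs(tracks, blobs)
for b_id, t_id in assignment.items():
    tracks[t_id] = update_appearance(tracks[t_id], blobs[b_id])
```

Leaving `blob_3` unassigned rather than forcing a match is the point of the cutoff: in a messy house, extra blobs are expected, and a wrong identity is worse than a missing one.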