1 Scene Understanding perception, multi-sensor fusion, spatio-temporal reasoning and activity recognition. Francois BREMOND PULSAR project-team, INRIA Sophia Antipolis, FRANCE Key words: Artificial intelligence, knowledge-based systems, cognitive vision, human behavior representation, scenario recognition

2 Video Understanding: Performance Evaluation (V. Valentin, R. Ma)
ETISEO: French initiative for algorithm validation and knowledge acquisition.
Approach: 3 critical evaluation concepts
- Selection of test video sequences
  - Follow a specified characterization of problems
  - Study one problem at a time, at several levels of difficulty
  - Collect long sequences for significance
- Ground truth definition
  - Up to the event level
  - Give clear and precise instructions to the annotator, e.g., annotate both the visible and the occluded parts of objects
- Metric definition
  - Set of metrics for each video processing task
  - Performance indicators: sensitivity and precision
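The two performance indicators above reduce to the usual true-positive / false-negative / false-positive counts; a minimal sketch (function names are illustrative, not from the ETISEO tools):

```python
# Hedged sketch of the ETISEO-style performance indicators.
# Sensitivity (recall) = TP / (TP + FN); Precision = TP / (TP + FP).

def sensitivity(tp: int, fn: int) -> float:
    """Fraction of ground-truth items the algorithm detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(tp: int, fp: int) -> float:
    """Fraction of the algorithm's detections that match a ground-truth item."""
    return tp / (tp + fp) if (tp + fp) else 0.0
```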

3 Evaluation: current approach (A.T. Nghiem)
ETISEO limitations:
- Selection of video sequences according to difficulty levels is subjective
- Generalization of evaluation results is subjective: one video sequence may contain several video processing problems at many difficulty levels
Approach: treat each video processing problem separately
- Define a measure to compute the difficulty level of input data (e.g. video sequences)
- Select video sequences containing only the current problem, at various difficulty levels
- For each algorithm, determine the highest difficulty level at which the algorithm still has acceptable performance
Approach validation: applied to two problems
- Detect weakly contrasted objects
- Detect objects mixed with shadows
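The slide leaves the difficulty measure abstract; as one hedged illustration for the weak-contrast problem, a scalar difficulty level could be derived from object/background contrast. This formula is an assumption for illustration, not the actual measure used in the evaluation:

```python
# Illustrative sketch: a "difficulty level" for the weak-contrast problem,
# taken here as the inverse of the mean absolute intensity difference
# between object pixels and nearby background pixels (8-bit images assumed).

def contrast_difficulty(object_pixels, background_pixels):
    mean_obj = sum(object_pixels) / len(object_pixels)
    mean_bg = sum(background_pixels) / len(background_pixels)
    contrast = abs(mean_obj - mean_bg)   # in [0, 255]
    return 1.0 - contrast / 255.0        # 0 = easy, 1 = hardest
```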

4 Evaluation: conclusion
- A new evaluation approach to generalise evaluation results, implemented for 2 problems.
- Limitation: it only detects the upper bound of an algorithm's capacity. The difference between the upper bound and the real performance may be significant if:
  - the test video sequence contains several video processing problems;
  - the same set of parameters is tuned differently to adapt to several concurrent problems.
Ongoing evaluation campaigns: PETS at ECCV2008, TRECVid (NIST) with iLids videos.
Benchmarking databases:

5 Video Understanding: Program Supervision

6 Supervised Video Understanding: Proposed Approach
Goal: easy creation of reliable supervised video understanding systems.
Approach:
- Use of a supervised video understanding platform: a reusable software tool composed of three separate components (program library, control, knowledge base)
- Formalize a priori knowledge of video processing programs
- Make the control of video processing programs explicit
Issues:
- Video processing programs which can be supervised
- A friendly formalism to represent knowledge of programs
- A general control engine to implement different control strategies
- A learning tool to adapt system parameters to the environment

7 Proposed Approach
[Architecture diagram: a control component linking three knowledge bases (application domain, scene environment, video processing programs), fed by an application domain expert and a video processing expert, together with learning and evaluation modules, a video processing program library, and the particular system being built.]

8 Supervised Video Understanding Platform: Operator Formalism
Use of an operator formalism [Clément and Thonnat, 93] to represent knowledge of video processing programs, composed of frames and production rules.
- Frames: declarative knowledge
  - Operators: abstract models of video processing programs
    - primitive: a particular program
    - composite: a particular combination of programs
- Production rules: inferential knowledge
  - Choice and optional criteria
  - Initialization criteria
  - Assessment criteria
  - Adjustment and repair criteria

9 Program Supervision: Knowledge and Reasoning
Primitive operator: functionality, characteristics, input data, parameters, output data, preconditions, postconditions, effects, calling syntax.
Its rule bases: parameter initialization rules, parameter adjustment rules, result evaluation rules, repair rules.
Composite operator: functionality, characteristics, input data, parameters, output data, preconditions, postconditions, effects; decomposition into suboperators (sequential, parallel, alternative); data flow.
Its rule bases: parameter initialization rules, parameter adjustment rules, choice rules, result evaluation rules, repair rules.
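The frame-based operator knowledge above can be rendered as data structures; the field names follow the slide, but the class layout is an assumption, not the actual platform code:

```python
# Hedged sketch: primitive and composite operators as dataclasses.
from dataclasses import dataclass, field

@dataclass
class Operator:
    functionality: str
    input_data: list = field(default_factory=list)
    parameters: dict = field(default_factory=dict)
    preconditions: list = field(default_factory=list)   # predicates on input data
    rules: dict = field(default_factory=dict)           # init/adjust/evaluation/repair rule bases

@dataclass
class CompositeOperator(Operator):
    suboperators: list = field(default_factory=list)    # sequential/parallel/alternative decomposition
    choice_rules: list = field(default_factory=list)    # select among alternative suboperators
```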

10 Video Understanding: Learning Parameters (B. Georis)
Objective: a learning tool to automatically tune algorithm parameters with experimental data; used here for learning the segmentation parameters with respect to the illumination conditions.
Method:
- Identify a set of parameters of a task: 18 segmentation thresholds depending on an environment characteristic, the image intensity histogram
- Study the variability of the characteristic: histogram clustering -> 5 clusters
- Determine optimal parameters for each cluster: optimization of the 18 segmentation thresholds
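The "histogram clustering -> 5 clusters" step can be sketched with a small k-means over per-frame intensity histograms; using k-means here is an assumption, since the slide does not name the clustering algorithm:

```python
# Hedged sketch: cluster image intensity histograms (one 256-bin vector per
# frame); at run time, a frame's histogram selects which of the learned
# threshold sets to apply.
import numpy as np

def kmeans_histograms(hists, k=5, iters=20, seed=0):
    """hists: (n_frames, 256) array of normalized intensity histograms."""
    rng = np.random.default_rng(seed)
    centers = hists[rng.choice(len(hists), k, replace=False)]
    for _ in range(iters):
        # assign each histogram to its nearest cluster center
        labels = np.argmin(((hists[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = hists[labels == j].mean(0)
    return labels, centers
```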

11 Video Understanding: Learning Parameters Camera View

12 Learning Parameters: Clustering the Image Histograms
[Figure: image histograms stacked along a time axis; each X-Z slice represents one image histogram (pixel intensity [0-255] vs. number of pixels [%]); each of the 5 clusters receives its own optimal thresholds ß_i^opt1 ... ß_i^opt5.]

13 Video Understanding: Knowledge Discovery (E. Corvee, JL. Patino_Vilchis)
CARETAKER: an FP6 IST European initiative to provide an efficient tool for the management of large multimedia collections. Applications to surveillance and safety issues in urban/environment planning, resource optimization, and disabled/elderly person monitoring. Currently being validated on large underground video recordings (Torino, Roma).
Processing chain: multiple audio/video sensors -> audio/video acquisition and encoding (raw data) -> generic event recognition (primitives, simple events) -> knowledge discovery (complex events, event and meta data).

14 Event detection examples

15 Data Flow: Object/Event Detection, Information Modelling
- Object detection -> mobile object table: id, type, 2D info, 3D info
- Event detection -> event table: id, type (inside_zone, stays_inside_zone), involved mobile object, involved contextual object
- Contextual object table

16 Table Contents
- Mobile objects: people characterised by their trajectory, their shape, the significant events in which they are involved, ...
- Contextual objects: interactions found between mobile objects and contextual objects (interaction type, time, ...)
- Events: model of the normal activities in the metro station (event type, involved objects, time, ...)

17 Knowledge Discovery: trajectory clustering
Objective: clustering of trajectories into k groups to match people activities.
Feature set: entry and exit points of an object; direction, speed, duration, ...
Clustering techniques: Agglomerative Hierarchical Clustering, K-means, Self-Organizing (Kohonen) Maps.
Evaluation of each cluster set based on ground truth.
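A hedged sketch of such a trajectory feature vector (entry/exit points plus direction, speed and duration, as listed above); the exact field order and the frame-rate default are assumptions:

```python
# Illustrative trajectory feature extraction for clustering.
import math

def trajectory_features(points, fps=25.0):
    """points: list of (x, y) positions, one per frame."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    duration = len(points) / fps                      # seconds
    length = sum(math.dist(points[i], points[i + 1])  # total path length
                 for i in range(len(points) - 1))
    direction = math.atan2(y1 - y0, x1 - x0)          # overall heading
    speed = length / duration if duration else 0.0
    return [x0, y0, x1, y1, direction, speed, duration]
```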

18 Trajectory: Clustering Methods
Feature vector: entry point (x_entry, y_entry) and key points m_1, m_2, ..., m_k, ..., m_K along the trajectory.
Parameter tuning: which features?

19 Trajectory: Clustering
Agglomerative clustering. Parameter tuning: which distance function?
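Since the slide asks "which distance function?", here is a small bottom-up agglomerative clustering sketch with a pluggable distance; single-linkage merging is an assumption, as the slide does not specify the linkage:

```python
# Hedged sketch: agglomerative clustering of trajectory feature vectors.
import math

def agglomerative(vectors, n_clusters, dist=math.dist):
    clusters = [[v] for v in vectors]
    while len(clusters) > n_clusters:
        # find the pair of clusters with the smallest single-linkage distance
        (i, j) = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: min(dist(a, b) for a in clusters[p[0]] for b in clusters[p[1]]),
        )
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters
```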

20 Results on Torino subway (45min), 2052 trajectories

21 Trajectory: Analysis
[Cluster visualisations for SOM, K-means and agglomerative clustering; some groups show mixed overlap.]

22 Trajectory: Semantic characterisation
SOM cluster 14 / K-means cluster 12 / agglomerative cluster 21: consistency of clusters between algorithms. Semantic meaning: walking towards the vending machines.

23 Trajectory: Analysis
Intraclass & interclass variance: the SOM algorithm has the lowest intraclass variance and the highest interclass separation.
Parameter tuning: which clustering technique?
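The intraclass/interclass criterion used to compare the clustering techniques can be sketched with the standard variance formulas; these exact formulas are an assumption, since the slide does not give them:

```python
# Hedged sketch: cluster-quality criterion. Intraclass variance measures the
# spread within clusters (lower is better); interclass variance measures the
# spread of cluster centroids around the global mean (higher is better).
import numpy as np

def class_variances(clusters):
    """clusters: list of (n_i, d) arrays of feature vectors."""
    centroids = np.array([c.mean(0) for c in clusters])
    overall = np.concatenate(clusters).mean(0)
    intra = np.mean([((c - c.mean(0)) ** 2).sum(1).mean() for c in clusters])
    inter = ((centroids - overall) ** 2).sum(1).mean()
    return intra, inter
```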

24 Trajectory: Analysis
- Video features are modeled in three different tables, with topological and temporal relations, for quantitative and semantic description.
- Trajectory clustering gives information about frequent entry/exit zones, density of occupation and behavior characterization.
- Meaningful trajectory clusters are validated by their consistency across different algorithms.

25 Mobile Objects

26 Mobile Object Analysis
Building statistics on objects: there is an increase in the number of people after 6:45.

27 Contextual Object Analysis: Vending Machines
With the increase in the number of people, there is an increase in the use of the vending machines (vending machine 1 and vending machine 2).

28 Contextual Object Analysis Gates

29 Analysis: Use of the Gates
Gates 7 to 9 (right side) are the most used; gates 1 to 3 (left side) are the least used.

30 Results: Trajectory Clustering

31 Knowledge Discovery: achievements
Semantic knowledge extracted by the off-line long-term analysis of on-line interactions between moving objects and contextual objects:
- 70% of people are coming from the north entrance
- Most people spend 10 sec in the hall
- 64% of people go directly to the gates without stopping at the ticket machine
- At rush hours people are 40% quicker to buy a ticket, ...
Issues:
- At which level(s) should clustering techniques be designed: low level (image features), middle level (trajectories, shapes) or high level (primitive events)?
- To learn what: visual concepts, scenario models?
- Uncertainty (noise/outliers/rare events): what are the activities of interest?
- Parameter tuning (e.g. distance, clustering technique) and performance evaluation (criteria, ground truth).
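One of the statistics above ("70% of people are coming from north entrance") could be derived from the mobile-object table by a simple aggregation; the record layout used here is an assumption, not the CARETAKER schema:

```python
# Hedged sketch: share of trajectories entering through a given zone.

def entrance_share(trajectories, entrance="north"):
    """trajectories: list of dicts with an 'entry_zone' field."""
    hits = sum(1 for t in trajectories if t["entry_zone"] == entrance)
    return hits / len(trajectories)
```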

32 Video Understanding: Learning Scenario Models (A. Toshev), or Frequent Composite Event Discovery in Videos (event time series)

33 Learning Scenarios: Motivation
Why unsupervised model learning in video understanding? Complex models containing many events, a large variety of models, and different parameters for different models: the learning of models should therefore be automated.
Example application: video surveillance of a parking lot.

34 Learning Scenarios: Problem Definition
Input: a set of primitive events from the vision module, e.g. object-inside-zone(Vehicle, Entrance) [5,16].
Output: frequent event patterns. A pattern is a set of events:
object-inside-zone(Vehicle, Road) [0, 35]
object-inside-zone(Vehicle, Parking_Road) [36, 47]
object-inside-zone(Vehicle, Parking_Places) [62, 374]
object-inside-zone(Person, Road) [314, 344]
Goals: automatic data-driven modeling of composite events; reoccurring patterns of primitive events correspond to frequent activities; find classes with large size & similar patterns.
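The input primitive events can be represented directly after the notation above, object-inside-zone(Vehicle, Entrance) [5,16]; the field names in this sketch are assumptions:

```python
# Hedged sketch of the primitive-event representation used as input.
from dataclasses import dataclass

@dataclass(frozen=True)
class PrimitiveEvent:
    name: str   # e.g. "object-inside-zone"
    obj: str    # e.g. "Vehicle"
    zone: str   # e.g. "Entrance"
    start: int  # interval start (frame or second)
    end: int    # interval end

e = PrimitiveEvent("object-inside-zone", "Vehicle", "Entrance", 5, 16)
```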

35 Learning Scenarios: A PRIORI Method
Approach: an iterative method from data mining for efficient frequent-pattern discovery in large datasets.
A PRIORI property: sub-patterns of frequent patterns are also frequent (Agrawal & Srikant, 1995). At the i-th step, consider only i-patterns which have frequent (i-1)-sub-patterns: the search space is thus pruned.
A PRIORI property for activities represented as classes: size(C_{m-1}) >= size(C_m), where C_m is a class containing patterns of length m and C_{m-1} is a sub-activity of C_m.

36 Learning Scenarios: A PRIORI Method Merge two i-patterns with (i-1) primitive events in common to form an (i+1)-pattern:
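The merge step above can be sketched for patterns represented as plain sets of events (ignoring the temporal structure, which the full method also compares):

```python
# Hedged sketch of A PRIORI candidate generation: two i-patterns sharing
# i-1 primitive events merge into one (i+1)-pattern; by the A PRIORI
# property, only such merges can yield frequent candidates.

def merge_patterns(p, q):
    """p, q: frozensets of primitive events of equal size i."""
    if len(p) == len(q) and len(p & q) == len(p) - 1:
        return p | q  # an (i+1)-candidate
    return None       # not mergeable
```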

37 Learning Scenarios: Similarity Measure
Two types of similarity measure between event patterns: similarities between event attributes, and similarities between pattern structures.
A generic similarity measure should use generic properties when possible (for easy usage in different domains), but should also incorporate domain-dependent properties (for relevance to the concrete application).

38 Learning Scenarios: Attribute Similarity
Attributes: the corresponding events in two patterns should have similar (same) attributes (duration, names, object types, ...). Comparison is made between corresponding events (same type, same color in the figure).
For numeric attributes: G(x,y) =
attr(p_i, p_j) = average of all event attribute similarities.
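The transcript cuts off the definition of G(x, y); as a stand-in, one common choice (an assumption, not necessarily the measure used in this work) is a relative-difference similarity in [0, 1], averaged over corresponding event attributes as the slide describes:

```python
# Hedged sketch of a numeric attribute similarity and its average over
# corresponding events; the G(x, y) formula here is assumed, not the original.

def numeric_similarity(x: float, y: float) -> float:
    """1 when x == y, decaying towards 0 as the relative difference grows."""
    return 1.0 - abs(x - y) / max(abs(x), abs(y), 1e-9)

def attribute_similarity(events_a, events_b):
    """events_a, events_b: aligned lists of numeric attributes (e.g. durations)."""
    sims = [numeric_similarity(x, y) for x, y in zip(events_a, events_b)]
    return sum(sims) / len(sims)
```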

39 Learning Scenarios: Evaluation
Test data: video surveillance of a parking lot; 4 hours of recordings from 2 days, in 2 test sets; every test set contains approx. 100 primitive events.
Results: in both test sets the following event pattern was recognized:
object-inside-zone(Vehicle, Road)
object-inside-zone(Vehicle, Parking_Road)
object-inside-zone(Vehicle, Parking_Places)
object-inside-zone(Person, Parking_Road)

42 Learning Scenarios: Evaluation
The recognized pattern corresponds to a parking maneuver ("Maneuver Parking!").

43 Learning Scenarios: Conclusion & Future Work
Conclusion: application of a data mining approach; handling of uncertainty without losing computational effectiveness; a general framework in which only a similarity measure and a primitive event library must be specified.
Future work: other similarity measures; handling of further aspects of uncertainty; qualification of the learned patterns (is frequent equal to interesting?); different applications, with different event libraries or features.

44 HealthCare Monitoring (N. Zouba)
GERHOME (CSTB, INRIA, CHU Nice): addressing the ageing population.
Approach: multi-sensor analysis based on sensors embedded in the home environment, to
- detect in real time any alarming situation;
- identify a person's profile (his/her usual behaviors) from the global trends of life parameters, and then detect any deviation from this profile.

45 Monitoring of Activities of Daily Living for the Elderly
Goal: increase independence and quality of life:
- Enable the elderly to live longer in their preferred environment.
- Reduce costs for public health systems.
- Relieve family members and caregivers.
Approach:
- Detecting alarming situations (e.g. falls).
- Detecting changes in behavior (missing activities, disorder, interruptions, repetitions, inactivity).
- Calculating the degree of frailty of elderly people.
Example of normal activity: meal preparation in the kitchen (11h-12h); eating in the dining room (12h-12h30); resting, TV watching in the living room (13h-16h); ...

46 Gerhome laboratory
GERHOME (Gerontology at Home): a homecare laboratory. Experimental site in CSTB (Centre Scientifique et Technique du Bâtiment) at Sophia Antipolis. Partners: INRIA, CSTB, CHU-Nice, Philips-NXP, CG06, ...

47 Gerhome laboratory
Position of the sensors in the Gerhome laboratory:
- Video cameras installed in the kitchen and in the living-room, to detect and track the person in the apartment.
- Contact sensors mounted on many devices, to determine the person's interactions with them.
- Presence sensors installed in front of the sink and the cooking stove, to detect the presence of people near them.

48 Sensors installed in the Gerhome laboratory
In the kitchen and living-room: a video camera in the living-room, a pressure sensor underneath the legs of the armchair, a contact sensor in the window, a contact sensor in the cupboard door.

49 Event modelling
We have modelled a set of activities by using an event recognition language developed in our team. This is an example for the "Meal preparation" event:

Composite Event (Prepare_meal_1,  "detected by a video camera combined with contact sensors"
  Physical Objects ((p: Person), (Microwave: Equipment), (Fridge: Equipment), (Kitchen: Zone))
  Components (
    (p_inz: PrimitiveState Inside_zone (p, Kitchen))         "detected by video camera"
    (open_fg: PrimitiveEvent Open_Fridge (Fridge))           "detected by contact sensor"
    (close_fg: PrimitiveEvent Close_Fridge (Fridge))         "detected by contact sensor"
    (open_mw: PrimitiveEvent Open_Microwave (Microwave))     "detected by contact sensor"
    (close_mw: PrimitiveEvent Close_Microwave (Microwave)))  "detected by contact sensor"
  Constraints ((open_fg during p_inz) (open_mw before_meet open_fg) (open_fg Duration >= 10) (open_mw Duration >= 5))
  Action (AText ("Person prepares meal") AType ("NOT URGENT")))
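The interval constraints of the Prepare_meal_1 model can be checked on recognized sub-events as follows; this checker is only illustrative (Allen-style "during"/"before/meet" relations plus duration thresholds), not the team's actual event-recognition engine:

```python
# Hedged sketch: checking the temporal constraints of a composite event
# on (start, end) intervals of its recognized sub-events.

def during(inner, outer):
    """Allen's 'during': inner strictly contained in outer."""
    return outer[0] < inner[0] and inner[1] < outer[1]

def before_meet(a, b):
    """a ends before (or exactly when) b starts: Allen's 'before' or 'meets'."""
    return a[1] <= b[0]

def prepare_meal(p_inz, open_fg, open_mw):
    """Intervals for Inside_zone(Kitchen), Open_Fridge, Open_Microwave."""
    return (during(open_fg, p_inz)
            and before_meet(open_mw, open_fg)
            and open_fg[1] - open_fg[0] >= 10
            and open_mw[1] - open_mw[0] >= 5)
```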

50 Multi-sensor monitoring: results and evaluation
We have studied and tested a range of activities in the Gerhome laboratory, such as using the microwave, using the fridge, opening the cupboard and preparing a meal. We have validated and visualized the recognized events with a 3D visualization tool.
[Results table: per activity (open microwave, open fridge, open cupboard, preparing meal) and per location (in the kitchen, in the living-room): number of videos, number of events, TP, FN, FP, precision, sensitivity.]

51 Recognition of the "Prepare meal" event
Visualization of a recognized event in the Gerhome laboratory: the person is recognized with the posture "standing with one arm up", "located in the kitchen" and "using the microwave".

52 Recognition of the "Resting in living-room" event
Visualization of a recognized event in the Gerhome laboratory: the person is recognized with the posture "sitting in the armchair" and "located in the living-room".

53 End-users
There are several end-users in homecare:
- Doctors (gerontologists): frailty measurement (depression, ...); alarm detection (falls, gas, dementia, ...).
- Caregivers and nursing homes: cost reduction (no false alarms, reduced employee involvement); employee protection.
- Persons with special needs, including young children, disabled and elderly people: feeling safe at home; autonomy (at night, lighting up the way to the bathroom); improving life (smart mirror; a summary of the user's day, week or month in terms of walking distance, TV, water consumption).
- Family members and relatives: elderly safety and protection; social connectivity.

54 Social problems and solutions
- Problem: privacy, confidentiality and ethics of video (and other data) recording, processing and transmission. Solution: no video recording or transmission, only textual alarms.
- Problem: acceptability for the elderly. Solution: user empowerment.
- Problem: usability. Solution: an easy, ergonomic interface (no keyboard, large screen) and friendly usage of the system.
- Problem: cost effectiveness. Solution: the right service for the right price, with a large variety of solutions.
- Problem: legal issues, no certification. Solution: robustness, benchmarking, on-site evaluation.
- Problem: installation, maintenance, training, interoperability with other home devices. Solution: adaptability, X-Box integration, wireless, standards (OSGi, ...).
- Problem: research financing? France (no money, lobbies), Europe (delays), US, Asia.

55 Conclusion
A global framework for building video understanding systems.
Hypotheses: mostly fixed cameras; a 3D model of the empty scene; predefined behavior models.
Results:
- Real-time video understanding systems for individuals, groups of people, vehicles, crowds, or animals ...
- Knowledge structured within the different abstraction levels (i.e. processing worlds):
  - formal description of the empty scene
  - structures for algorithm parameters
  - structures for object detection rules, tracking rules, fusion rules, ...
  - operational language for event recognition (more than 60 states and events), video event ontology
- Tools for knowledge management: metrics, tools for performance evaluation and learning, parsers, formats for data exchange, ...

56 Conclusion: perspectives
- Object and video event detection: finer human shape description (gesture models); video analysis robustness (reliability computation).
- Knowledge acquisition: design of learning techniques to complement a priori knowledge (visual concept learning, scenario model learning).
- System reusability: use of program supervision techniques (dynamic configuration of programs and parameters).
- Scaling issue: managing large networks of heterogeneous sensors (cameras, microphones, optical cells, radars, ...).