BMVA CANTATA – INRIA, December 12, 2007 page 1 JL-1 Content Aware Networked systems Towards Advanced and Tailored Assistance BMVA 2007 December 12, 2007.


1 BMVA CANTATA – INRIA, December 12, 2007 page 1 JL-1 Content Aware Networked systems Towards Advanced and Tailored Assistance BMVA 2007 December 12, 2007 Francois Bremond – INRIA sophia CANTATA

2 CANTATA Introduction: Problem statement
2-year ITEA project, ending in December 2008
Large amounts of data for transfer and interpretation
3 MCA challenges
- Surveillance
- Consumer applications
- Medical
Solution
- High-level descriptions by means of content analysis
- Retrieval by intelligent indexing

3 CANTATA Introduction: Long-term vision
Develop systems
- That are aware of the content and understand it
- That apply this knowledge to establish an action or autonomously control the environment
- That will be a "virtual specialist", applying the knowledge to assist the decision-making security officer
Challenges
- Video content models for robust analysis and reasoning
- Self-learning and context awareness for faithful system performance
- Performance quantification of MCA for objective evaluation (standard dataset, well-defined metrics)

4 CANTATA: the scope
The CANTATA platform

5 WP4 Validation and classification: Objective
Validation chain
This work package aims at defining an overall objective validation framework that covers the various aspects of MCA systems.

6 WP4 Validation and classification: Organisation
Work Package leader: Barco
- Task 4.1 State of the art: Inria
- Task 4.2 Requirements: VDG Security
- Task 4.3 Creation of datasets: Kingston University
- Task 4.4 Annotation tool: Traficon
- Task 4.5 Ground truth: Multitel
- Task 4.6 Validation metrics: Philips medical
- Task 4.7 Publication of validation: Codasystem
Other partners: Acic, UPF, IBBT, Philips research

7 This work gives an overview of projects in performance evaluation and of the proposed datasets.

8 Performance evaluation: creation of a web page with existing video datasets
Topics:
- Surveillance
- Consumer applications
- Medical
Content:
- Website: webpage link (if any)
- Description of Dataset: content, size, etc.
- Description of Ground Truth/Metadata: (if any)
- Contextual info: environment conditions (calibration, scene...)
- Results from metrics and ground truth
- Comments
- Information on Copyright: licence, cost, etc.
- Contact person from Cantata: contact person to get more info

9 ETISEO
Topic: Surveillance
Website: http://www-sop.inria.fr/orion/ETISEO/
Description of Dataset: 86 video clips. These sequences constitute a representative panel of different video surveillance areas. They merge indoor and outdoor scenes: corridors, streets, building entries, subway stations... They also mix different types of sensors and complexity levels.
Description of Ground Truth/Metadata: 5 different levels: Object Detection, Object Localization, Object Tracking, Object Classification.
Contextual info: zone of interest, calibration matrix
Results from metrics and ground truth: bounding box, object class, events
Comments:
Information on Copyright: Free download, but registration and a user agreement are required.
Contact person from Cantata: francois.bremond@sophia.inria.fr

10 PETS 2001
Topic: Surveillance
Website: http://www.cvg.cs.rdg.ac.uk/PETS2001/pets2001-dataset.html
Description of Dataset: Outdoor people and vehicle tracking (two synchronised views).
Description of Ground Truth/Metadata: Tracking information on the image plane and ground plane can be found at: http://www.cvg.cs.rdg.ac.uk/PETS2001/ANNOTATION/
Contextual info: Camera calibration provided
Results from metrics and ground truth: Centroid and bounding box coordinates on the image plane, object class (person, vehicle, other), position on the ground plane and object orientation.
Information on Copyright: Free download from website
Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk

11 PETS 2002 - VISOR BASE: Moving People
Topic: Surveillance
Website: http://www.cvg.cs.rdg.ac.uk/PETS2002/pets2002-db.html
Description of Dataset: Indoor people tracking (and counting). Two training and four testing sequences consist of people moving in front of a shop window. Sequences are provided both in MPEG movie format and as individual JPEG images.
Description of Ground Truth/Metadata: People tracking, counting and activity recognition.
Contextual info: No calibration
Results from metrics and ground truth:
- How many people are passing in front of the shop window
- How many people stop and look into the window
- How many people are looking into the window at each instant (frame) in time
- The trajectories of people passing in front of the store
- The time spent per frame (processing time): a histogram of the microseconds spent processing each frame
Information on Copyright: Free download from website
Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk

12 PETS-ICVS'2003 - FGnet
Topic: Surveillance
Website: http://www.cvg.cs.rdg.ac.uk/PETS-ICVS/pets-icvs-db.html
Description of Dataset: Smart meeting, including facial expressions, gaze and gesture/action. The environment consists of three cameras: one mounted on each of two opposing walls, and an omnidirectional camera positioned at the centre of the room. The dataset consists of four scenarios.
Description of Ground Truth/Metadata:
a) Eye positions of people in Scenarios A, B and D (every 10th frame is annotated).
b) Facial expression and gaze estimation for Scenarios A and D, Cameras 1-2.
c) Gesture/action annotations for Scenarios B and D, Cameras 1-2.
Contextual info: Camera calibration provided.
Results from metrics and ground truth: For each frame, the requirement is to perform: face localisation (centre location of eyes), recognition of facial expression, recognition of face/hand gesture, estimation of face/head direction (gaze), recognition of actions.
Information on Copyright: Free download
Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk

13 PETS-ECCV'2004 - CAVIAR
Topic: Surveillance
Website: http://groups.inf.ed.ac.uk/vision/CAVIAR/CAVIARDATA1/ or http://www-prima.inrialpes.fr/PETS04/caviar_data.html
Description of Dataset: People walking alone, meeting with others, window shopping, fighting, passing out, and leaving a package in a public place. All video clips were filmed with a wide-angle camera lens. The resolution is half-resolution PAL standard (384 x 288 pixels, 25 frames per second), compressed using MPEG2. The file sizes are about 10 MB.
Description of Ground Truth/Metadata: Person/Group Tracking, Person/Group Activity Recognition, Scenario/Situation Recognition
Contextual info: 3D coordinates of points for calibration purposes provided.
Results from metrics and ground truth: For each frame and object/group: bounding box and behaviour label. Also, for each frame, labels for situations/scenarios for the whole image.
Information on Copyright: Free download from website. If you publish results using the data, please acknowledge the data as coming from the EC Funded CAVIAR project/IST 2001 37540, found at URL: http://www.dai.ed.ac.uk/homes/rbf/CAVIAR/
Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk

14 PETS'2006 - ISCAPS
Topic: Surveillance
Website: http://pets2006.net/
Description of Dataset: Surveillance of public spaces, detection of left-luggage events. Scenarios of increasing complexity, captured using multiple sensors.
Description of Ground Truth/Metadata: XML files. Calibration parameters are given in the sub-directory 'calibration'; configuration and ground-truth information are also provided.
Contextual info: Calibration provided.
Results from metrics and ground truth: The radii distances, luggage location, warning/alarm triggers, etc.
Information on Copyright: Free download from website. The UK Information Commissioner has agreed that the PETS 2006 datasets described here may be made publicly available for the purposes of academic research. The video sequences are copyright ISCAPS consortium and permission is hereby granted for free download for the purposes of the PETS 2006 workshop.
Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk

15 PETS'2007 - REASON
Topic: Surveillance
Website: http://pets2007.net/
Description of Dataset: The datasets are multisensor sequences containing the following 3 scenarios, with increasing scene complexity:
1. loitering
2. attended luggage removal (theft)
3. unattended luggage
Description of Ground Truth/Metadata: Event detection
Contextual info: Calibration provided
Results from metrics and ground truth: Event details (type, location, time)
Information on Copyright: Free download from website. The UK Information Commissioner has agreed that the PETS 2007 datasets described here may be made publicly available for the purposes of academic research. The video sequences are copyright UK EPSRC REASON Project consortium and permission is hereby granted for free download for the purposes of the PETS 2007 workshop.
Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk

16 Level Crossing
Topic: Surveillance
Website: http://www.multitel.be/~va/selcat/
Description of Dataset: These datasets are composed of 24 hours of real sequences showing a level crossing (LC) where some vehicles stop because of its particular configuration: on the right side of the LC there is an avenue parallel to it, so a traffic light is located just after the LC. Consequently, vehicles sometimes stop on the LC because of this traffic light. The total amount of data is about 7 gigabytes.
Description of Ground Truth/Metadata: For each video file, there is a corresponding ground truth file in XML that gives the timestamps of "stopped vehicles" events.
Contextual info: environment conditions (calibration, scene...)
Contact person from Cantata: Caroline Machy, machy@multitel.be

17 SPEVI: Single face dataset
Topic: Surveillance
Website: www.spevi.org
Description of Dataset: This is a dataset for single person/face visual detection and tracking. The dataset is composed of five sequences with different illumination conditions and resolutions.
Description of Ground Truth/Metadata: The ground truth data is available in the .zip files for the sequences motinas_toni and motinas_emilio_webcam. In the ground truth files, each line of text describes the objects' position and size in a frame. The syntax of a line is the following:
frame number_of_objects obj_1_name x y half_width half_height angle obj_2_name x y half_width half_height angle ...
Information on Copyright: Requested citation acknowledgment: E. Maggio, A. Cavallaro, "Hybrid particle filter and mean shift tracker with adaptive transition model", in Proc. of IEEE Int. Conference on Acoustics, Speech and Signal Processing (ICASSP 2005), Philadelphia, 19-23 March 2005, pp. 221-224.
Contact person from Cantata: Xavier Desurmont, desurmont@multitel.be
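The line syntax quoted on the SPEVI slide can be read with a few lines of Python. The following is only a minimal sketch: it assumes fields are separated by whitespace and that the five values after each object name are numeric. The names `TrackedObject` and `parse_ground_truth_line`, as well as the sample line in the usage note, are hypothetical and do not come from the dataset or its tools.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Each object contributes six tokens: name, x, y, half_width, half_height, angle.
FIELDS_PER_OBJECT = 6


@dataclass
class TrackedObject:
    """One annotated object in a frame (hypothetical container type)."""
    name: str
    x: float
    y: float
    half_width: float
    half_height: float
    angle: float


def parse_ground_truth_line(line: str) -> Tuple[int, List[TrackedObject]]:
    """Parse 'frame number_of_objects obj_1_name x y half_width half_height angle ...'.

    Assumption: tokens are whitespace-separated and the geometry values are numeric.
    """
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("line must contain at least a frame number and an object count")
    frame = int(tokens[0])
    count = int(tokens[1])
    expected = 2 + count * FIELDS_PER_OBJECT
    if len(tokens) != expected:
        raise ValueError(f"expected {expected} tokens, found {len(tokens)}")

    objects = []
    for i in range(count):
        start = 2 + i * FIELDS_PER_OBJECT
        name = tokens[start]
        x, y, half_width, half_height, angle = (
            float(t) for t in tokens[start + 1:start + FIELDS_PER_OBJECT]
        )
        objects.append(TrackedObject(name, x, y, half_width, half_height, angle))
    return frame, objects
```

For example, calling `parse_ground_truth_line("12 2 face_a 100 80 20 25 0.1 face_b 300 90 18 22 -0.2")` on this made-up line yields frame 12 and two objects. Whether the real files store the angle in degrees or radians is not stated on the slide, so the sketch leaves the value uninterpreted.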

18 SPEVI: Multiple faces dataset
Topic: Surveillance
Website: www.spevi.org
Description of Dataset: This is a dataset for multiple people/faces visual detection and tracking. The dataset is composed of 3 sequences (same scenario); 4 targets repeatedly occlude each other while appearing and disappearing from the field of view of the camera. The sequence motinas_multi_face_frontal shows frontal faces only; in motinas_multi_face_turning the faces are frontal and rotated; in motinas_multi_face_fast the targets move faster than in the previous two sequences. Total number of images: 2769, DivX 6 compression, 640 x 480 pixels, 25 Hz.
Description of Ground Truth/Metadata: No
Contextual info: No
Results from metrics and ground truth: No
Comments: No
Information on Copyright: Requested citation acknowledgment: E. Maggio, E. Piccardo, C. Regazzoni, A. Cavallaro, "Particle PHD filter for multi-target visual tracking", in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Honolulu (USA), April 15-20, 2007.
Contact person from Cantata: Xavier Desurmont, desurmont\a\multitel.be

19 OVVV
Topic: Surveillance
Website: http://development.objectvideo.com/
Description of Dataset: The ObjectVideo Virtual Video tool provides the ability to generate virtual video sequences. These sequences can then be used to test VCA algorithms.
Description of Ground Truth/Metadata: The ground truth is generated automatically in a proprietary binary format. The format is open, and a conversion program can be created to convert the metadata to any format.
Contextual info: Virtual environment; the user can make his own environment from the internet. Several camera settings can be changed to simulate real-world cameras.
Results from metrics and ground truth: Not applicable for OVVV.
Comments: This is not a dataset as such, but with these tools very powerful and tailored test videos can be created.
Information on Copyright: The ObjectVideo Virtual Video Tool is provided free for non-commercial use, for your own research and development purposes. If you publish or distribute images, videos or derivative results based on this software, you must acknowledge ObjectVideo by including "ObjectVideo Virtual Video Tool". To use the ObjectVideo Virtual Video tool, a licence for the commercial game Half-Life 2 is needed (www.steampowered.com).
Contact person from Cantata: Rick Koeleman, VDG-Security bv, rick@vdg-security.com

20 CANDELA
Topic: Surveillance
Website: http://www.multitel.be/~va/candela/
Description of Dataset: "Indoor abandoned object" and "road intersection"
- Scenario 1: The detection of abandoned objects
- Scenario 2: Street at zebra crossings
Description of Ground Truth/Metadata: No
Contextual info: No
Results from metrics and ground truth: Criteria for verification:
- Is the alarm generated (yes/no)?
- How correct is the timing of the alarm (start delay, overall time overlap)?
- Position correctness
Information on Copyright: public domain
Contact person from Cantata: Xavier Desurmont, desurmont\a\multitel.be
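The two timing criteria named on the CANDELA slide (start delay and overall time overlap) could be computed along the following lines. This is a sketch under stated assumptions, not the project's evaluation code: alarms are modelled as `(start, end)` intervals in seconds, and "overall time overlap" is taken here to mean the fraction of the ground-truth interval covered by the detected alarm. The slide does not define the measure, and the function name `alarm_timing` is hypothetical.

```python
from typing import Tuple

# (start, end) of an alarm in seconds; this representation is an assumption.
Interval = Tuple[float, float]


def alarm_timing(ground_truth: Interval, detected: Interval) -> Tuple[float, float]:
    """Return (start_delay, overlap_ratio) for one detected alarm.

    start_delay:   detected start minus ground-truth start (negative if early).
    overlap_ratio: intersection length divided by ground-truth duration
                   (an assumed definition of "overall time overlap").
    """
    gt_start, gt_end = ground_truth
    det_start, det_end = detected
    gt_duration = gt_end - gt_start
    if gt_duration <= 0:
        raise ValueError("ground-truth interval must have positive duration")

    start_delay = det_start - gt_start
    # Length of the time span shared by both intervals, zero if they are disjoint.
    intersection = max(0.0, min(gt_end, det_end) - max(gt_start, det_start))
    return start_delay, intersection / gt_duration
```

With a ground-truth alarm from 10 s to 20 s and a detection from 12 s to 25 s, the sketch reports a start delay of 2 s and an overlap ratio of 0.8. Other definitions of overlap, such as intersection over union, would be equally defensible.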

21 Traffic datasets (Institut für Algorithmen und Kognitive Systeme)
Topic: Surveillance
Website: http://i21www.ira.uka.de/image_sequences/
Description of Dataset: Traffic databases
Description of Ground Truth/Metadata: No
Contextual info: Different contexts: snow, fog, etc.
Information on Copyright: licence (no), cost (free)
Contact person from Cantata: Sabri Boughorbel (sabri.boughorbel@philips.com)

22 VISOR
Topic: Surveillance
Website: http://imagelab.ing.unimore.it/visor/
Description of Dataset: 4 types of video clips. These sequences constitute a representative panel of different video surveillance areas. They merge indoor and outdoor scenes, such as the Indoor Domotic Unimore D.I.I. setup.
Description of Ground Truth/Metadata: Object detection and tracking.
Results from metrics and ground truth: (Viper-GT) bounding box
Comments: mostly simple videos
Information on Copyright: Free download
Contact person: vezzani.roberto@unimore.it

23 BEHAVE
Topic: Surveillance
Website: http://groups.inf.ed.ac.uk/vision/BEHAVEDATA/
Description of Dataset: Crowd, people acting out various interactions.
Description of Ground Truth/Metadata: Object detection and tracking.
Contextual info: calibration info
Results from metrics and ground truth: (Viper-GT) bounding box, object class
Comments: some complex videos
Information on Copyright: Free download
Contact person: Bob Fisher, rbf@inf.ed.ac.uk

24 BEHAVE 2
Topic: Surveillance
Website: http://groups.inf.ed.ac.uk/vision/BEHAVEDATA/INTERACTIONS/
Description of Dataset: The dataset comprises two views of various scenarios of people acting out various interactions. Ten basic scenarios were acted out: InGroup, Approach, WalkTogether, Split, Ignore, Following, Chase, Fight, RunTogether, and Meet. The data is captured at 25 frames per second. The resolution is 640x480. The videos are available either as AVIs or as a numbered set of JPEG single image files.
Description of Ground Truth/Metadata: Tracking, event detection.
Contextual info: 3D coordinates of points for calibration purposes provided.
Results from metrics and ground truth: Bounding boxes (VIPER XML format). Event labels for persons and frame span.
Comments: The site will be updated when more of the ground truth becomes available.
Information on Copyright: Free download from website.
Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk

25 VS-PETS'2003 - INMOVE
Topic: Consumer applications
Website: http://www.cvg.cs.rdg.ac.uk/VSPETS/vspets-db.html
Description of Dataset: Outdoor people tracking, football data (three synchronised views). The dataset consists of football players moving around a pitch.
Description of Ground Truth/Metadata: Tracking information on the image plane for camera 3 can be found at: http://www.cvg.cs.rdg.ac.uk/VSPETS/Camera3Xml.zip. An AVI file of the ground truth for camera view 3 is also available at http://www.cvg.cs.rdg.ac.uk/VSPETS/Cam3_Gt.avi
Results from metrics and ground truth: The location of each player on the pitch, for each frame of the sequence. For each player, the bounding box (with origin bottom left) in pixels should be determined. The position of the player is defined as the middle bottom of the bounding box (in pixels).
Information on Copyright: Free download from website
Contact person from Cantata: Dimitrios Makris, d.makris@kingston.ac.uk
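The position rule on the VS-PETS slide (player position = middle bottom of the bounding box) amounts to a one-line computation. The sketch below assumes the box is given as `(x, y, width, height)` in pixels, with `(x, y)` being its bottom-left corner. The slide does not specify the exact tuple layout, so this layout and the name `player_position` are illustrative assumptions.

```python
from typing import Tuple

# (x, y, width, height) in pixels, with (x, y) the bottom-left corner of the box.
# This layout is an assumption; the slide only says "origin bottom left".
BoundingBox = Tuple[float, float, float, float]


def player_position(box: BoundingBox) -> Tuple[float, float]:
    """Return the middle-bottom point of a bounding box, in pixels."""
    x, y, width, height = box
    if width < 0 or height < 0:
        raise ValueError("bounding box width and height must be non-negative")
    # Horizontally centred, vertically on the bottom edge of the box.
    return x + width / 2.0, y
```

For a box with bottom-left corner (100, 40), width 20 and height 60, the function returns (110.0, 40.0), i.e. the point where the player's feet would be expected to touch the pitch in the image.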

26 TRICTRAC
Topic: Consumer applications
Website: http://www.multitel.be/trictrac/
Description of Dataset: Multi-camera HD progressive images in JPEG for a synthetic video sequence of soccer.
Description of Ground Truth/Metadata: XML (2D and 3D positions of objects and camera)
Contextual info: No
Results from metrics and ground truth: No
Comments: The dataset is fully described in "TRICTRAC Video Dataset: Public HDTV Synthetic Soccer Video Sequences With Ground Truth", X. Desurmont, J-B. Hayet, J-F. Delaigle, J. Piater, B. Macq, Workshop on Computer Vision Based Analysis in Sport Environments (CVBASE), 2006.
Information on Copyright: Access/licence: All data is publicly available and downloadable. If you publish results using the data, please acknowledge the data as coming from the TRICTRAC project, found at URL: http://www.multitel.be/trictrac. THE DATASET IS PROVIDED WITHOUT WARRANTY OF ANY KIND.
Contact person from Cantata: Xavier Desurmont, desurmont\a\multitel.be

27 Medical Dataset
Example of one dataset

28 Example with 2 signals: a mass and a microcalcification

29 Conclusion
Many application domains (d.makris@kingston.ac.uk)
- 25 datasets for Surveillance
- 6 datasets for Consumer applications
- 3 datasets for Medical
Web site: http://www.tudor.lu/cantata
http://www.tudor.lu/QuickPlace/cantata/PageLibraryC125725E002AB722.nsf/h_AABC75AA0B05E5DFC125725E002B5E46/ED93066DB0E340C7C12573A20056D789/?OpenDocument
User Name: Francois.Bremond@sophia.inria.fr
Password:

