GeoPKDD Geographic Privacy-aware Knowledge Discovery and Delivery Kick-off meeting Pisa, March 14, 2005.

Slides:



Advertisements
Similar presentations
Web Mining.
Advertisements

OLAP Tuning. Outline OLAP 101 – Data warehouse architecture – ROLAP, MOLAP and HOLAP Data Cube – Star Schema and operations – The CUBE operator – Tuning.
Sensor-Based Abnormal Human-Activity Detection Authors: Jie Yin, Qiang Yang, and Jeffrey Junfeng Pan Presenter: Raghu Rangan.
Spatiotemporal Pattern Mining For Travel Behavior Prediction UIC IGERT Seminar 02/14/2007 Chad Williams.
Visual Event Detection & Recognition Filiz Bunyak Ersoy, Ph.D. student Smart Engineering Systems Lab.
Seismo-Surfer a tool for collecting, querying, and mining seismic data Yannis Theodoridis University of Piraeus
CENTRE Cellular Network’s Positioning Data Generator Fosca GiannottiKDD-Lab Andrea MazzoniKKD-Lab Puntoni SimoneKDD-Lab Chiara RensoKDD-Lab.
Privacy Preserving Publication of Moving Object Data Joey Lei CS295 Francesco Bonchi Yahoo! Research Avinguda Diagonal 177, Barcelona, Spain 6/10/20151CS295.
Overview of Databases and Transaction Processing Chapter 1.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
Copyright 2002 Prentice-Hall, Inc. Chapter 1 The Systems Development Environment 1.1 Modern Systems Analysis and Design Third Edition Jeffrey A. Hoffer.
The Data Mining Visual Environment Motivation Major problems with existing DM systems They are based on non-extensible frameworks. They provide a non-uniform.
Advanced Topics COMP163: Database Management Systems University of the Pacific December 9, 2008.
SeCoGIS - Bercellona, October 20, 2008 An ontology-based approach for the semantic modeling and reasoning on trajectories Miriam Baglioni Jose de Macedo.
Dieter Pfoser, LBS Workshop1 Issues in the Management of Moving Point Objects Dieter Pfoser Nykredit Center for Database Research Aalborg University, Denmark.
Creating Architectural Descriptions. Outline Standardizing architectural descriptions: The IEEE has published, “Recommended Practice for Architectural.
Needs for Anonymized Mobile Data Discussion Topic / Working Group Seminar
Data Mining – Intro.
1 Introduction Introduction to database systems Database Management Systems (DBMS) Type of Databases Database Design Database Design Considerations.
Advanced Database Applications Database Indexing and Data Mining CS591-G1 -- Fall 2001 George Kollios Boston University.
LÊ QU Ố C HUY ID: QLU OUTLINE  What is data mining ?  Major issues in data mining 2.
OLAM and Data Mining: Concepts and Techniques. Introduction Data explosion problem: –Automated data collection tools and mature database technology lead.
Kansas State University Department of Computing and Information Sciences CIS 830: Advanced Topics in Artificial Intelligence From Data Mining To Knowledge.
Understanding Data Analytics and Data Mining Introduction.
Spatial Statistics and Spatial Knowledge Discovery First law of geography [Tobler]: Everything is related to everything, but nearby things are more related.
Copyright 2002 Prentice-Hall, Inc. Chapter 1 The Systems Development Environment 1.1 Modern Systems Analysis and Design.
Mirco Nanni, Roberto Trasarti, Giulio Rossetti, Dino Pedreschi Efficient distributed computation of human mobility aggregates through user mobility profiles.
GEOPKDD - Meeting Venezia 17 Oct 051 Privacy-preserving data warehousing for spatio- temporal data Maria L. Damiani, Università Milano (I)
Time-focused density-based clustering of trajectories of moving objects Margherita D’Auria Mirco Nanni Dino Pedreschi.
Chapter 1 Introduction to Data Mining
INTRODUCTION TO DATA MINING MIS2502 Data Analytics.
Knowledge Discovery and Delivery Lab (ISTI-CNR & Univ. Pisa)‏ www-kdd.isti.cnr.it Anna Monreale Fabio Pinelli Roberto Trasarti Fosca Giannotti A. Monreale,
CENTRE CEllular Network Trajectories Reconstruction Environment F. Giannotti, A. Mazzoni, S. Puntoni, C. Renso KDDLab, Pisa.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
The roots of innovation Future and Emerging Technologies (FET) Future and Emerging Technologies (FET) The roots of innovation Proactive initiative on:
Data Warehousing Data Mining Privacy. Reading Bhavani Thuraisingham, Murat Kantarcioglu, and Srinivasan Iyer Extended RBAC-design and implementation.
Page 1 Alliver™ Page 2 Scenario Users Contents Properties Contexts Tags Users Context Listener Set of contents Service Reasoner GPS Navigator.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
Distributed Data Analysis & Dissemination System (D-DADS ) Special Interest Group on Data Integration June 2000.
Introduction to Data Mining by Yen-Hsien Lee Department of Information Management College of Management National Sun Yat-Sen University March 4, 2003.
MIS2502: Data Analytics Advanced Analytics - Introduction.
Data Mining and Decision Support
Predicting the Location and Time of Mobile Phone Users by Using Sequential Pattern Mining Techniques Mert Özer, Ilkcan Keles, Ismail Hakki Toroslu, Pinar.
Efficient OLAP Operations in Spatial Data Warehouses Dimitris Papadias, Panos Kalnis, Jun Zhang and Yufei Tao Department of Computer Science Hong Kong.
Session-2 Participants Cyrus Shahabi Raju Vatsavai Mohamed Mokbel Shashi Shekhar Mubarak Shah Jans Aasman Monika Sester Wei Ding Phil Hwang Anthony Stefanidis.
Data Mining Concepts and Techniques Course Presentation by Ali A. Ali Department of Information Technology Institute of Graduate Studies and Research Alexandria.
Data Warehousing Data Mining Privacy. Reading FarkasCSCE Spring
Copyright © 2016 Pearson Education, Inc. Modern Database Management 12 th Edition Jeff Hoffer, Ramesh Venkataraman, Heikki Topi CHAPTER 11: BIG DATA AND.
1. ABSTRACT Information access through Internet provides intruders various ways of attacking a computer system. Establishment of a safe and strong network.
Chapter 1 Overview of Databases and Transaction Processing.
CS570: Data Mining Spring 2010, TT 1 – 2:15pm Li Xiong.
Data Mining - Introduction Compiled By: Umair Yaqub Lecturer Govt. Murray College Sialkot.
Data Mining – Intro.
MIS2502: Data Analytics Advanced Analytics - Introduction
From Development Phase to Implementation
DATA MINING © Prentice Hall.
Chapter 9 Database Systems
SMART APPLICATIONS FOR SMART CITY: A CONTRIBUTION TO INNOVATION
Data Mining.
Introduction C.Eng 714 Spring 2010.
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Mining Dynamics of Data Streams in Multi-Dimensional Space
MBI 630: Systems Analysis and Design
Overview of Databases and Transaction Processing
Data Warehousing and Data Mining
C.U.SHAH COLLEGE OF ENG. & TECH.
About Thetus Thetus develops knowledge discovery and modeling infrastructure software for customers who: Have high value data that does not neatly fit.
Data Warehousing Data Mining Privacy
Yingze Wang and Shi-Kuo Chang University of Pittsburgh
Presentation transcript:

GeoPKDD Geographic Privacy-aware Knowledge Discovery and Delivery Kick-off meeting Pisa, March 14, 2005

Agenda  11:30 –12:00 Introduction  12:00 – 13:00 Pisa  13:00 – 14:00 Lunch  14:00 – 15:00 Venezia  15:00 – 16:00 Cosenza  16:00 – 18:00 Discussion, Planning

GeoPKDD – general project idea Main Innovations  extracting user-consumable forms of knowledge from large amounts of raw geographic data referenced in space and in time.  knowledge discovery and analysis methods for trajectories of moving objects, which change their position in time, and possibly also their shape or other significant features  devising privacy-preserving methods for data mining from sources that typically contain personal sensitive data.

GeoPKDD – specific goals  new models for moving objects, and data warehouse methods to store their trajectories,  new knowledge discovery and analysis methods for moving objects and trajectories,  new techniques to make such methods privacy-preserving,  new techniques to extend such methods to distributed data coming in continuous streams;  new techniques for reasoning on spatio- temporal knowledge and on background knowledge.

GeoPKDD - applications Geographic information coming from mobile devices is expected to enable novel classes of applications In these applications privacy is a concern In particular, how can mobile trajectories be stored and analyzed without infringing personal privacy rights and expectations?

Possible Application Scenario Data source: log data from mobile phones tracking the movements of users from cells –Entering the cell - e.g. (UserID, time, IDcell, in) –Exiting the cell - e.g. (UserID, time, IDcell, out) –Movements inside the cell? Eg (UserID, time, X,Y, Idcell) Trajectory reconstruction Knowledge extraction techniques - emphasis on privacy Description of models – local vs. global

GeoPKDD applications

Possible Application Scenarios Three possible scenarios to exploit the extracted knowledge: 1.Towards the system: adaptive band allocation to cells 2.Towards the society: dynamic traffic monitoring and management for sustainable mobility, urban planning... 3.Towards the individual: personalization of location-based services, car traffic reports, traffic information and predictions

Reconstructing trajectories Scenario 1 In the log entries we have no ID  Log entries become time-stamped events We can extract aggregated info on traffic flow, but not individual trajectories t1t1 t 13 t5t5 t7t7 t2t2 t 10 t8t8 t 11 t9t9 t6t6 t 12 t4t4

Reconstructing trajectories Scenario 2 In the log entries we have (encrypted) IDs  Log entries can be grouped by ID to obtain sequences of time-stamped cells We can extract individual trajectories, with the spatial granularity of a cell: positions of t 5 and t 8 can be distinguished, but not t 5 and t 13 t1t1 t 13 t5t5 t7t7 t2t2 t 10 t8t8 t 11 t9t9 t6t6 t 12 t4t4

Reconstructing trajectories Scenario 3 In the log entries we IDs and (approximated) position in the cell t1t1 t 13 t5t5 t7t7 t2t2 t 10 t8t8 t 11 t9t9 t6t6 t 12 t4t4 We can extract individual trajectories, with a finer spatial granularity: now, positions of t 5 and t 13 can be distinguished.

Which patterns on “trajectories” Clustering Group together similar trajectories For each group produce a summary = cell

Which patterns on “trajectories” Frequent patterns Discover (sub)paths frequently followed

Which patterns on “trajectories” Classification Extract behaviour rules from history Use them to predict behaviour of future users 60% 7% 8% 5% 20% ?

Privacy in GeoPKDD... is a technical issue, besides ethical – social – legal, in the specific context of ST-DM How to formalize privacy constraints over ST data and ST patterns? –E.g., cardinality threshold on clusters of individual trajectories How to transform data to meet privacy constraints? How to design DM algorithms that, by construction, only yield patterns that meet the privacy constraints?

GeoPKDD Spatio- temporal patterns

Why emphasis on privacy? More, better, and new data being gathered, more likely to be sensitive –Increased vulnerability from correlation Data becoming more accessible –Increased opportunity for misuse Need to restrict access to data (patterns) to prevent misuse On the other hand, added data bring new opportunities –Public utility, new markets/paradigms, new services Need to maintain privacy without giving up opportunities

GeoPKDD technologies Spatio-temporal models for moving objects Trajectory warehouses Spatio-temporal data mining methods and data mining query languages Privacy-preserving data mining Distributed and stream data mining Spatio-temporal reasoning

GeoPKDD workpackages

(WP1) Privacy-aware trajectory warehouse (WP2) Privacy-aware spatio-temporal data mining methods (WP3) Geographic knowledge interpretation and delivery (WP4) Harmonization, integration and applications

WP1: Privacy-aware trajectory warehouse Tasks: 1.a trajectory model able to represent moving objects, and to support multiple representations, multiple granularities both in space and in time, and uncertainty; 2.a trajectory data warehouse and associated OLAP mechanisms, able to deal with multi- dimensional trajectory data; 3.support for continuous data streams.

WP2: Privacy-aware spatio-temporal data mining Task: algorithms for spatio-temporal data mining, specifically meant to extract spatio- temporal patterns from trajectories of moving objects, equipped with: 1.methods for provably and measurably protecting privacy in the extracted patterns; 2.mechanisms to express constraints and queries into a data mining query language, in which the data mining tasks can be formulated; 3.distributed and streaming versions.

WP3: Geographic knowledge interpretation and delivery Task: interpretation of the extracted spatio- temporal patterns, by means of ST reasoning mechanisms Issues –uncertainty –georeferenced visualization methods for trajectories and spatio-temporal patterns

WP4: Harmonization, Integration and Applications Tasks: –Harmonization with national privacy regulations and authorities – privacy observatory –Integration of the achieved results into a coherent framework to support the GeoPKDD process –Demonstrators for some selected applications: for public authorities, network operators and/or marketing operators, e.g., in sustainable mobility, network optimization, geomarketing.

Deliverables of Phase 1 (months 1-5) WP1: Privacy-aware trajectory warehouse –[TR1.1] Alignment report and preliminary specification of requirements. WP2: Privacy-aware spatio-temporal data mining –[TR1.2] Alignment report on ST data mining techniques. –[TR1.3] Alignment report on privacy-preserving data mining techniques. –[TR1.4] Alignment report on distributed data mining. WP3: Geographic knowledge interpretation and delivery –[TR1.5] Alignment report on ST reasoning techniques. WP4: Harmonization, Integration and Applications –[TR1.6] Report on characterization of GeoPKDD applications and preliminary feasibility study. –[A1.7] Implantation of the Privacy Regulation Observatory.

Deliverables of Phase 2 (months 6-17) WP1: Privacy-aware trajectory warehouse –[TR2.1] TR on design of the trajectory warehouse. –[P2.2] Prototype of the trajectory warehouse. WP2: Privacy-aware spatio-temporal data mining –[TR2.3] TR on new techniques for ST and trajectory Data Mining. –[TR2.4] TR on new privacy-preserving ST Data Mining. –[TR2.5] TR on distributed data mining –[P2.6] Prototype(s) of privacy-aware ST data mining methods. WP3: Geographic knowledge interpretation and delivery –[TR2.7] TR on ST reasoning techniques and DMQL for geographic knowledge interpretation and delivery. –[P2.8] Prototype(s) of the ST reasoning formalism and DMQL WP4: Harmonization, Integration and Applications –[TR2.9] Requirements of the application demonstrator(s).

Deliverables of Phase 3 (months 18-24) WP4: Harmonization, Integration and Applications –[TR3.1] TR on the design of a system prototype allowing the application of privacy-preserving data mining tools to spatio- temporal and trajectory data. –[P3.2] Prototype implementing the system described in the technical report [TR3.1]. –[P3.3] Prototype extending the system prototype [P3.2] to work on a distributed system. –[TR3.4] TR on the description of the prototypes developed and the results of the experimentation. –[TR3.5] Final report on harmonisation actions and mutual impact between privacy regulations and project results.

Pisa: objectives spatial and spatio-temporal privacy-preserving data mining, with particular focus on –clustering, –constraint-based frequent pattern mining –spatial classification; spatio-temporal logical formalisms to reason on extracted patterns and background knowledge.

Venezia (+ Milano): objectives trajectory model and privacy-preserving data warehouse, within a streamed and distributed context methods to mine sequential and non sequential frequent patterns from trajectories, within a streamed and distributed context postprocessing and interpretation of the extracted spatio-temporal patterns

Cosenza: objectives Trajectory mining –Clustering Privacy-preserving data mining –Probabilistic approach Distributed data mining

Transversal activities experiments, application demonstrators, harmonization with privacy regulations and authorities, dissemination of results.