1 SPIN! Michael May, EC-GIS 2000, 29.6.00
Spatial Knowledge Discovery - the IST-SPIN!-project
Michael May, German National Research Center for Information Technology (GMD)

2 1. Introduction: Spatial Knowledge Discovery

3 Motivation
- The GIS revolution brought an explosion of geographically referenced data, yet few tools to automatically extract useful information.
- One very interesting development is interactive thematic maps (CDV (Dykes 1997), SAGE (Haining 1998), Descartes (Andrienko & Andrienko 1999)).
- These leave room for complementary methods:
  - Multivariate dependencies are hard to visualize.
  - Visual identification of patterns is subjective.

4 Knowledge Discovery in Databases (KDD)
Characterization: Knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. (Fayyad, Piatetsky-Shapiro, Smyth 1996)

5 Data Mining: example tasks
- Predicting crop yield using data on past yield and data about soil condition and climate
- Detecting credit fraud by spotting unusual transaction patterns
- Classifying stars using spectral data

6 Knowledge Discovery Cycle
- Selection & Transformation
- Data Mining
- Visualization & Interpretation

7 Kepler Preprocessing
- Data selection
- Cleaning
- Transformation

8 Kepler Exploration
- Drill down: exploring the data at different levels of aggregation
- Descriptive statistics & visualization

9 Kepler Data Mining
- Decision tree (DTI)
- Regression tree (RT)
- Subgroup discovery (Midos)
- k-nearest neighbor (k-NN)
- ILP (Foil)

10 Kepler Visualization
- Rules
- Subgroups
- Decision Trees

11 Data Mining vs. GIS
Data Mining:
- system generates hypotheses
- search (and visualization) in abstract space
- inductive generalizations exceeding content of database
GIS:
- user generates hypotheses
- visualization in geographical space
- shows what’s inside the data
Both techniques are exploratory.

12 2. SPIN!: Spatial Mining for Data of Public Interest

13 SPIN! Spatial Mining for data of public interest
- German National Research Center for Information Technology (GMD)
- University of Bari, Italy
- School of Geography, University of Leeds, UK
- Dialogis Software & Services GmbH, Bonn, Germany
- Professional GeoSystems (PGS), Amsterdam, Holland
- Metropolitan and Victoria Univ., Manchester, MIMAS
- IITP, Russian Academy of Sciences, Moscow
- GeoForschungszentrum Potsdam, Germany
IST-1999-10536 SPIN!, Duration: 1/2000-12/2002
Coordination: GMD, michael.may@gmd.de

14 SPIN! Objectives
- Develop a system architecture integrating state-of-the-art GIS and Data Mining functionality in an open, extensible, internet-enabled architecture
- Adapt the following to spatial data:
  - inductive logic programming learning methods
  - Bayesian Markov Chain Monte Carlo
- Develop new visualization for Data Mining in GIS
- Develop new visualization of temporal and spatial data
- Apply the system to:
  - seismic and volcano data analysis (with GFZ)
  - web-based dissemination of census data (with ONS and MIMAS)

15 Level 1: Data access and management
Provided by the data mining platform Kepler:
- data access to heterogeneous and distributed data sources (RDBMS, flat file, spatial data)
- data query and transformation (restriction, projection, union, join, calculated rows)
- exploratory non-spatial visualization
- organizing and documenting analysis tasks
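The query and transformation operations named on this slide (restriction, projection, union, join, calculated rows) can be illustrated with a minimal Python sketch over lists of dictionaries. This is not Kepler's API: every function name, table, and value below is hypothetical, and the "calculated" step is read here as adding a derived attribute to each row, which is an assumption.

```python
# Toy tables; all names and values are made up for illustration.
districts = [
    {"id": 1, "name": "A", "population": 1000, "area_km2": 10.0},
    {"id": 2, "name": "B", "population": 500, "area_km2": 25.0},
]
more_districts = [
    {"id": 3, "name": "C", "population": 2000, "area_km2": 8.0},
]
unemployment = [
    {"id": 1, "rate": 0.12},
    {"id": 3, "rate": 0.05},
]


def restrict(rows, predicate):
    """Restriction: keep only the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def project(rows, columns):
    """Projection: keep only the named columns."""
    return [{c: r[c] for c in columns} for r in rows]


def union(rows_a, rows_b):
    """Union: rows of both tables, dropping exact duplicates."""
    seen, out = set(), []
    for r in rows_a + rows_b:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out


def join(rows_a, rows_b, key):
    """Inner join on a shared key column."""
    index = {}
    for b in rows_b:
        index.setdefault(b[key], []).append(b)
    return [{**a, **b} for a in rows_a for b in index.get(a[key], [])]


def add_calculated(rows, name, fn):
    """Calculated attribute: add a value derived from each row."""
    return [{**r, name: fn(r)} for r in rows]


all_districts = union(districts, more_districts)
with_density = add_calculated(
    all_districts, "density", lambda r: r["population"] / r["area_km2"]
)
dense = restrict(with_density, lambda r: r["density"] >= 100)
joined = join(dense, unemployment, "id")
result = project(joined, ["name", "density", "rate"])
```

Chaining the five helpers in this order yields one row per dense district that also has an unemployment record.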

16 Level 2: Internet-enabled map viewer
Lava/Magma
- Java-based internet GIS developed by Professional GeoSystems (PGS)
- support for zooming, panning etc.
- excellent scalability through client-side caching

19 Level 3: Interactive thematic maps
- Knowledge-based map design (Andrienko & Andrienko 1999)
- Dynamic maps allowing interactive manipulation
[Diagram labels: Rule base on map design; Selected data subset; Data characterization: types and relationships; Map designer]

20 Level 4: Automated cluster detection: GAM/K
- searching for localised spatial clustering
- examining circles of varying sizes that cover the region of interest
- compare relative frequency with expected value
- retain significant circles
- apply kernel smoothing
(Openshaw 1998, 2000)
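The steps listed on this slide can be sketched in Python as follows. This is a heavily simplified illustration, not Openshaw's GAM/K implementation. It makes several assumptions: a Poisson test for significance, an arbitrary threshold `alpha`, a regular grid of circle centres, and a simple quadratic kernel. The function names and the toy data are hypothetical.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def scan_circles(locations, grid_step, radii, alpha=0.01):
    """locations: list of (x, y, population, cases).

    Returns the significant circles as (cx, cy, radius, cases, expected).
    """
    total_pop = sum(p for _, _, p, _ in locations)
    total_cases = sum(c for _, _, _, c in locations)
    rate = total_cases / total_pop  # overall relative frequency
    xs = [x for x, _, _, _ in locations]
    ys = [y for _, y, _, _ in locations]
    nx = int((max(xs) - min(xs)) / grid_step) + 1
    ny = int((max(ys) - min(ys)) / grid_step) + 1
    significant = []
    for radius in radii:              # circles of varying sizes
        for i in range(nx):           # centres cover the region
            for j in range(ny):
                cx = min(xs) + i * grid_step
                cy = min(ys) + j * grid_step
                pop = cases = 0
                for x, y, p, c in locations:
                    if math.hypot(x - cx, y - cy) <= radius:
                        pop += p
                        cases += c
                if pop == 0:
                    continue
                expected = rate * pop  # expected value under the overall rate
                if cases > expected and poisson_upper_tail(cases, expected) < alpha:
                    significant.append((cx, cy, radius, cases, expected))
    return significant


def smoothed_surface(circles, x, y):
    """Kernel-weighted sum of excess cases over the retained circles."""
    total = 0.0
    for cx, cy, radius, cases, expected in circles:
        d = math.hypot(x - cx, y - cy) / radius
        if d < 1.0:
            total += (1.0 - d * d) * (cases - expected)
    return total


# Toy data: a 5 x 5 grid of locations, 100 people and 1 case each,
# except one location at (1, 1) with 12 cases.
locations = [
    (float(x), float(y), 100, 12 if (x, y) == (1, 1) else 1)
    for x in range(5)
    for y in range(5)
]
circles = scan_circles(locations, grid_step=1.0, radii=[0.5, 1.0])
```

Evaluating `smoothed_surface(circles, x, y)` over a grid of points gives a surface that peaks where the retained circles overlap.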

21 Level 5: Explaining clusters and spatial phenomena
Assume we have produced a classification or clustering using either Descartes or GAM: What attributes are associated with a cluster and could potentially explain it?

22 Example: GIS & decision trees
[Figures: Decision Tree; Thematic Map]
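One way to approach the question of which attributes are associated with a cluster is to rank attributes by how well they separate cluster members from non-members, which is also how a decision tree chooses its first split. The Python sketch below uses information gain for that ranking. It is an illustrative assumption rather than the method used in SPIN! or Kepler, and the regions, attribute names, and values are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def rank_attributes(rows, attributes, target):
    """Attributes ordered from most to least informative about `target`."""
    return sorted(
        attributes,
        key=lambda a: information_gain(rows, a, target),
        reverse=True,
    )


# Hypothetical regions: `in_cluster` marks membership in a detected cluster.
regions = [
    {"in_cluster": True, "unemployment": "high", "land_use": "urban"},
    {"in_cluster": True, "unemployment": "high", "land_use": "rural"},
    {"in_cluster": True, "unemployment": "high", "land_use": "urban"},
    {"in_cluster": False, "unemployment": "low", "land_use": "urban"},
    {"in_cluster": False, "unemployment": "low", "land_use": "rural"},
    {"in_cluster": False, "unemployment": "low", "land_use": "urban"},
]
ranking = rank_attributes(regions, ["land_use", "unemployment"], "in_cluster")
```

In this toy table `unemployment` separates the cluster perfectly and `land_use` carries no information, so the ranking puts `unemployment` first. A high rank indicates association only; it is a candidate explanation, not a confirmed one.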

23 Using Inductive Logic Programming
- Learning approach based on first-order predicate logic
- can express relations between instances
- greater representational power compared to attribute-value learners using ‘single-table data’
- Topological relations such as adjacent_to, close_to, inside can be included
- search for explanations

    crime_hotspot(X) :- city(Z), high_unemployment(Z), train_station(Y), inside(Y,Z), close_to(X,Y).

Topological predicates in the rule: inside, close_to
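To make the meaning of the crime_hotspot rule concrete, the following Python sketch evaluates its body against a handful of invented facts. It only checks a rule that is already given; it does not perform ILP learning. All facts, coordinates, and the distance threshold used for close_to are assumptions made for illustration.

```python
import math

# Hypothetical facts. Keys and coordinates are invented.
cities = {"z1", "z2"}                       # city(Z)
high_unemployment = {"z1"}                  # high_unemployment(Z)
train_stations = {"s1": (0.0, 0.0), "s2": (10.0, 10.0)}  # train_station(Y)
inside = {("s1", "z1"), ("s2", "z2")}       # inside(Y, Z)
locations = {"x1": (0.3, 0.4), "x2": (9.9, 10.0), "x3": (5.0, 5.0)}


def close_to(p, q, threshold=1.0):
    """close_to(X, Y): Euclidean distance within an assumed threshold."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= threshold


def crime_hotspot(x):
    """True if some Y, Z satisfy the body of the rule for location x:

    city(Z), high_unemployment(Z), train_station(Y),
    inside(Y, Z), close_to(X, Y).
    """
    return any(
        z in cities
        and z in high_unemployment
        and y in train_stations
        and close_to(locations[x], train_stations[y])
        for (y, z) in inside
    )


hotspots = sorted(x for x in locations if crime_hotspot(x))
```

Only `x1` qualifies: it lies close to station `s1`, which is inside city `z1`, a city with high unemployment. `x2` is close to a station, but that station's city does not have high unemployment.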

25 Application challenge

26 Eruption of Merapi volcano, Java, Indonesia

27 Merapi volcano in central Java

28 Tasks for Merapi application
- Estimation of possible future eruption
- Combining information about land use/land cover, infrastructure and population in order to make a damage assessment
- Dissemination of information for volcano risk mitigation over the Internet

29 Web-based dissemination of census data
- MIMAS disseminates UK census data to the UK academic sector
- Using SPIN! technology to add value beyond the mere distribution of data
- UK Unitary development plans, selected application area: Manchester Stockport
  - Forecasting numbers of houses needed
  - Allocation of land
  - Development control

30 Conclusion
- Integration of Data Mining and GIS is a logical progression of spatial data analysis technology
- Integrating interactive statistical maps in the knowledge discovery process improves the visualization and interpretation step of KDD
- Map-based data analysis can be supplemented by data mining methods for potential explanations of patterns
- First prototype expected by the end of the year 2000!

