Albany New York (1) G. P. Patil

This report is very disappointing. What kind of software are you using?

Space Age and Stone Age Syndrome
Data: Space Age / Stone Age
Analysis: Space Age / Stone Age
Table (rows: analysis; columns: data, Space Age then Stone Age)
Space Age analysis: + +
Stone Age analysis: +

Geospatial Cell-based Data
Kinds of Data
Cell as a Unit (regular grid layout)
Categorical
Ordinal
Numerical
Multivariate Numerical
Cell as an Object (irregular cell sizes and shapes)
Partially Ordered

Consider a 21st century digital government scenario of the following nature:
What message does a remote sensing-derived land cover/land use map have about the large landscape it represents?
At what scale and at what level of detail?
Does the spatial pattern of the map reveal any societal, ecological, or environmental condition of the landscape?
Can it therefore be an indicator of change?

Consider a 21st century digital government scenario of the following nature:
How do you automate the assessment of the spatial structure and behavior of change to discover critical areas, hot spots, and their corridors?
Is the map accurate? How accurate is it?
How do you assess the accuracy of the map, and of the change map over time used for change detection?

Consider a 21st century digital government scenario of the following nature:
What are the implications of the kind and amount of change and accuracy on what matters, whether climate change, carbon emissions, water resources, urban sprawl, biodiversity, indicator species, early warning, or others?
And with what confidence, even with a single map or change map?

The needed partnership research is expected to answer these questions, and a few more, that involve multicategorical raster maps based on remote sensing and other geospatial data. It is also expected to design a prototype of a user-friendly, advanced raster map analysis system for digital governance.

National Mortality Maps and Health Statistics
Health Service Areas, Counties, Zip Codes, …
Geographical Patterns for Health Resource Allocation
Study Areas for Putative Sources of Health Hazard
Balance between dilution effect and edge effect
Case Event Analysis and Ecological Analysis
Thresholds, contours, corresponding data
Regional Comparisons and Rankings with Multiple Indicators/Criteria
Choices of Reference/Control Areas

National Mortality Maps and Statistics: Geographic Patterns – 1
Mortality rate due to a specific cause of death
Elevated rates areas, patterns
Ordinal thematic maps
Transition pattern, transitionogram
Transition matrices; spatial association with varying distance
Comparatives with different causes of death

National Mortality Maps and Statistics: Geographic Patterns – 2
Surface topology and spatial structure
High mortality area delineation
Hotspots, clusters, outbreaks, corridors
Surface smoothing
Masking of true geographic patterns?
Echelon analysis, original surface, smoothed surface

Multiple Criteria Analysis, Multiple Indicators: Partial Ordering Procedures
Cells are the objects of primary interest, such as countries, states, watersheds, counties, etc.
Cell comparisons and rankings are the goals.
A suite of indicators is available on each cell.
Different indicators carry different comparative messages, so the result is a partial rather than a linear ordering.
Hasse diagrams visualize partial orders. A Hasse diagram is a multi-level diagram whose top level of nodes consists of all maximal elements in the partially ordered set of objects. The next level consists of all maximal elements once the top level is removed from the partially ordered set, and so on. Nodes are joined by segments when they are immediately comparable.
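The level-by-level construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: larger indicator values are taken to be better, one cell is placed above another when it is at least as large on every indicator and strictly larger on one, and the four cells `A` to `D` with their scores are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u is at least as large as v on every indicator and larger on at least one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def hasse_levels(scores: Dict[str, Sequence[float]]) -> List[List[str]]:
    """Peel off the maximal elements repeatedly; each peel is one level, top level first."""
    remaining = dict(scores)
    levels: List[List[str]] = []
    while remaining:
        # An element is maximal if nothing still remaining dominates it.
        top = sorted(
            name
            for name, u in remaining.items()
            if not any(dominates(v, u) for v in remaining.values())
        )
        levels.append(top)
        for name in top:
            del remaining[name]
    return levels


def cover_edges(scores: Dict[str, Sequence[float]]) -> List[Tuple[str, str]]:
    """Pairs (upper, lower) that are immediately comparable: no element lies strictly between."""
    edges = []
    for hi, u in scores.items():
        for lo, v in scores.items():
            between = any(dominates(u, w) and dominates(w, v) for w in scores.values())
            if dominates(u, v) and not between:
                edges.append((hi, lo))
    return sorted(edges)


# Hypothetical cells scored on two indicators (illustrative values only).
example_scores = {"A": (3, 3), "B": (2, 1), "C": (1, 2), "D": (1, 1)}
levels = hasse_levels(example_scores)   # [['A'], ['B', 'C'], ['D']]
edges = cover_edges(example_scores)     # A-B, A-C, B-D, C-D; no direct A-D segment
```

B and C land on the same level because neither dominates the other, which is the partial (rather than linear) ordering the slide refers to.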

Multiple Criteria Analysis, Multiple Indicators: Partial Ordering Procedures
Issues to be addressed:
Crisp rankings, interval rankings, fuzzy rankings
Fuzzy comparisons
Echelon analysis of partially ordered sets, with ordinal response levels determined by successive levels in the Hasse diagram
Hasse diagram metrics: height, width, dimension, ambiguity (departure from linear order), etc.
Hasse diagram stochastics (random structure on the indicators or random structure on the Hasse diagram)
Hasse diagram comparisons, e.g., comparing Hasse diagrams for different regions

Figure 1. Example of perfect positive and perfect negative correlation between two coordinates (variables).

Hasse Diagram (all countries)

Ranking Partially Ordered Sets – 1
S = partially ordered set (poset) with elements a, b, c, …
How can we rank the elements of S consistently with the partial order? Such rankings are called linear extensions of the partial order.
Different people with different perceptions and priorities may choose different rankings.
How many rankings assign rank 1 to element a? Rank 2? Rank 3? And so on.
If rankings are chosen randomly (with equal probability), what is the likelihood that element x receives rank i?
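For a small poset these questions can be answered by brute force: list every permutation, keep those consistent with the partial order, and count how often each element receives each rank. The sketch below assumes a hypothetical four-element poset (a above c, b above c, b above d), not the six-element example from the slides, and it is practical only for a handful of elements because it scans all n! permutations.

```python
from fractions import Fraction
from itertools import permutations


def linear_extensions(elements, better_than):
    """All rankings (best first) that respect every pair (x, y) meaning 'x ranks above y'."""
    extensions = []
    for perm in permutations(elements):
        position = {e: i for i, e in enumerate(perm)}
        if all(position[x] < position[y] for x, y in better_than):
            extensions.append(perm)
    return extensions


def rank_frequencies(elements, extensions):
    """freq[e][i] = number of linear extensions in which e receives rank i + 1."""
    freq = {e: [0] * len(elements) for e in elements}
    for ext in extensions:
        for i, e in enumerate(ext):
            freq[e][i] += 1
    return freq


# Hypothetical poset, for illustration only.
elements = ["a", "b", "c", "d"]
better_than = {("a", "c"), ("b", "c"), ("b", "d")}

extensions = linear_extensions(elements, better_than)
freq = rank_frequencies(elements, extensions)

# Probability that b receives rank 1 when all linear extensions are equally likely.
prob_b_first = Fraction(freq["b"][0], len(extensions))
```

For this poset there are 5 linear extensions, b is ranked first in 3 of them, and so the probability that b receives rank 1 is 3/5.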

Ranking Partially Ordered Sets – 2: An Example
[Figure: example poset (Hasse diagram) on elements a through f, with five of its linear extensions. Jump sizes of the five extensions shown: 3, 1, 5, 4, 2.]
A jump, or imputed link (drawn as a dashed line), is a link in the ranking that is not implied by the partial order.
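Counting jumps is a one-line check once a ranking and the partial order are in hand. The sketch below assumes that the slide's "jump size" is the number of jumps in a ranking, and it reuses a hypothetical four-element poset rather than the slide's example. The relation set must contain every immediate (covering) comparison, which any set of pairs that generates the order does.

```python
def count_jumps(ranking, better_than):
    """Number of consecutive pairs in the ranking that the partial order does not imply.

    In a linear extension, two neighbours are comparable only if one immediately
    covers the other, so membership in better_than is enough to detect a jump.
    """
    return sum(1 for x, y in zip(ranking, ranking[1:]) if (x, y) not in better_than)


# Hypothetical poset: a above c, b above c, b above d.
better_than = {("a", "c"), ("b", "c"), ("b", "d")}

jumps_abcd = count_jumps(("a", "b", "c", "d"), better_than)  # a|b and c|d are jumps
jumps_bdac = count_jumps(("b", "d", "a", "c"), better_than)  # only d|a is a jump
jumps_badc = count_jumps(("b", "a", "d", "c"), better_than)  # every link is a jump
```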

Ranking Partially Ordered Sets – 5
[Figure: the poset (Hasse diagram) and its linear extension decision tree, one leaf per linear extension. Jump sizes at the leaves: 1 3 3 2 3 5 4 3 3 2 4 3 4 4 2 2.]

Ranking Partially Ordered Sets – 3
In the example from the preceding slide, there are a total of 16 linear extensions, giving a frequency table with one row per element (a through f), one column per rank (1 through 6), and a Totals column.
[Table: only a few entries survive in this transcript. Row a: 9, with total 16. Row b: 7. Row f: 10.]
Each (normalized) row gives the rank-frequency distribution for that element.
Each (normalized) column gives a rank-assignment distribution across the poset.

Ranking Partially Ordered Sets – 3a: Rank-Frequency Distributions
[Figure: one panel per element, a through f; horizontal axis: rank.]

Cumulative Rank Frequency Operator – 5: An Example of the Procedure
In the example from the preceding slide, there are a total of 16 linear extensions, giving a cumulative frequency table with one row per element (a through f) and one column per rank (1 through 6).
[Table: only a few entries survive in this transcript. Row a: 9, 14, 16. Row b: 7, 12, 15. Row c: 10.]
Each entry gives the number of linear extensions in which the element (row label) receives a rank equal to or better than the column heading.
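The cumulative table is just a running sum of each row of the rank-frequency table, and elements can then be ordered by comparing their cumulative curves pointwise. The sketch below is an illustration on two hypothetical four-element posets, not the slide's six-element example. The function names are made up for this sketch, and returning `None` when the curves cross is a choice made here, not something the slides specify.

```python
from itertools import accumulate, permutations


def cumulative_rank_frequencies(elements, better_than):
    """crf[e][i] = number of linear extensions in which e gets rank i + 1 or better."""
    freq = {e: [0] * len(elements) for e in elements}
    for perm in permutations(elements):
        position = {e: i for i, e in enumerate(perm)}
        if all(position[x] < position[y] for x, y in better_than):
            for e, i in position.items():
                freq[e][i] += 1
    return {e: list(accumulate(row)) for e, row in freq.items()}


def crf_ranking(crf):
    """Groups of elements from best to worst if the curves are stacked, else None.

    Elements with identical curves share a group (a tie).
    """
    ordered = sorted(crf, key=lambda e: crf[e], reverse=True)
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if crf[cur] == crf[prev]:
            groups[-1].append(cur)
        elif all(p >= c for p, c in zip(crf[prev], crf[cur])):
            groups.append([cur])
        else:
            return None  # curves cross: no ordering from a single pass
    return groups


# Hypothetical poset 1: a above c, b above c, b above d.
crf_n = cumulative_rank_frequencies(["a", "b", "c", "d"], {("a", "c"), ("b", "c"), ("b", "d")})
ranking_n = crf_ranking(crf_n)

# Hypothetical poset 2 (diamond): a above b and c, which are both above d.
crf_diamond = cumulative_rank_frequencies(
    ["a", "b", "c", "d"], {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
)
ranking_diamond = crf_ranking(crf_diamond)
```

In the first poset the curves stack cleanly and give b, a, d, c from best to worst. In the diamond, b and c have identical curves and end up tied.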

Cumulative Rank Frequency Operator – 6: An Example of the Procedure
[Figure: cumulative rank-frequency curves for elements a through f, rising to 16.]
The curves are stacked one above the other and the result is a linear ordering of the elements: a > b > c > d > e > f

Cumulative Rank Frequency Operator – 8: An example where F results in ties
[Figure: original poset (Hasse diagram) on elements a, b, c, d; applying F gives a, then b and c (tied), then d.]
Ties reflect symmetries among incomparable elements in the original Hasse diagram.
Elements that are comparable in the original Hasse diagram will not become tied after applying the F operator.

Cumulative Rank Frequency Operator – 7: An example where F must be iterated
[Figure: original poset (Hasse diagram) on elements a through h, the result of applying F once, and the result of applying F twice (F^2).]

Ranking Possible Disease Clusters in the State of New York Data Matrix

Composite Indexes -- 1
I1, I2, . . ., Ip: indicators for ranking elements of some set
G(I1, I2, . . ., Ip) = composite index
Many possible choices for G
The only general requirement is that G must be increasing in each indicator separately
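Two familiar choices of G that satisfy the monotonicity requirement are a weighted arithmetic mean and a weighted geometric mean. The sketch below is illustrative only: the function names, weights, and indicator values are hypothetical, and the geometric form assumes strictly positive indicators. It also shows that two admissible choices of G can rank the same pair of elements in opposite order.

```python
import math


def arithmetic_index(indicators, weights):
    """Weighted sum: increasing in each indicator when every weight is positive."""
    return sum(w * x for w, x in zip(weights, indicators))


def geometric_index(indicators, weights):
    """Weighted geometric mean: increasing in each positive indicator for positive weights."""
    return math.prod(x ** w for w, x in zip(weights, indicators))


# Hypothetical elements scored on two indicators, with equal weights.
weights = (0.5, 0.5)
u = (2.0, 8.0)   # uneven profile
v = (4.5, 4.5)   # balanced profile

u_above_v_arithmetic = arithmetic_index(u, weights) > arithmetic_index(v, weights)
u_above_v_geometric = geometric_index(u, weights) > geometric_index(v, weights)
```

Under the arithmetic index u scores 5.0 against 4.5 for v, while under the geometric index u scores about 4.0 against about 4.5, so the ranking of u and v flips with the choice of G.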

Composite Indexes -- 2
Each choice of G determines a set of G-contours in indicator space and thereby determines a set of substitution or trade-off rules among the indicators.
[Figure: a contour of constant G in the plane of Indicator 1 (x) and Indicator 2 (y); along the contour, a change in x substitutes for a change in y.]
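The trade-off rule carried by a contour can be written out under the added assumption, not stated on the slide, that G is differentiable with positive partial derivatives. Here x and y are the two indicators and c is the constant value of G on the contour:

```latex
G(x, y) = c
\quad\Longrightarrow\quad
\frac{\partial G}{\partial x}\,dx + \frac{\partial G}{\partial y}\,dy = 0
\quad\Longrightarrow\quad
\frac{dy}{dx} = -\,\frac{\partial G/\partial x}{\partial G/\partial y} < 0 .

% Illustrative special case with hypothetical weights w_1, w_2 > 0:
G(x, y) = w_1 x + w_2 y
\quad\Longrightarrow\quad
\frac{dy}{dx} = -\,\frac{w_1}{w_2} .
```

In the linear case, one unit gained on Indicator 1 offsets w_1/w_2 units lost on Indicator 2 everywhere in indicator space, so choosing the weights is the same as choosing the substitution rule.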

Comparison of Ranking Methods: Composite Indicator Approach
Requires choice of composite index G.
Implicitly or explicitly, requires choice of substitution rules among different indicators (this is often like comparing apples and oranges).
Difficult to achieve a consensus on choice of G. The final decision is often made on the basis of mathematical simplicity instead of scientific substance.
Once G is chosen, future elements are easily incorporated into the ranking without changing the relative ranks of earlier elements.

Comparison of Ranking Methods: Poset Cumulative Rank Frequency Approach
Entirely objective: no arbitrary choices involved.
Computationally challenging (typically requires combinatorial MCMC).
The final ranking applies only to the given set of elements and reflects the overall structure of the entire Hasse diagram.
If new elements are added to the collection to be ranked, all computations must be redone and the relative rankings of earlier elements may change.
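When the poset is too large to enumerate, rank frequencies can be estimated by sampling linear extensions. The slides do not say which sampler is meant by "combinatorial MCMC". The sketch below uses one standard option, a lazy adjacent-transposition walk whose stationary distribution is uniform over linear extensions, on a hypothetical four-element poset. The function names, step count, and seed are illustrative, and no burn-in or thinning is applied.

```python
import random


def is_linear_extension(order, better_than):
    """True if every pair (x, y) in better_than has x placed before y."""
    position = {e: i for i, e in enumerate(order)}
    return all(position[x] < position[y] for x, y in better_than)


def sample_rank_counts(start, better_than, steps, seed=0):
    """Run the lazy adjacent-swap walk and tally the rank of each element at every step.

    better_than must contain every immediate (covering) comparison of the poset.
    """
    if not is_linear_extension(start, better_than):
        raise ValueError("start must be a linear extension of the poset")
    rng = random.Random(seed)
    order = list(start)
    n = len(order)
    counts = {e: [0] * n for e in order}
    for _ in range(steps):
        i = rng.randrange(n - 1)
        x, y = order[i], order[i + 1]
        # Swap neighbours with probability 1/2, but only if they are incomparable.
        if rng.random() < 0.5 and (x, y) not in better_than:
            order[i], order[i + 1] = y, x
        for rank, e in enumerate(order):
            counts[e][rank] += 1
    return counts


# Hypothetical poset: a above c, b above c, b above d.
better_than = {("a", "c"), ("b", "c"), ("b", "d")}
steps = 20000
counts = sample_rank_counts(["a", "b", "c", "d"], better_than, steps)

# Estimated probability that b receives rank 1 (exact value for this poset is 3/5).
estimate_b_first = counts["b"][0] / steps
```

Dividing each row of `counts` by the number of steps gives estimated rank-frequency distributions, which can then be accumulated and compared in the same way as the exact tables.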

Incorporating Judgment: Poset Cumulative Rank Frequency Approach
Certain of the indicators may be deemed more important than the others.
Such differential importance can be accommodated by the poset cumulative rank frequency approach.
Instead of the uniform distribution on the set of linear extensions, we may use an appropriately weighted probability distribution, e.g.,

Multiple Criteria Analysis: Multiple Indicators and Choices
Health Statistics: Disease Etiology, Health Policy, Resource Allocation
First stage screening: significant clusters by SaTScan and/or upper surface level echelon sets
Second stage screening: multicriteria noteworthy clusters by partially ordered sets and Hasse diagrams
Final stage screening: follow-up clusters for etiology and intervention, based on multiple criteria using Hasse diagrams