Mining Statistically Significant Co-location and Segregation Patterns.


Outline
– Motivation
– Key concepts
– Problem definition
– Related work
– Challenges
– Contributions
– Validation

Motivation
Finding co-located events provides useful evidence for decision making and scientific research in fields such as:
– Ecology
– Biology
– Epidemiology
– …
Co-location patterns caused by randomness need attention. They can arise from:
– Presence of spatial autocorrelation
– Abundance of feature instances
– …

Key Concept (1)

Key Concept (2)
Null hypothesis
– A hypothesis that one tries to disprove using the observations in the dataset.
Alternative hypothesis
– The complement of the null hypothesis, which is accepted when the null hypothesis is rejected.

Key Concept (2), continued
Null hypothesis
– For a co-location pattern C, a participation index at least as high as the observed one can be obtained under a random feature distribution (with spatial autocorrelation taken into account).
– For a segregation pattern C, a participation index at least as low as the observed one can be obtained under a random feature distribution.
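The participation index used in these hypotheses can be illustrated with a minimal sketch for a two-feature pattern {A, B}. It assumes the standard definition from the co-location literature (the minimum, over the features in the pattern, of the fraction of that feature's instances that take part in the pattern) and a Euclidean distance threshold `radius`. The function name, the point representation, and the sample coordinates are hypothetical and chosen only for illustration.

```python
from math import hypot


def participation_index(points_a, points_b, radius):
    """Participation index of the pair pattern {A, B}.

    Assumption: two instances are neighbors when their Euclidean
    distance is at most `radius`. Points are (x, y) tuples.
    """
    if not points_a or not points_b:
        return 0.0

    def participation_ratio(src, dst):
        # Fraction of `src` instances with at least one `dst` neighbor.
        hits = sum(
            1
            for (x, y) in src
            if any(hypot(x - u, y - v) <= radius for (u, v) in dst)
        )
        return hits / len(src)

    # The index is the smaller of the two participation ratios.
    return min(
        participation_ratio(points_a, points_b),
        participation_ratio(points_b, points_a),
    )


# Hypothetical toy data: only one A instance lies close to a B instance.
a = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
b = [(0.1, 0.0), (9.0, 9.0)]
pi_ab = participation_index(a, b, radius=0.5)
```

Here one of three A instances and one of two B instances participate, so the index is the smaller ratio, 1/3. A high value suggests co-location and a low value suggests segregation, but neither says whether the value could have arisen by chance, which is what the significance test addresses.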

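Key Concept (3)
Statistical significance
– Significance is determined by the significance level α (the Type I error rate), which is the probability of rejecting the null hypothesis given that it is true.
– For each observed pattern, the p-value is the probability, under the null hypothesis, of obtaining a participation index at least as extreme as the observed one. The pattern is statistically significant when its p-value is at most α.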
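A Monte Carlo estimate of such a p-value can be sketched as follows. This is a simplified illustration, not the method of the presented work: the null model here keeps feature A fixed and scatters feature B uniformly at random over a rectangular window, so it ignores spatial autocorrelation, which the presented approach takes into account. The names `pair_pi` and `monte_carlo_p_value`, the window size, and the data are hypothetical.

```python
import random
from math import hypot


def pair_pi(a, b, radius):
    """Participation index of {A, B} under a distance threshold."""
    if not a or not b:
        return 0.0

    def ratio(src, dst):
        hits = sum(
            1
            for (x, y) in src
            if any(hypot(x - u, y - v) <= radius for (u, v) in dst)
        )
        return hits / len(src)

    return min(ratio(a, b), ratio(b, a))


def monte_carlo_p_value(a, b, radius, width, height,
                        runs=199, seed=0, colocation=True):
    """Estimate the p-value of the observed participation index.

    Assumed null model (a simplification): B is placed uniformly at
    random in [0, width] x [0, height], independently of A.
    """
    rng = random.Random(seed)
    observed = pair_pi(a, b, radius)
    extreme = 0
    for _ in range(runs):
        sim_b = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in b]
        simulated = pair_pi(a, sim_b, radius)
        # Co-location: count simulations at least as high as observed.
        # Segregation: count simulations at least as low as observed.
        if (simulated >= observed) if colocation else (simulated <= observed):
            extreme += 1
    # Count the observed data as one realization of the null model.
    return (extreme + 1) / (runs + 1)


# Hypothetical data: every B instance sits 0.5 units from an A instance.
a = [(5.0 * i, 5.0 * i) for i in range(20)]
b = [(x + 0.5, y) for (x, y) in a]
p_coloc = monte_carlo_p_value(a, b, radius=1.0, width=100.0, height=100.0)
p_segr = monte_carlo_p_value(a, b, radius=1.0, width=100.0, height=100.0,
                             colocation=False)
significant = p_coloc <= 0.05  # compare against a chosen alpha
```

Each simulation run recomputes the participation index on a full synthetic dataset, which is why the number of runs dominates the cost of this kind of test.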

Key Concept (4)

Problem Definition

Related work

Approach | Co-location Patterns | Segregation Patterns | Significance Test
Spatial Co-location Patterns Detection | √ | |
Spatial Segregation Patterns Detection | | √ |
Mining Statistically Significant Co-location and Segregation Patterns | √ | √ | √

Challenges
– Co-location and segregation patterns determined by a manually set threshold can produce false positives and are sensitive to the dataset.
– No probability model is available to compute the p-value in closed form.
– Testing significance through Monte Carlo simulation is computationally expensive.

Contributions
Incorporates a statistical significance test into co-location and segregation pattern detection, which reduces spurious patterns caused by randomness.
Proposes three approaches for accelerating the algorithm:
– a subset-based filter
– a grid-based sampling framework
– a spatial-join based pruning technique

Subset-based Filter

Grid-based Sampling

Spatial-join Based Pruning

Quality of Approximation – Grid-based Participation Index

Inhibition (synthetic data set)

Auto-correlation (synthetic data set)

Mixed Spatial Interactions (synthetic data set)

Runtime Comparison (1): fixed total number of clusters for each auto-correlated feature

Runtime Comparison (2): varying total number of clusters for each auto-correlated feature

Experiments (real data sets)
– Ants
– Bramble Canes
– Lansing Woods
– Toronto address repository

Ants Data