Jong Gyu Han, Keun Ho Ryu, Kwang Hoon Chi and Yeon Kwang Yeon

Presentation transcript:

Statistics Based Predictive Geo-Spatial Data Mining: Forest Fire Hazardous Area Mapping Application
Jong Gyu Han, Keun Ho Ryu, Kwang Hoon Chi and Yeon Kwang Yeon
26-10-2003

Problem Definition
- Forest fire prevention
- Finding the spatial-temporal distribution of forest fires
- Predicting forest fire hazardous areas from large spatial data sets
- This leads to a forest fire hazard prediction model

Problem Definition 2
Youngdong Region of Kangwon Province, Republic of Korea
Using:
- Historical data on fire ignition point locations
- Grid-based multi-layer GIS

Prediction Methods
Prediction depends on the relationship between the spatial data sets relevant to forest fire and the areas of previous forest fire ignition.
- N[S] = entire study area
- N[F] = fire ignition areas
- N[A] = area of forest type A
- N[E] = area of fire ignition on forest type A, i.e. N[A ∩ F]

Conditional Probability Prediction Model
Average density of ignition areas: $P(F) = N[F] / N[S]$. Without other information, this is the probability of a forest fire ignition area.
Favourability of finding a forest fire ignition area given the presence of forest type A:
$\mathrm{CondP}(F \mid A) = \dfrac{P(A \mid F) \cdot P(F)}{P(A)}$
$P(A \mid F) = \dfrac{P(A \cap F)}{P(F)}$
$P(A \cap F) = N[A \cap F] / N[S] = N[E] / N[S]$

Conditional Probability Prediction Model: Example
N[S] = 100,000; N[F] = 500; N[A] = 2,500; N[E] = 100
P(F) = N[F] / N[S] = 500 / 100,000 = 0.005
P(A|F) = N[E] / N[F] = 100 / 500 = 0.2
P(A) = N[A] / N[S] = 2,500 / 100,000 = 0.025
CondP(F|A) = P(A|F) · P(F) / P(A) = 0.2 × 0.005 / 0.025 = 0.04
Given the presence of forest type A, the probability of a forest fire occurrence is 8 times greater than the prior probability (0.04 / 0.005 = 8).
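The arithmetic of this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `conditional_probability` and its parameter names are illustrative choices, not something the slides define, and the inputs are the four area counts given in the example.

```python
def conditional_probability(n_s, n_f, n_a, n_e):
    """CondP(F|A) from area counts, following the slide's Bayes formulation.

    n_s: total study area N[S]
    n_f: fire ignition area N[F]
    n_a: area of forest type A, N[A]
    n_e: fire ignition area on forest type A, N[E]
    """
    p_f = n_f / n_s          # prior probability of an ignition area
    p_a = n_a / n_s          # probability of forest type A
    p_a_given_f = n_e / n_f  # share of ignition area lying on type A
    # Bayes' theorem; algebraically this reduces to n_e / n_a.
    return p_a_given_f * p_f / p_a


# Counts taken from the example slide.
prior = 500 / 100_000
cond_p = conditional_probability(n_s=100_000, n_f=500, n_a=2_500, n_e=100)
ratio_to_prior = cond_p / prior  # how much forest type A raises the probability
print(round(cond_p, 4), round(ratio_to_prior, 2))
```

Because the terms cancel, the result is simply N[E] / N[A], which is a quick sanity check on any hand calculation.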

Likelihood Ratio Prediction Model
Represents the ratio of two spatial distribution functions: one for areas with forest fire occurrences and one for areas without.
$LR(A \mid F) = \dfrac{P(A \mid F)}{P(A \mid \bar{F})} = \dfrac{N[E] \cdot (N[S] - N[F])}{N[F] \cdot (N[A] - N[E])}$
With N[S] = 100,000, N[F] = 500, N[A] = 2,500 and N[E] = 100:
N[E] · (N[S] − N[F]) = 100 × (100,000 − 500) = 9,950,000
N[F] · (N[A] − N[E]) = 500 × (2,500 − 100) = 1,200,000
LR(A|F) = 9,950,000 / 1,200,000 ≈ 8.2917
- LR > 1: positive evidence for forest fire ignition
- LR = 1: uncorrelated
- LR < 1: negatively correlated
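The likelihood ratio can be computed the same way. The sketch below is illustrative only: `likelihood_ratio` is a hypothetical helper name, and it assumes that the area of forest type A without ignition is N[A] − N[E] and the total area without ignition is N[S] − N[F], as the slide's formula implies.

```python
def likelihood_ratio(n_s, n_f, n_a, n_e):
    """LR for forest type A: P(A | fire) / P(A | no fire), from area counts."""
    p_a_given_fire = n_e / n_f                      # type A share of ignition area
    p_a_given_no_fire = (n_a - n_e) / (n_s - n_f)   # type A share of the rest
    return p_a_given_fire / p_a_given_no_fire


# Counts taken from the example slide.
lr = likelihood_ratio(n_s=100_000, n_f=500, n_a=2_500, n_e=100)

# Interpretation thresholds as listed on the slide.
if lr > 1:
    evidence = "positive evidence for ignition"
elif lr < 1:
    evidence = "negatively correlated"
else:
    evidence = "uncorrelated"
print(round(lr, 4), evidence)
```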

Prediction Procedure
- Forestry maps
- Topography maps
- Human activities
- Fire history data
- A large number of thematic layers can be related to forest fire occurrences
- The relevance filter is subjective: thematic layers are user-selected

Forest Fire Hazard Rate
Multiple-layer integration shares intermediate analysis with other levels.
FHR: Forest Fire Hazard Rate, computed over the thematic layers $i = 1, \dots, m$:
$FHR(p)_{CondP} = CondP(V_1(p)) \times \dots \times CondP(V_m(p))$
$FHR(p)_{LR} = LR(V_1(p)) \times \dots \times LR(V_m(p))$
- $V_i(p)$ = attribute value at point p in thematic map i
- CondP = Conditional Probability
- LR = Likelihood Ratio
For each local area, an FHR can be computed and the fire ignition danger can be analysed.
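As a rough illustration of the product over layers, the sketch below combines per-layer scores for a single grid cell. Everything concrete in it is an assumption made for the example: the helper name `hazard_rate`, the two layers, their class labels and the score values are invented and do not come from the study.

```python
from math import prod


def hazard_rate(cell_values, score_tables):
    """FHR(p): product of the per-layer scores for one grid cell.

    cell_values:  attribute value V_i(p) of the cell in each thematic layer
    score_tables: one lookup table per layer, mapping an attribute value to
                  its CondP or LR score (estimated beforehand per layer)
    """
    return prod(table[value] for value, table in zip(cell_values, score_tables))


# Hypothetical LR lookup tables for two layers (values invented for the demo).
lr_forest_type = {"conifer": 8.0, "broadleaf": 2.0, "mixed": 0.5}
lr_slope_class = {"gentle": 1.0, "steep": 1.5}

tables = [lr_forest_type, lr_slope_class]
fhr_cell_a = hazard_rate(["conifer", "steep"], tables)
fhr_cell_b = hazard_rate(["mixed", "gentle"], tables)
print(fhr_cell_a, fhr_cell_b)
```

Multiplying the scores this way treats the layers as independent pieces of evidence, which is why the slides require the chosen layers to be conditionally independent.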

Experiment: Attribute Selection
For practical use, thematic layers must be selected based on their relative importance for explaining fire ignition.
Condition: the chosen layers have to be conditionally independent.
Layers for the experiment:
- Forest type
- Elevation
- Slope
- Road network
- Farms
- Building boundaries

Experiment: Data Sets
- The time of study is assumed to be 1996: all spatial data available in 1996 are compiled, including the distribution of fire ignition locations that occurred prior to that year.
- Cross-validation: predictions based on those relationships are evaluated by comparing the estimated hazard classes with the distribution of forest fire ignition locations that occurred after 1996, during the period 1997 to 2001.
- The evaluation of Conditional Probability and Likelihood Ratio can be expressed in a prediction rate curve.
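The slides do not spell out how the prediction rate curve is built. One common construction, assumed here, ranks cells from highest to lowest estimated hazard and records, for each cumulative share of the area, the share of later ignitions that falls inside it. The function name and the toy data below are hypothetical.

```python
def prediction_rate_curve(scores, later_fires):
    """Return (area fraction, captured fire fraction) points.

    scores:      estimated hazard per cell, from data up to the study year
    later_fires: number of ignitions per cell in the validation period
    """
    total_fires = sum(later_fires)
    if total_fires == 0:
        raise ValueError("validation period contains no ignitions")
    # Rank cells from most to least hazardous.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve = []
    captured = 0
    for rank, cell in enumerate(order, start=1):
        captured += later_fires[cell]
        curve.append((rank / len(scores), captured / total_fires))
    return curve


# Toy data: four cells, two of which burned in the validation period.
curve = prediction_rate_curve(
    scores=[0.9, 0.1, 0.5, 0.3],
    later_fires=[1, 0, 1, 0],
)
for area_fraction, fire_fraction in curve:
    print(area_fraction, fire_fraction)
```

Under this construction, a curve that rises steeply at small area fractions means the model concentrates later ignitions in the cells it ranked as most hazardous.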

Experiment: Evaluation
Prediction rate curve of both models.
Conclusion: Likelihood Ratio is a more powerful method than Conditional Probability. The estimated models show acceptable prediction rates with respect to the 'future' 1997 to 2001 forest fire occurrences.

Experiment: Visualisation
Using the Forest Fire Hazard Index (FHI):
- Sort the estimated probabilities of all pixels in descending order
- Divide the ordered pixels into 11 classes: pixels with the highest 5% estimated probability are classified as the first class, the next 5% as the second class, and so on; the remaining lowest 50% is assigned to the last class
- Add a colour to each class
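The classification step can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: the function name, the parameter names and the tie handling (ties keep their input order) are assumptions.

```python
def hazard_index_classes(scores, percent_step=5, n_top_classes=10):
    """Assign each pixel an FHI class from 1 (most hazardous) to 11.

    The top 5% of pixels by score get class 1, the next 5% class 2, and so
    on for ten classes; the remaining lowest 50% all get class 11.
    """
    n = len(scores)
    # Pixel indices ordered by descending estimated probability.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    classes = [0] * n
    for rank, pixel in enumerate(order):
        # Integer arithmetic avoids floating-point errors at class borders.
        band = rank * 100 // (n * percent_step)
        classes[pixel] = min(band + 1, n_top_classes + 1)
    return classes


# Toy input: 100 pixels whose score equals their index.
scores = list(range(100))
classes = hazard_index_classes(scores)
print(classes[99], classes[94], classes[50], classes[0])
```

Each class can then be mapped to a colour for display; the colour scheme itself is not specified in the slides.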

Conclusion
- Statistics-based forest fire prediction works well.
- The Likelihood Ratio method is more powerful than the Conditional Probability method.
- Prediction of forest fire hazardous areas could help increase the efficiency of forest fire management: the ability to quantify the ignition risk could lead to a more informed allocation of fire prevention resources.

Questions