2. Probabilistic Mineral Resource Potential Mapping

2. Probabilistic Mineral Resource Potential Mapping. Processing geo-scientific information to estimate probabilities of occurrence for various types of mineral deposits became easier when Geographic Information Systems became available. Weights-of-Evidence modeling and logistic regression are examples of GIS implementations.

Weights of Evidence (WofE)

BAYES’ RULE
P(D on A) = P(D and A) / P(A)
P(A on D) = P(A and D) / P(D)
P(D on A) = P(A on D) * P(D) / P(A)
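In the deck's notation, "on" reads as "given". A minimal numeric check of the three identities, using made-up unit-cell counts (N, n, N_A, n_A are illustrative, not values from the slides):

```python
# Bayes' rule with hypothetical unit-cell counts (illustrative, not from
# the slides): N unit cells, n deposits (D), N_A cells on layer A,
# n_A deposits that fall on A.
N, n, N_A, n_A = 10000, 50, 2000, 40

P_D = n / N              # prior P(D)
P_A = N_A / N            # P(A)
P_D_and_A = n_A / N      # P(D and A)

# P(D on A) = P(D and A) / P(A)
P_D_on_A = P_D_and_A / P_A

# Equivalent form: P(D on A) = P(A on D) * P(D) / P(A)
P_A_on_D = n_A / n
assert abs(P_D_on_A - P_A_on_D * P_D / P_A) < 1e-12

print(round(P_D_on_A, 3))  # 0.02
```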

ODDS & LOGITS
O = P/(1-P); P = O/(1+O); logit = ln O
ln O(D on A) = W+(A) + ln O(D)
W+(A) = ln {P(A on D) / P(A not on D)}

VARIANCE OF WEIGHT
s² = 1/n(A and D) + 1/n(A and not D)

Negative Weight & Contrast
W-(A) = W+(not A)
Contrast: C = W+(A) - W-(A)
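The positive weight, negative weight, their variances, and the contrast can all be computed from four counts. A sketch with hypothetical counts (not the Meguma numbers; s_C is the usual standard deviation of the contrast, combining both weight variances):

```python
from math import log, sqrt

# Hypothetical counts: N unit cells, n deposits, N_A cells on layer A,
# n_A deposits on A.  (Illustrative values, not from the slides.)
N, n = 10000, 50
N_A, n_A = 2000, 40
N_notA, n_notA = N - N_A, n - n_A

# W+(A) = ln{ P(A on D) / P(A not on D) }
P_A_on_D = n_A / n
P_A_on_notD = (N_A - n_A) / (N - n)
W_plus = log(P_A_on_D / P_A_on_notD)

# W-(A) = W+(not A)
P_notA_on_D = n_notA / n
P_notA_on_notD = (N_notA - n_notA) / (N - n)
W_minus = log(P_notA_on_D / P_notA_on_notD)

# Contrast, and s^2 = 1/n(A and D) + 1/n(A and not D) for each weight:
C = W_plus - W_minus
var_W_plus = 1 / n_A + 1 / (N_A - n_A)
var_W_minus = 1 / n_notA + 1 / (N_notA - n_notA)
s_C = sqrt(var_W_plus + var_W_minus)
print(round(W_plus, 3), round(W_minus, 3), round(C, 3), round(s_C, 3))
```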

PRESENT, ABSENT or MISSING: add W+, W-, or 0 to the prior logit

TWO or MORE LAYERS
Add weights assuming conditional independence:
P(A and B on D) = P(A on D) * P(B on D)
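On the logit scale the conditional-independence assumption lets the weights simply add to the prior logit. A sketch with a hypothetical prior and hypothetical weights W_A, W_B (both layers taken as present in the unit cell):

```python
from math import exp, log

# Hypothetical prior and weights (illustrative values only):
n, N = 50, 10000
prior_P = n / N
prior_logit = log(prior_P / (1 - prior_P))

W_A = 1.4   # W+ for layer A, present in this unit cell
W_B = 0.9   # W+ for layer B, present in this unit cell

# Assuming conditional independence, weights add on the logit scale:
post_logit = prior_logit + W_A + W_B
post_P = exp(post_logit) / (1 + exp(post_logit))
print(round(prior_P, 4), round(post_P, 4))
```

Positive weights for both layers raise the posterior probability above the prior, as expected.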

UNCERTAINTY DUE TO MISSING DATA
P(D) = E_X{P(D on X)} = Σ P(D on A_i) * P(A_i), and similarly for the classes of any other missing layer

VARIANCE (MISSING DATA)
σ²{P(D)} = Σ {P(D on A_i) - P(D)}² * P(A_i), and similarly for the classes of any other missing layer
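Both formulas are just the mean and variance of P(D on ·) over the classes of the missing layer. A worked check with two hypothetical classes (A and not A; all numbers invented):

```python
# If a layer's value is missing, average over its classes:
#   P(D) = sum_i P(D on A_i) * P(A_i)
#   sigma^2{P(D)} = sum_i {P(D on A_i) - P(D)}^2 * P(A_i)
P_D_on_A = [0.02, 0.004]   # P(D on A), P(D on not A)  (hypothetical)
P_A = [0.2, 0.8]           # P(A), P(not A)            (hypothetical)

P_D = sum(p * w for p, w in zip(P_D_on_A, P_A))
var_missing = sum((p - P_D) ** 2 * w for p, w in zip(P_D_on_A, P_A))
print(round(P_D, 4), round(var_missing, 8))
```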

TOTAL UNCERTAINTY
Var(Posterior Logit) = Var(Prior Logit) + Σ Var(Weights) + Var(Missing Data)

Uncertainty in Logits and Probabilities
d{Logit(P)}/dP = 1/{P(1-P)}, so σ(P) ≈ P(1-P) * σ{Logit(P)}
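This is the delta method: a standard deviation on the probability scale is approximately P(1-P) times the standard deviation on the logit scale. A quick numerical comparison with hypothetical values of P and σ(logit):

```python
from math import exp, log

# Hypothetical probability and logit-scale standard deviation:
P, s_logit = 0.1, 0.5

# Delta-method approximation: sigma(P) ~ P(1-P) * sigma(logit P)
approx = P * (1 - P) * s_logit

def inv_logit(z):
    return exp(z) / (1 + exp(z))

# Compare with directly mapping logit +/- sigma back to probabilities:
z = log(P / (1 - P))
direct = (inv_logit(z + s_logit) - inv_logit(z - s_logit)) / 2
print(round(approx, 4), round(direct, 4))
```

The two values agree closely for moderate σ, which is all the slide's approximation claims.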

Meguma Terrane Example

Table 1. Number of gold deposits, area in km², weights, and contrast (C) with standard deviations (s). In total: 68 deposits on 2945 km².

Logistic Regression
Logit(π_i) = β_0 + x_i1 β_1 + x_i2 β_2 + … + x_im β_m

Newton-Raphson Iteration
β(t+1) = β(t) + {X^T V(t) X}^-1 X^T r(t), t = 1, 2, …
r(t) = y - p(t)
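For the bivariate case (one predictor) the update can be written out without matrix libraries, inverting the 2x2 Hessian by hand. The data below are invented for illustration; b0, b1 play the role of β:

```python
from math import exp

# Newton-Raphson (IRLS) for Logit(pi_i) = b0 + b1*x_i.
# x (predictor) and y (0/1 outcome) are hypothetical data.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0,   0,   0,   1,   0,   1,   1,   1  ]

b0, b1 = 0.0, 0.0
for _ in range(25):
    p = [1 / (1 + exp(-(b0 + b1 * xi))) for xi in x]
    # Gradient X^T r with residual r = y - p
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
    # Hessian X^T V X with V = diag{p(1-p)}
    v = [pi * (1 - pi) for pi in p]
    h00 = sum(v)
    h01 = sum(vi * xi for vi, xi in zip(v, x))
    h11 = sum(vi * xi * xi for vi, xi in zip(v, x))
    det = h00 * h11 - h01 * h01
    # beta(t+1) = beta(t) + {X^T V X}^-1 X^T r
    b0 += ( h11 * g0 - h01 * g1) / det
    b1 += (-h01 * g0 + h00 * g1) / det

# At convergence the likelihood equations hold, so sum(p) == sum(y):
p = [1 / (1 + exp(-(b0 + b1 * xi))) for xi in x]
print(round(sum(p), 4), sum(y))
```

At the maximum-likelihood solution the fitted probabilities sum to the observed number of events, the identity a later slide states as Σ y_i = Σ π(x_i).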

Seafloor Example

NEW CONDITIONAL INDEPENDENCE TEST FOR WEIGHTS OF EVIDENCE METHOD

Definitions
N = number of unit cells
N_A = number of unit cells on map layer A
n = number of deposits
n_A = number of deposits on map layer A
P(d|A) = probability that a unit cell on A contains a deposit
X_A = binary random variable for occurrence of a deposit in a unit cell on A, with E[X_A] = P(d|A) = n_A / N_A
T = random variable for the number of deposits in the study area

Single binary pattern A (~A = not A)
Σ Posterior Probabilities = N_A * P(d|A) + N_~A * P(d|~A)
= N_A * {n_A / N_A} + N_~A * {n_~A / N_~A} = n
σ²(T) = N_A² * σ²(X_A) + N_~A² * σ²(X_~A)

Two binary patterns (A and B)
Σ Posterior Probabilities = N_AB * P(d|AB) + N_A~B * P(d|A~B) + N_~AB * P(d|~AB) + N_~A~B * P(d|~A~B)
= n_AB + n_A~B + n_~AB + n_~A~B
= n_A * n_B / n + n_A * n_~B / n + n_~A * n_B / n + n_~A * n_~B / n
= n_A * {n_B + n_~B} / n + n_~A * {n_B + n_~B} / n = n
σ²(T) = N_AB² * σ²(X_AB) + N_A~B² * σ²(X_A~B) + N_~AB² * σ²(X_~AB) + N_~A~B² * σ²(X_~A~B)
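The middle lines are the point of the test: if the conditional-independence estimates n_A*n_B/n etc. are substituted for the overlap counts, the predicted total reduces exactly to the observed n. A small check with hypothetical deposit counts:

```python
# Under conditional independence the predicted total over the four
# overlap classes reduces to n.  Hypothetical counts: n deposits in
# total, n_A on layer A, n_B on layer B.
n, n_A, n_B = 50, 40, 30
n_notA, n_notB = n - n_A, n - n_B

T = (n_A * n_B / n + n_A * n_notB / n +
     n_notA * n_B / n + n_notA * n_notB / n)
print(T)  # 50.0
```

In practice the test compares the model's predicted total (with its standard deviation σ(T)) against the observed n; a predicted total far above n signals a conditional-independence violation.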

New conditional independence test applied to the ocean-floor hydrothermal vent example
Total number of vents n = 13
3-map layer model: predicted total … (s.d. = 6.45), P(T = N) > 99% (c.l. = 28.03)
5-map layer model: predicted total … (s.d. = 10.47), P(T > N) > 99% (c.l. = 37.40)

Application of Weights of Evidence Method for Assessment of Flowing Wells in the Greater Toronto Area, Canada By Qiuming Cheng, Natural Resources Research, vol. 13, no. 2, June 2004

ORM Study Area and Surficial Geology of Southern Ontario

Oak Ridges Moraine

Digital Elevation Model Southern Ontario

DEM and Location of ORM

Geology of ORM

Flowing Wells and Springs

Spatial Decision Support System (SDSS): GIS data integration for predicting aquifers.
Flowchart: GIS database (DBMS) → data preprocessing → evidential layers X (drift thickness, slope, lithology) → modeling (F) → output potential map. Correlated patterns are defined using the training points, then integrated to estimate unknown points.

Flowing Wells vs. Distance from ORM (plot of spatial correlation vs. distance)

Flowing Wells vs. Distance from High Slope Zone (plot of spatial correlation vs. distance)

Flowing Wells vs. Thickness of Drift

Flowing Wells vs. Distance from Thick Drift (plot of spatial correlation vs. distance)

Posterior Probability Map calculated by Arc-WofE from buffer zones around ORM and steep slope zones

Mapping potential groundwater discharges using Multivariate Logistic Regression

Modelling Uncertainty in Weights due to Kriging variance

Linear Regression with Missing Data
Y = β_0 + β_1 x + ε
b_1 = Σ(x_i - m_x)(y_i - m_y) / Σ(x_i - m_x)²
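The slope formula in deviation form, checked on a small hypothetical data set (x, y values are invented):

```python
# Least-squares slope from the slide's formula:
#   b1 = sum{(x_i - m_x)(y_i - m_y)} / sum{(x_i - m_x)^2}
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
y = [2.1, 3.9, 6.2, 8.1, 9.8]

m_x = sum(x) / len(x)
m_y = sum(y) / len(y)
b1 = sum((xi - m_x) * (yi - m_y) for xi, yi in zip(x, y)) / \
     sum((xi - m_x) ** 2 for xi in x)
b0 = m_y - b1 * m_x              # intercept through the means
print(round(b1, 2), round(b0, 2))  # 1.96 0.14
```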

Table 2. Comparison of 4 logistic regression solutions: A. Layer deleted; B. Absences set to 0; C. Cells deleted; D. Use of Weighted Mean.

Logistic Regression & Maximum Likelihood
P(Y=1|x) = π(x) = e^f(x) / {1 + e^f(x)}
P(Y=0|x) = 1 - π(x)
ℓ(x_i) = π(x_i)^y_i * {1 - π(x_i)}^(1 - y_i)
l(β) = Π ℓ(x_i)

Bivariate Logistic Regression
Logit(π_i) = β_0 + x_i1 β_1
β = [β_0 β_1]

Log Likelihood Function
L(β) = ln{l(β)} = Σ [y_i * ln{π(x_i)} + (1 - y_i) * ln{1 - π(x_i)}]

Differentiate with respect to β_0 and β_1 to obtain the likelihood equations:
Σ {y_i - π(x_i)} = 0
Σ x_i {y_i - π(x_i)} = 0

Total number of discrete events = sum of estimated probabilities:
Σ y_i = Σ π(x_i)
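This identity is a direct consequence of the first likelihood equation. For an intercept-only model it can be checked by hand, since the MLE is simply π̂ = mean(y) (the 0/1 data below are hypothetical):

```python
from math import log

# Intercept-only logistic model: Logit(pi) = b0, so the MLE is
# pi_hat = mean(y).  The likelihood equation sum(y_i - pi) = 0 then
# holds at pi_hat, and the log-likelihood is maximal there.
y = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]   # hypothetical outcomes

def loglik(pi):
    return sum(yi * log(pi) + (1 - yi) * log(1 - pi) for yi in y)

pi_hat = sum(y) / len(y)                  # 0.6
score = sum(yi - pi_hat for yi in y)      # likelihood equation, ~0
print(loglik(pi_hat) > loglik(0.5), loglik(pi_hat) > loglik(0.7))
```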

Weighted logistic regression convergence experiments (level of convergence = 0.01)
Seafloor example (N = 13): unit cell of 0.01 km² → 12.72; … km² → 12.97; … km² → …
Meguma Terrane example (N = 68): unit cell of 1 km² → 64.71; 0.1 km² → 67.96