Peter Fox, GIS for Science, ERTH 4750 (98271). Week 8, Tuesday, March 20, 2012: Analysis and propagation of errors.

Presentation transcript:


Contents: Error!!!; Projects; Lab assignment on Friday.

Spatial analysis of continuous fields: Our confidence in an answer is possibly more important than the answer itself. That confidence is quantified by uncertainties, as discussed earlier. Once we combine numbers, we need to assess how the uncertainties change for the combination. This is called propagation of errors or, more correctly, propagation of our understanding or estimate of the errors in the result we are looking at.

Types of errors: mistakes; natural variation; systematic and random equipment problems; data collection methods; observer diligence; location errors/accuracy; rasterizing and digitizing; mismatch of data collected by different methods (e.g., seafloor bathymetry).

Bathymetry

Cause of errors?

Resolution

Reliability: changes in data over time; non-uniform coverage; map scales; observation density; sampling theorem (aliasing); surrogate data and their relevance; round-off errors in computers.

Error propagation: Errors arise from data quality, model quality, and data/model interaction. We need to know the sources of the errors and how they propagate through our model. The simplest representation of errors is to treat observations/attributes as statistical data and describe them by their mean and standard deviation.

Analytic approaches: addition and subtraction

Analytic approaches: multiply, divide, exponent, log
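The formulas on these two slides did not survive in the transcript. For reference, the standard first-order results for independent (uncorrelated) errors are sketched below. The symbols $A$, $B$, $s_A$, $s_B$ and the constant exponent $n$ are notation assumed here, not taken from the slides.

```latex
\begin{align*}
U &= A \pm B: & s_U^2 &= s_A^2 + s_B^2 \\
U &= AB \ \text{or}\ U = A/B: & \left(\frac{s_U}{U}\right)^2 &= \left(\frac{s_A}{A}\right)^2 + \left(\frac{s_B}{B}\right)^2 \\
U &= A^n: & \frac{s_U}{|U|} &= |n|\,\frac{s_A}{|A|} \\
U &= \ln A: & s_U &= \frac{s_A}{|A|}
\end{align*}
```

In words: absolute uncertainties add in quadrature for sums and differences, and relative uncertainties add in quadrature for products and quotients.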

Monte Carlo simulation: If a new attribute $U$ is given by $U = f(A_1, A_2, A_3, \ldots, A_n)$, where the $A$'s are attributes and $f$ is some function combining them, then we want to know the standard deviation of the combination $U$ and how the standard deviation of each $A$ contributes to it. With MC simulation we look at the statistical distribution of a large number of realizations (random samples) of $U$.

MC (ctd): A single realization of $U$ is $U_i = f(R_1, R_2, R_3, \ldots, R_n)$, where each $R_k$ is a random sample of its corresponding attribute $A_k$, drawn according to the statistical properties of $A_k$ (mean and standard deviation, for example). The probability functions for the attributes need not be Gaussian and could even be taken from histograms of observed values.

Recall: The mean $m$ and standard deviation $s$ of $U$ are estimated by $m = \frac{1}{N}\sum_{i=1}^{N} U_i$ and $s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (U_i - m)^2$, where $N$ is a large number of realizations (hundreds or thousands).
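The Monte Carlo procedure can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function name `monte_carlo`, the Gaussian and independence assumptions, and the example values are all hypothetical.

```python
import math
import random
import statistics


def monte_carlo(f, attributes, n_realizations=10000, seed=0):
    """Estimate the mean and standard deviation of U = f(A1, ..., An).

    attributes is a list of (mean, std) pairs. Each attribute is assumed
    Gaussian and independent of the others (an assumption of this sketch).
    """
    rng = random.Random(seed)
    realizations = []
    for _ in range(n_realizations):
        # Draw one random sample per attribute, then evaluate f once
        samples = [rng.gauss(mean, std) for mean, std in attributes]
        realizations.append(f(*samples))
    m = statistics.fmean(realizations)
    s = statistics.stdev(realizations)  # uses the N - 1 divisor
    return m, s


# Hypothetical example: U = A1 * A2 with A1 = 10 +/- 0.5 and A2 = 4 +/- 0.2
m, s = monte_carlo(lambda a1, a2: a1 * a2, [(10.0, 0.5), (4.0, 0.2)])

# First-order analytic estimate for a product, for comparison
s_analytic = 40.0 * math.sqrt((0.5 / 10.0) ** 2 + (0.2 / 4.0) ** 2)
print(f"MC: m = {m:.2f}, s = {s:.2f}; analytic s = {s_analytic:.2f}")
```

For a simple product like this the two estimates should agree closely. The simulation earns its keep when `f` is too complicated for the analytic rules or when the samples come from an empirical histogram.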

When to use? MC simulation is most useful when the function relating the attributes is complex or the statistical distribution is known only empirically (from a histogram, for example). For simpler combinations of attributes, there are easier, direct (analytical) ways to estimate the new uncertainties from the attribute uncertainties.

Generating pseudo-random numbers: For the Monte Carlo simulation, you will want to generate a series of random numbers with a normal (bell-curve) distribution. There are two ways to do this in Excel. In older versions of Excel, you can use Tools > Data Analysis > Random number generation > Normal distribution to generate a sequence of random numbers.

Second way: Alternatively, you can take advantage of the central limit theorem, which states that, under certain conditions, the sum (or mean) of many independent random samples from any distribution is approximately normally distributed. The Excel function RAND generates a uniformly distributed random number; that is, every number between 0 and 1 is equally likely. Each such number has mean 1/2 and variance 1/12, so the sum of 12 of them has mean 6 and variance 1. To get an approximately normally distributed random sample with mean 0 and standard deviation 1, we can therefore add 12 uniformly distributed random numbers and subtract 6.

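To get a normally distributed random sample with mean $m$ and standard deviation $s$, use $x = \left(\sum_{i=1}^{12} r_i - 6\right) s + m$, where each $r_i$ is a separate call to RAND(). In Matlab, rand generates uniform random numbers and randn generates normal ones. In R, see runif, rnorm, set.seed, and sample.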
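The same sum-of-12 trick, sketched in Python rather than Excel. The function name `normal_from_uniform` and the example mean and standard deviation are illustrative assumptions.

```python
import random
import statistics


def normal_from_uniform(m, s, rng):
    """Approximate normal sample: (sum of 12 uniforms - 6) * s + m."""
    total = sum(rng.random() for _ in range(12))
    return (total - 6.0) * s + m


rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [normal_from_uniform(5.0, 2.0, rng) for _ in range(20000)]
print(statistics.fmean(samples), statistics.stdev(samples))
```

The printed mean and standard deviation should be close to 5 and 2. Note that this approximation can never produce a value more than 6 standard deviations from the mean, so the extreme tails of the true normal distribution are cut off.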

Tip: Because this expression is quite long in Excel, you can create a macro so that you can reuse it. To record a macro, select Tools > Macro > Record new macro > give the macro a name > OK > type in the expression > Stop recording. You can refer to named cells from within a macro, so you might want to use named cells for the mean and standard deviation to keep your macro general.

Shortcuts: You can also assign a Ctrl-key shortcut to run the macro from the worksheet. Otherwise, to run the macro, go to Tools > Macro > Macros, select the macro name, and press Run. Once the macro has been run in a cell, you can drag the expression to other cells using the drag handle in the lower-right corner of the cell.

Statistical ‘tests’: F-test: tests whether two samples come from populations with the same variance, based on the ratio of their sample variances and their degrees of freedom. T-test: tests whether two samples come from populations with the same mean, based on their sample means, variances, and degrees of freedom.

F-test: $F = S_1^2 / S_2^2$, where $S_1^2$ and $S_2^2$ are the sample variances. The more this ratio deviates from 1, the stronger the evidence for unequal population variances.
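A minimal sketch of computing the F ratio in Python. The function name `f_statistic` and the sample values are made up for illustration.

```python
import statistics


def f_statistic(sample1, sample2):
    """Return F = S1^2 / S2^2 and the two degrees of freedom."""
    var1 = statistics.variance(sample1)  # sample variance, N - 1 divisor
    var2 = statistics.variance(sample2)
    return var1 / var2, len(sample1) - 1, len(sample2) - 1


# Hypothetical elevation residuals (meters) from two survey methods
method_a = [1.2, -0.8, 0.5, -1.5, 2.1, -0.3]
method_b = [0.4, -0.2, 0.1, -0.5, 0.3, -0.1]
f, df1, df2 = f_statistic(method_a, method_b)
print(f"F = {f:.2f} with ({df1}, {df2}) degrees of freedom")
```

To finish the test, compare F with the tabulated critical value for those degrees of freedom at the chosen confidence level.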

T-test
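The formula on this slide is not in the transcript. As one possible illustration, the sketch below computes the unequal-variance (Welch) form of the t statistic, which may differ from the form shown in class. The function name `welch_t` is hypothetical.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.fmean(sample1), statistics.fmean(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    se1, se2 = v1 / n1, v2 / n2  # squared standard errors of each mean
    t = (m1 - m2) / math.sqrt(se1 + se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


t, df = welch_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(f"t = {t:.2f}, df = {df:.1f}")
```

As with the F-test, the statistic is then compared with a tabulated critical value at the chosen confidence level.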

Variability

Dealing with errors: In analyses: report on the statistical properties; check whether the result passes tests at some confidence level. On maps: exclude data that are not reliable (map only a subset of the data); show an additional map of some measure of confidence.
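Excluding unreliable data from a map amounts to masking a grid by its error estimate. A minimal sketch, assuming the values and their errors are stored as matching rows of numbers; the function name `white_out`, the threshold, and the grid values are hypothetical.

```python
def white_out(values, errors, max_error):
    """Replace cells whose error exceeds max_error with None (no data)."""
    return [
        [v if e <= max_error else None for v, e in zip(vrow, erow)]
        for vrow, erow in zip(values, errors)
    ]


# Hypothetical 2 x 2 elevation grid and its error estimates (meters)
elevation = [[120.0, 125.0], [130.0, 128.0]]
error = [[1.0, 6.0], [2.5, 9.0]]
masked = white_out(elevation, error, max_error=5.0)
print(masked)
```

The `error` grid itself is what you would plot as the companion map of confidence.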

Elevation map (meters)

Larger errors ‘whited out’ (m)

Elevation errors (meters)

Contaminants

Regions with errors ‘whited out’

Map of errors

Summary: Topics for GIS (for Science)
– Estimating, propagating and displaying error considerations
For learning purposes remember:
– Demonstrate proficiency in using geospatial applications and tools (commercial and open-source).
– Present verbally relational analysis and interpretation of a variety of spatial data on maps.
– Demonstrate skill in applying database concepts to build and manipulate a spatial database, SQL, spatial queries, and integration of graphic and tabular data.
– Demonstrate intermediate knowledge of geospatial analysis methods and their applications.

Friday, Mar. 23: Lab assignment session. Three problems, posted around Wednesday. Complete them in class and get signed off before leaving. 10% of grade.

Reading for this week

Next classes: Friday, March 23: lab with material from week 7 (lab assignment, 10%). Tuesday, March 27: Using uncertainties, working with discrete entity types. Note: March 30 is an open lab (no assignment; work on your projects and get help from Max); attendance will be taken.