Analyzing the Results of a Simulation and Estimating Errors Jason Cooper.



Types of Error
– Big, obvious errors
– Systematic error
– Statistical (random) error

Big, Obvious Errors
These arise from gross mistakes, often in the particle configuration. Examine intermediate conformations (MD or MC) for obvious problems, regardless of the focus of the study. Conformations are typically stored every 5-25 steps.

Systematic Error: Characterization
Systematic error results in a constant bias or skew relative to the expected result.
[Figure: expected distribution, biased distribution, skewed distribution]

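Systematic Error: Characterization
Calculated values for simple thermodynamic properties should be normally distributed:
$$p(A) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(A - \langle A \rangle)^2}{2\sigma^2}\right]$$
where $\langle A \rangle$ is the mean and $\sigma^2$ the variance of the property $A$.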

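Systematic Error: Characterization
1. Sort the data into bins of approximately equal number. For N data values and a bin spanning $[a_i, b_i]$, the expected number is given by:
$$E_i = N \int_{a_i}^{b_i} p(A)\, dA$$
where $p(A)$ is the normal probability density.
2. Calculate the chi-squared statistic from the observed counts $O_i$:
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
($\chi^2 > 1$ indicates a poor match.)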
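The two steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the presenter's procedure: the function name `reduced_chi_squared`, the choice of ten bins, and the use of equal-probability bins of the fitted normal are all assumptions. The sketch also divides chi-squared by the degrees of freedom (bins minus three, since the mean and standard deviation are fitted), on the assumption that the "greater than 1" rule of thumb refers to this reduced value.

```python
import random
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def reduced_chi_squared(values, n_bins=10):
    """Compare `values` with a normal distribution fitted to them.

    Hypothetical helper: bins are equal-probability slices of the fitted
    normal, so every bin has the same expected count.
    """
    n = len(values)
    fitted = NormalDist(fmean(values), pstdev(values))
    # Interior bin edges at the 1/n_bins, 2/n_bins, ... quantiles.
    edges = [fitted.inv_cdf(i / n_bins) for i in range(1, n_bins)]
    observed = [0] * n_bins
    for v in values:
        observed[bisect_right(edges, v)] += 1
    expected = n / n_bins
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    # Assumption: n_bins - 1 constraints, minus 2 fitted parameters.
    dof = n_bins - 3
    return chi2 / dof


rng = random.Random(0)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]
uniform_sample = [rng.random() for _ in range(5000)]
print("normal sample :", reduced_chi_squared(normal_sample))
print("uniform sample:", reduced_chi_squared(uniform_sample))
```

A sample that really is normal should give a value of order one, while a clearly non-normal sample such as the uniform one should give a much larger value.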

Systematic Error: Sources
Four main sources of systematic error:
– The model (limitations of the basis set, functional, etc.)
– The algorithms used (drift in Euler integration of a DE)
– Numerical precision (round-off and quantization error)
– Implementation (programming error)
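The algorithmic source can be illustrated with the Euler drift mentioned in the list. The sketch below is an assumed toy case, not taken from the slides: a harmonic oscillator with unit mass and frequency, integrated with explicit Euler. For this system the energy grows by a factor of exactly (1 + dt²) per step, so the error is a steady bias rather than random noise. The function name and default parameters are hypothetical.

```python
def euler_energy_drift(dt=0.01, n_steps=1000):
    """Integrate x'' = -x with explicit Euler; return (initial, final) energy."""
    x, v = 1.0, 0.0
    e_start = 0.5 * (x * x + v * v)
    for _ in range(n_steps):
        # Explicit Euler: both updates use the values from the previous step.
        x, v = x + v * dt, v - x * dt
    e_end = 0.5 * (x * x + v * v)
    return e_start, e_end


e_start, e_end = euler_energy_drift()
print(f"relative energy drift: {(e_end - e_start) / e_start:.3%}")
```

Because the drift always has the same sign, running longer or averaging more samples does not remove it, which is what distinguishes it from statistical error.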

Systematic Error: The Fix
Systematic errors are most easily isolated when several algorithms are applied:
– to several different chemical systems,
– on several different computers,
– using several different compilers,
– etc.

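Statistical Error: Characterization
Values show a characteristic normal distribution about the set average, and the standard deviation of that average is:
$$\sigma(\langle A \rangle) = \frac{\sigma(A)}{\sqrt{M}}$$
where M is the number of independent data values.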

Statistical Error: Relaxation Time and Statistical Inefficiency
Successive data values are highly correlated rather than independent. To find the effective M, we need to know the statistical inefficiency of the system.

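Statistical Error: Relaxation Time and Statistical Inefficiency
We begin by dividing our M sequential configurations into b blocks, each containing $n_b$ values of the property A. The average for the i-th block is:
$$\langle A \rangle_i = \frac{1}{n_b} \sum_{j=(i-1)n_b+1}^{i\,n_b} A_j$$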

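Statistical Error: Relaxation Time and Statistical Inefficiency
The variance of the block averages is then given by:
$$\sigma^2(\langle A \rangle_b) = \frac{1}{b} \sum_{i=1}^{b} \left(\langle A \rangle_i - \langle A \rangle_{total}\right)^2$$
where $\langle A \rangle_i$ is the average for the i-th block and $\langle A \rangle_{total}$ is the average calculated only over those values covered in the blocks.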

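Statistical Error: Relaxation Time and Statistical Inefficiency
For large $n_b$, the block averages $\langle A \rangle_i$ become uncorrelated and:
$$\sigma^2(\langle A \rangle_b) \propto \frac{1}{n_b}$$
Next, define the statistical inefficiency s:
$$s = \lim_{n_b \to \infty} \frac{n_b\, \sigma^2(\langle A \rangle_b)}{\sigma^2(A)}$$
and, finally,
$$\sigma^2(\langle A \rangle_{total}) = \frac{s\, \sigma^2(A)}{M}$$
so that
$$\sigma(\langle A \rangle_{total}) = \sigma(A) \sqrt{\frac{s}{M}}$$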

Statistical Error: Relaxation Time and Statistical Inefficiency
We solve for s, using a block size $n_b$ large enough that the block averages are uncorrelated:
$$s = \frac{n_b\, \sigma^2(\langle A \rangle_b)}{\sigma^2(A)}$$
Here s can be visualized in two ways:
– The factor by which the variance exceeds a naïve estimate (statistical inefficiency); or
– The number of steps per block required to give uncorrelated block averages (relaxation time).
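The block-averaging recipe can be sketched as follows. This is an illustrative sketch under assumptions rather than the presenter's code: the function names are hypothetical, population variances (dividing by the number of items) are used throughout, and any values left over after the last full block are discarded, matching the remark that the total average covers only the values in the blocks.

```python
from math import sqrt
from statistics import fmean, pvariance


def block_averages(values, n_b):
    """Average each of the b = len(values) // n_b full blocks of size n_b."""
    b = len(values) // n_b
    return [fmean(values[i * n_b:(i + 1) * n_b]) for i in range(b)]


def statistical_inefficiency(values, n_b):
    """Estimate s = n_b * var(block averages) / var(A) for one block size."""
    blocks = block_averages(values, n_b)
    covered = values[:len(blocks) * n_b]  # only values inside full blocks
    # Equal block sizes, so the mean of the block averages equals the
    # average over the covered values.
    return n_b * pvariance(blocks) / pvariance(covered)


def error_of_mean(values, n_b):
    """Standard deviation of the run average, sigma(A) * sqrt(s / M)."""
    covered = values[:(len(values) // n_b) * n_b]
    s = statistical_inefficiency(values, n_b)
    return sqrt(s * pvariance(covered) / len(covered))


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(block_averages(data, 2))
print(statistical_inefficiency(data, 2))
```

The six-point series is only there to show the calling convention; a meaningful estimate needs many blocks, each much longer than the correlation time.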

Statistical Error: Relaxation Time and Statistical Inefficiency
In practice, s is calculated from a plot similar to the following:
[Figure]
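One way to produce data for such a plot is to evaluate the ratio n_b σ²(⟨A⟩_b)/σ²(A) over a range of block sizes and look for the plateau. The sketch below does this for a synthetic correlated series; the AR(1) test series, the seed, the block sizes, and all function names are assumptions made for illustration. For an AR(1) process with coefficient phi, the plateau should in theory lie near (1 + phi)/(1 - phi), which is 19 for phi = 0.9.

```python
import random
from statistics import fmean, pvariance


def ar1_series(length, phi, seed=0):
    """Synthetic correlated data: x[t] = phi * x[t-1] + Gaussian noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(length):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def inefficiency_curve(values, block_sizes):
    """Return (n_b, n_b * var(block averages) / var(A)) for each block size.

    Assumes every block size divides len(values), so no values are dropped.
    """
    var_a = pvariance(values)
    curve = []
    for n_b in block_sizes:
        b = len(values) // n_b
        blocks = [fmean(values[i * n_b:(i + 1) * n_b]) for i in range(b)]
        curve.append((n_b, n_b * pvariance(blocks) / var_a))
    return curve


series = ar1_series(200_000, phi=0.9)
curve = inefficiency_curve(series, [1, 10, 100, 1000])
for n_b, ratio in curve:
    print(n_b, round(ratio, 2))
```

The ratio starts at 1 for single-value blocks, rises while the blocks are shorter than the correlation time, and levels off once the block averages are effectively independent. The estimate becomes noisy at very large block sizes because few blocks remain.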

Statistical Error: Relaxation Time and Statistical Inefficiency
Care must be taken to avoid boundary effects:
[Figure]

Statistical Error: Application of Statistical Inefficiency (Sampling)
The simulation is divided into blocks of size n_b ≥ s. Blocks may be sampled in one of three ways:
– Stratified systematic sampling
– Stratified random sampling
– Coarse graining
Coarse graining is most commonly applied for scalar properties; sampling is applied otherwise.
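The three options can be sketched as below. The slides do not define them, so the readings used here are assumptions based on the usual meaning of the terms: systematic sampling takes the value at a fixed position in every block, random sampling takes one value at a random position in every block, and coarse graining replaces every block by its average. Function names and the `offset` parameter are hypothetical.

```python
import random
from statistics import fmean


def split_into_blocks(values, n_b):
    """Split values into full blocks of size n_b, dropping any remainder."""
    b = len(values) // n_b
    return [values[i * n_b:(i + 1) * n_b] for i in range(b)]


def stratified_systematic(values, n_b, offset=0):
    """One value per block, always at the same position within the block."""
    return [block[offset] for block in split_into_blocks(values, n_b)]


def stratified_random(values, n_b, rng):
    """One value per block, at a randomly chosen position within the block."""
    return [rng.choice(block) for block in split_into_blocks(values, n_b)]


def coarse_grain(values, n_b):
    """One value per block: the block average."""
    return [fmean(block) for block in split_into_blocks(values, n_b)]


trajectory = list(range(12))
print(stratified_systematic(trajectory, 4))
print(stratified_random(trajectory, 4, random.Random(0)))
print(coarse_grain(trajectory, 4))
```

Coarse graining needs a property that can be averaged, which fits the remark about scalar properties; the two sampling variants keep whole stored configurations and so also suit properties that cannot be averaged directly.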

Statistical Error: Sources
Statistical error arises from the finite nature of the simulation:
– Finite number of atoms or molecules considered
– Finite number of sequential values taken
– Finite precision retained in intermediate values

Statistical Error: The Fix
Three main approaches:
– Increase the number of atoms or molecules considered in the simulation;
– Increase the duration of the simulation (number of samples taken); or
– Reduce the statistical inefficiency of the algorithms used.