ES 07 These slides can be found at … (optimized for Windows)

ES 07 The Gaussian distribution

f(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²))

f(x) dx is the probability that an observation will fall between x − dx/2 and x + dx/2.

ES 07 Normally the Gaussian distribution is standardized by putting μ = 0 and σ = 1, with the standardized variable λ = (x − μ)/σ. The standardized density φ(λ) is called the frequency function. Note that φ(λ) = φ(−λ).

ES 07 The distribution function Φ(λ) = ∫₋∞^λ φ(t) dt is the primitive function of the frequency function. It cannot be calculated analytically, but it is tabulated in most standard books, or it can be approximated. Note that Φ(−λ) = 1 − Φ(λ) and that F(x) = Φ(λ). The probability to obtain a value between λ₁ and λ₂ (λ₁ < λ₂) is given by Φ(λ₂) − Φ(λ₁).
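As a small sketch of these relations, Φ can be computed from the standard library's error function (Python; the function names are mine, not from the slides):

```python
import math

def phi(lam):
    """Frequency function of the standardized Gaussian."""
    return math.exp(-lam * lam / 2) / math.sqrt(2 * math.pi)

def Phi(lam):
    """Distribution function, via the error function."""
    return 0.5 * (1 + math.erf(lam / math.sqrt(2)))

# symmetry of the frequency function and the reflection rule
assert abs(phi(1.3) - phi(-1.3)) < 1e-12
assert abs(Phi(-1.3) - (1 - Phi(1.3))) < 1e-12

# probability of a value between lambda_1 = 1 and lambda_2 = 2
p = Phi(2.0) - Phi(1.0)
```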

ES 07 An approximation which can be used for Φ(λ) is (Abramowitz–Stegun 26.2.16):

Φ(λ) = 1 − φ(λ)(a₁t + a₂t² + a₃t³) + ε(λ)   (λ ≥ 0)

where t = (1 + pλ)⁻¹ with p = 0.33267, a₁ = 0.4361836, a₂ = −0.1201676, a₃ = 0.9372980, giving |ε(λ)| < 1·10⁻⁵.
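A quick numerical check of this approximation (Python; the constants are assumed to be the standard Abramowitz–Stegun values, since the slide's numbers did not survive transcription):

```python
import math

# Abramowitz-Stegun 26.2.16 constants (assumed; lost from the slide)
P, A1, A2, A3 = 0.33267, 0.4361836, -0.1201676, 0.9372980

def Phi_approx(lam):
    """Approximate distribution function, valid for lam >= 0."""
    phi = math.exp(-lam * lam / 2) / math.sqrt(2 * math.pi)
    t = 1 / (1 + P * lam)
    return 1 - phi * (A1 * t + A2 * t ** 2 + A3 * t ** 3)

def Phi_exact(lam):
    return 0.5 * (1 + math.erf(lam / math.sqrt(2)))

# largest deviation on a grid over [0, 5); claimed |eps| < 1e-5
worst = max(abs(Phi_approx(l / 100) - Phi_exact(l / 100))
            for l in range(0, 500))
```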

ES 07 Expectation value, variance and covariance

E[X] = μ = (1/N) Σ xᵢ
V[X] = σ² = E[(X − μ)²] = (1/N) Σ (xᵢ − μ)²
Cov[X, Y] = E[(X − μ_X)(Y − μ_Y)]

The sums are over the whole population. Standard deviation: σ = √V[X]

ES 07 Variance of the mean value of N independent observations: V[⟨X⟩] = V[X]/N
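This relation can be checked by simulation; a minimal Python sketch (sample size, number of trials and seed are arbitrary choices of mine):

```python
import random, statistics

random.seed(3)
N, trials = 25, 20000

# sample means of N standard-normal draws; since V[X] = 1,
# their variance should come out close to 1/N = 0.04
means = [sum(random.gauss(0, 1) for _ in range(N)) / N
         for _ in range(trials)]
var_of_mean = statistics.pvariance(means)
```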

ES 07 Expectation value and variance from a sample

Estimates with correct expectation value (unbiased estimates) are thus given by:

⟨X⟩ = (1/N) Σ xᵢ   and   s² = (1/(N − 1)) Σ (xᵢ − ⟨X⟩)²
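A small simulation illustrating why the N − 1 denominator gives an unbiased variance estimate (Python; the parameter values are my own):

```python
import random, statistics

random.seed(0)
mu, sigma, N, trials = 5.0, 2.0, 10, 20000

# average the 1/(N-1) estimator over many samples; its expectation
# should match the true variance sigma**2 = 4
est = []
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(N)]
    est.append(statistics.variance(sample))   # N - 1 denominator
mean_est = sum(est) / trials
```

With the 1/N denominator instead, the average would come out systematically low, near sigma**2 · (N − 1)/N = 3.6.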

ES 07 The variance of the variance

For a Gaussian sample, V[s²] = 2σ⁴/(N − 1)

ES 07 This leads to an estimate of the “error” in the estimated standard deviation of a distribution. Beware! V[V[⟨X⟩]] is normally a small positive number, but the terms used in its calculation are normally very large; high precision is needed in the calculations.

ES 07 Parameter fitting with the maximum likelihood method

If we know that the sample we want to study comes from a certain distribution, e.g. a Gaussian with unknown parameters, we can fit those using the maximum likelihood method: calculate the probability to obtain exactly the sample you have as a function of the parameters, and maximize this probability:

L(μ, σ) = Π f(xᵢ)   or   l(μ, σ) = Σ ln f(xᵢ)

The “error” σₚ of a parameter p is estimated from l(p ± σₚ) = l_max − ½
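For a Gaussian the maximum-likelihood estimates can be written down analytically (the sample mean and the 1/N variance). The sketch below (Python; sample values and seed are mine) checks that the log-likelihood l really peaks there, and that the l_max − ½ rule reproduces the familiar σ/√N error on the mean:

```python
import math, random

random.seed(2)
data = [random.gauss(10.0, 3.0) for _ in range(500)]
N = len(data)

def loglike(mu, sigma):
    """l(mu, sigma) = sum over the sample of ln f(x_i) for a Gaussian."""
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in data)

# analytic ML estimates: the sample mean and the 1/N variance
mu_hat = sum(data) / N
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / N)
l_max = loglike(mu_hat, sigma_hat)

# the l(p +- sigma_p) = l_max - 1/2 rule: for the Gaussian mean this
# reproduces sigma_hat / sqrt(N) exactly
sigma_mu = sigma_hat / math.sqrt(N)
```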

ES 07 The l-function is usually close to a parabola near its maximum, which is what motivates the rule l(p ± σₚ) = l_max − ½

ES 07 χ² fitting and χ² testing

This method needs binning of the data. In each bin we have (xᵢ)_min, (xᵢ)_max, yᵢ = nᵢ/N and σᵢ, which can be taken as √(nᵢ)/N as long as nᵢ ≥ 5 (no fewer than five observations in a bin) and nᵢ ≪ N. Minimize the sum

S = Σ ((yᵢ − y_th(xᵢ)) / σᵢ)²

ES 07 y_th is calculated from the tested distribution. If this is a Gaussian with parameters μ_G and σ_G, the theoretical content of a bin is

y_th,i = Φ((x_i,max − μ_G)/σ_G) − Φ((x_i,min − μ_G)/σ_G)

ES 07 S can now be minimized with respect to the parameters to be fitted. When S_min is found, the “error” of a parameter can be estimated from (cf. the maximum likelihood method) S(p ± σₚ) = S_min + 1. S is in many cases approximately of parabolic shape close to the minimum.
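A minimal sketch of the binned S-sum in Python (bin edges, sample and seed are my own choices; here we only test a standard-normal sample against the standard Gaussian, fitting no parameters, so the number of degrees of freedom equals the number of bins):

```python
import math, random

def Phi(z):
    """Standard-normal distribution function, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(1000)]
N = len(data)

edges = [-3, -2, -1, 0, 1, 2, 3]    # illustrative bin edges
S = 0.0
nbins = 0
for lo, hi in zip(edges, edges[1:]):
    n_i = sum(lo <= x < hi for x in data)
    if n_i < 5:                      # skip bins with too few entries
        continue
    y_i = n_i / N
    sigma_i = math.sqrt(n_i) / N
    y_th = Phi(hi) - Phi(lo)         # Gaussian probability content of the bin
    S += ((y_i - y_th) / sigma_i) ** 2
    nbins += 1
# with no fitted parameters, S should be roughly chi^2 with nbins
# degrees of freedom, i.e. S / nbins close to 1
```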

ES 07 S is χ²-distributed with ν degrees of freedom, where ν is the number of bins minus the number of fitted parameters. In the previous example we had 7 bins and two parameters, giving ν = 5.

Table for ν = 5 (meaning of the tabulated S-values):
- in only about 10 % of the cases a smaller S-value would be obtained
- in about 33 % of the cases a smaller S-value would be obtained
- in about 42 % of the cases a larger S-value would be obtained
- in about 17 % of the cases a larger S-value would be obtained
- in only about 8 % of the cases a larger S-value would be obtained

Generally we expect S/ν to be close to 1 if the fluctuations in the data are of purely statistical origin and if the data is described by the distribution in question.

ES 07 Confidence levels and confidence intervals

Assume that we have estimated a parameter p and found that p = 1.23 with σₚ = 0.11. Let's say that we want to construct an interval that covers the true value of p with 90 % confidence. This means that we leave out 5 % on each side. Start by finding λ so that Φ(λ) = 0.95 → λ = 1.645. Then p_max = 1.23 + 1.645 · 0.11 = 1.41 and p_min = 1.23 − 1.645 · 0.11 = 1.05. We have found the two-sided confidence interval of our estimate of p on the 90 % confidence level to be 1.05 – 1.41.
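These numbers can be reproduced with the standard library's inverse normal CDF (Python; variable names are mine):

```python
from statistics import NormalDist

p_hat, sigma_p = 1.23, 0.11
# two-sided 90 % interval: leave out 5 % on each side
lam = NormalDist().inv_cdf(0.95)      # ~1.645
p_min = p_hat - lam * sigma_p         # ~1.05
p_max = p_hat + lam * sigma_p         # ~1.41
```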

ES 07 If we want to state that p < x with some confidence we can construct a one-sided confidence region. Let's say that we want to construct a region that covers the true value of p with 99 % confidence. Start by finding λ so that Φ(λ) = 0.99 → λ = 2.326. Then p_max = 1.23 + 2.326 · 0.11 = 1.49. We have found the one-sided confidence region of our estimate of p on the 99 % confidence level to be p < 1.49.

ES 07 Hypothesis testing (simple case)

Let's again assume that we have estimated a parameter p and found that p = 1.23 with σₚ = 0.11. Now we have a hypothesis stating that p = 1.4, and we ask ourselves with what probability the hypothesis is wrong. We calculate λ = (1.4 − 1.23)/0.11 = 1.55, and the probability is given by Φ(λ) = 0.939, i.e. we can state with 94 % confidence that the hypothesis is wrong.
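The corresponding one-line check (Python; variable names are mine):

```python
from statistics import NormalDist

p_hat, sigma_p, p_hyp = 1.23, 0.11, 1.4
lam = (p_hyp - p_hat) / sigma_p       # ~1.55
confidence = NormalDist().cdf(lam)    # ~0.94
```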
