Published by Clement Davidson. Modified over 9 years ago.
1
Short Resume of Statistical Terms
Fall 2013
By Yaohang Li, Ph.D.
2
Review
Last Class
  –Introduction to Monte Carlo
This Class
  –Important Statistics Terms
    Random Events
      –Independence of Random Events
      –Axioms on Random Events
    Random Variables
      –Independence of Random Variables
    CDF
    PDF
    Expectation
      –Characteristics of Expectation
    Moments of a Distribution
      –rth moment
      –rth central moment
    Mean
    Variance
    Standard Deviation
    Covariance
      –Characteristics of covariance
    Review of Statistics and Probability Terms
    Important Distributions
    Central Limit Theorem
    Estimand and Estimator
Next Class
  –Monte Carlo for Integration
3
Random Events and Probability
Random Event
  –An event which has a chance of happening
Probability
  –A numerical measure of that chance
  –Lying between 0 and 1, both inclusive
Terminology
  –$P(A)$: the probability that an event A occurs
  –$P(A+B+\dots)$: the probability that at least one of the events A, B, … occurs
  –$P(AB\dots)$: the probability that all the events A, B, … occur
  –$P(A|B)$: the probability that the event A occurs when it is known that the event B occurs (the conditional probability of A given B)
4
Axioms in Probability
$P(A+B+\dots) \le P(A)+P(B)+\dots$
  –If at most one of the events A, B, … can occur, they are called exclusive, and the equality holds
  –If at least one of the events A, B, … must occur, they are called exhaustive, and $P(A+B+\dots)=1$
$P(AB)=P(A|B)P(B)$
  –If $P(A|B)=P(A)$, A and B are independent: the chance of A occurring is uninfluenced by the occurrence of B
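These rules can be checked by brute-force enumeration on a small sample space. The sketch below assumes a toy example that is not in the slides (two fair dice, with hypothetical events `A` and `B`) and uses exact fractions so the identities hold without rounding.

```python
from fractions import Fraction
from itertools import product

# Assumed toy sample space: the 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate on one outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def A(o):
    """Hypothetical event A: the first die is even."""
    return o[0] % 2 == 0


def B(o):
    """Hypothetical event B: the second die shows 6."""
    return o[1] == 6


p_a = prob(A)
p_b = prob(B)
p_ab = prob(lambda o: A(o) and B(o))      # P(AB): both events occur
p_a_or_b = prob(lambda o: A(o) or B(o))   # P(A+B): at least one occurs
p_a_given_b = p_ab / p_b                  # P(A|B) = P(AB) / P(B)

print(p_a, p_b, p_ab, p_a_or_b, p_a_given_b)
```

Here A and B are not exclusive, so P(A+B) is strictly smaller than P(A)+P(B), while P(A|B) equals P(A) because the two dice do not influence each other.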
5
Random Variables and Distributions
Random variable ($\eta$)
  –A number that characterizes each of a set of exclusive and exhaustive events
Cumulative Distribution Function (CDF)
  –$F(y)=P(\eta \le y)$
  –The probability that the event which occurs has a value not exceeding a prescribed y
  –$F(+\infty)=1$ and $F(-\infty)=0$
  –$F(y)$ is a non-decreasing function of y
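A CDF can be approximated from data by counting the fraction of observations that do not exceed y. The following is a minimal sketch under assumed inputs (uniform samples on [0, 1), an arbitrary fixed seed); the helper name `empirical_cdf` is hypothetical and not taken from the slides.

```python
import bisect
import random


def empirical_cdf(samples):
    """Return F_hat, where F_hat(y) is the fraction of samples that are <= y."""
    ordered = sorted(samples)
    n = len(ordered)

    def F_hat(y):
        # bisect_right gives the number of sorted values that are <= y.
        return bisect.bisect_right(ordered, y) / n

    return F_hat


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
samples = [rng.random() for _ in range(10_000)]  # uniform on [0, 1)
F_hat = empirical_cdf(samples)

for y in (-1.0, 0.3, 0.7, 2.0):
    print(y, F_hat(y))
```

The estimate is 0 far to the left of the data, 1 far to the right, and never decreases in between, matching the properties listed for a CDF.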
6
Expectation
If $g(\eta)$ is a function of $\eta$, the expectation (or mean value) of g is denoted and defined by
  –$E\,g(\eta) = \int g(y)\,dF(y)$ (a Stieltjes integral)
  –The integral is taken over all values of y
Explanation
  –Continuous random events: $F(y)$ is continuous with derivative $f(y)$, so $E\,g(\eta) = \int g(y) f(y)\,dy$
  –Discrete random events: $F(y)$ is a step function with steps of height $f_i$ at the points $y_i$, so $E\,g(\eta) = \sum_i g(y_i) f_i$
Probability Density Function (pdf)
  –$f(y)$ and $f_i$ are the probability density functions
7
More on Expectation
The statistical physicist uses another notation for expectation
  –Suppose $p_i$ is the probability density function; then $\langle g \rangle = \sum_i p_i\, g(x_i)$
How about if $g(x)$ is a constant function?
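In the discrete case the expectation is just a probability-weighted sum, which is easy to compute directly. The sketch below assumes a fair six-sided die as the distribution (an illustrative choice, not from the slides), and the helper name `expectation` is hypothetical. It also covers the constant-function question: the weights sum to 1, so the expectation of a constant is that constant.

```python
from fractions import Fraction


def expectation(g, points, probs):
    """Discrete expectation: the sum of g(y_i) * f_i over all points y_i."""
    return sum(g(y) * f for y, f in zip(points, probs))


# Assumed example distribution: a fair six-sided die.
points = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean = expectation(lambda y: y, points, probs)              # E(eta)
second_moment = expectation(lambda y: y * y, points, probs)  # E(eta^2)
constant = expectation(lambda y: 5, points, probs)           # E(5)

print(mean, second_moment, constant)
```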
8
Linear Combination of the Expectation Values
9
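Multi-dimensional Distribution
Multi-dimensional Random Variable
  –Represented using a vector $\boldsymbol{\eta}$
Multi-dimensional CDF
  –$F(\mathbf{y})=P(\boldsymbol{\eta} \le \mathbf{y})$
  –$\boldsymbol{\eta} \le \mathbf{y}$ means that each coordinate of $\boldsymbol{\eta}$ is not greater than the corresponding coordinate of $\mathbf{y}$
Expectation
  –Continuous multidimensional events: $E\,g(\boldsymbol{\eta}) = \int g(\mathbf{y}) f(\mathbf{y})\,d\mathbf{y}$, where $f(\mathbf{y}) = \partial^k F(\mathbf{y}) / \partial y_1 \cdots \partial y_k$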
10
Independence of Random Variables
Consider a set of exhaustive and exclusive events, each characterized by a pair of numbers $\eta$ and $\zeta$, for which $F(y,z)$ is the distribution. $G(y)$ is the CDF of $\eta$ and $H(z)$ is the CDF of $\zeta$.
  –$F(y,z) = P(\eta \le y,\ \zeta \le z)$
  –$G(y) = P(\eta \le y)$
  –$H(z) = P(\zeta \le z)$
If it so happens that
  –$F(y,z)=G(y)H(z)$ for all y and z
  –then the random variables $\eta$ and $\zeta$ are called independent
11
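Characteristics of Expectations
$E\left(\sum_i \eta_i\right) = \sum_i E\,\eta_i$
  –Holds whether or not the random variables $\eta_i$ are independent
$E\left(\prod_i \eta_i\right) = \prod_i E\,\eta_i$
  –Holds if the $\eta_i$ are mutually independent (it can fail otherwise)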
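The contrast between the sum rule and the product rule can be seen on a small exact example. The sketch below assumes two fair dice (not from the slides) and compares an independent pair (first die, second die) with a fully dependent pair (first die, first die again); the helper `E` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

# Assumed toy sample space: two fair dice, every outcome equally likely.
outcomes = list(product(range(1, 7), repeat=2))
weight = Fraction(1, len(outcomes))


def E(g):
    """Exact expectation of g(a, b) over the two-dice sample space."""
    return sum(g(a, b) * weight for a, b in outcomes)


e_first = E(lambda a, b: a)
e_second = E(lambda a, b: b)

# Independent pair: the first die and the second die.
e_sum_indep = E(lambda a, b: a + b)
e_prod_indep = E(lambda a, b: a * b)

# Dependent pair: the first die paired with itself.
e_sum_dep = E(lambda a, b: a + a)
e_prod_dep = E(lambda a, b: a * a)

print(e_sum_indep, e_prod_indep, e_sum_dep, e_prod_dep)
```

The sum rule holds in both cases, while the product rule holds for the independent pair and fails for the dependent one (91/6 instead of 49/4).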
12
Moments of Distribution
rth moment of a distribution
  –$E(\eta^r)$
Principal moment
  –$\mu = E(\eta)$
rth central moment
  –$\mu_r = E\{(\eta-\mu)^r\}$
Most important moments
  –$\mu = E(\eta)$, known as the mean of $\eta$: a measure of location of a random variable
  –$\mu_2 = \sigma^2$, known as the variance of $\eta$ (usually abbreviated "var"): a measure of dispersion about the mean
  –standard deviation $\sigma = \sqrt{\mu_2}$
  –coefficient of variation $\sigma/\mu$
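For a discrete distribution these quantities reduce to weighted sums. The sketch below assumes a fair die as the example distribution; the helper names `raw_moment` and `central_moment` are hypothetical, not from the slides.

```python
import math


def raw_moment(points, probs, r):
    """rth moment: the expectation of the rth power of the variable."""
    return sum(p * y**r for y, p in zip(points, probs))


def central_moment(points, probs, r):
    """rth central moment: the expectation of (variable - mean)**r."""
    mu = raw_moment(points, probs, 1)
    return sum(p * (y - mu) ** r for y, p in zip(points, probs))


# Assumed example distribution: a fair six-sided die.
points = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

mu = raw_moment(points, probs, 1)            # mean
variance = central_moment(points, probs, 2)  # second central moment
sigma = math.sqrt(variance)                  # standard deviation
cv = sigma / mu                              # coefficient of variation

print(mu, variance, sigma, cv)
```

The first central moment is always zero, and the variance also equals the second moment minus the squared mean, which the same helpers can confirm.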
13
Covariance
Definition of covariance (usually abbreviated "cov")
  –If $\eta$ and $\zeta$ are random variables with means $\mu$ and $\nu$, respectively, the quantity $E\{(\eta-\mu)(\zeta-\nu)\}$ is called the covariance of $\eta$ and $\zeta$
  –If $\eta$ and $\zeta$ are independent, the covariance is 0. Why?
  –Also, $\mathrm{cov}(\eta,\eta)=\mathrm{var}(\eta)$. Why?
14
Important Formula of Covariance
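One standard covariance identity, assumed here to be the kind of formula this slide refers to, is var(X + Y) = var(X) + var(Y) + 2 cov(X, Y) for any two random variables X and Y. The sketch below checks it exactly on an assumed two-dice example in which the two variables are deliberately dependent; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed toy sample space: two fair dice, every outcome equally likely.
outcomes = list(product(range(1, 7), repeat=2))
weight = Fraction(1, len(outcomes))


def E(g):
    """Exact expectation of g(a, b) over the sample space."""
    return sum(g(a, b) * weight for a, b in outcomes)


def cov(g, h):
    """Covariance: the expectation of (g - E g) * (h - E h)."""
    mu, nu = E(g), E(h)
    return E(lambda a, b: (g(a, b) - mu) * (h(a, b) - nu))


def var(g):
    """Variance as the covariance of a variable with itself."""
    return cov(g, g)


def first(a, b):
    return a            # value of the first die


def second(a, b):
    return b            # value of the second die, independent of the first


def total(a, b):
    return a + b        # depends on the first die


lhs = var(lambda a, b: first(a, b) + total(a, b))
rhs = var(first) + var(total) + 2 * cov(first, total)

print(lhs, rhs, cov(first, second), cov(first, first))
```

The same helpers illustrate the two properties of covariance: it vanishes for the independent pair `first` and `second`, and the covariance of a variable with itself is its variance.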
15
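Correlation Coefficient
Definition
  –$\rho = \mathrm{cov}(\eta,\zeta) / \sqrt{\mathrm{var}(\eta)\,\mathrm{var}(\zeta)}$
  –Always between +1 and -1
  –If $\rho=0$, they are uncorrelated
  –If $\rho<0$, they are negatively correlated
  –If $\rho>0$, they are positively correlated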
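The three cases can be illustrated on small data sets. The sketch below computes the correlation coefficient as covariance divided by the product of the standard deviations; the data values and the helper name `correlation` are made up for illustration.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient of paired data: cov / sqrt(var_x * var_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]

rho_pos = correlation(xs, [2 * x + 1 for x in xs])     # increasing linear relation
rho_neg = correlation(xs, [-x for x in xs])            # decreasing linear relation
rho_zero = correlation(xs, [(x - 3) ** 2 for x in xs])  # symmetric, nonlinear relation

print(rho_pos, rho_neg, rho_zero)
```

The last case is worth noting: the second variable is completely determined by the first, yet the correlation is zero, so zero correlation does not by itself imply independence.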
16
Important Distributions
Uniform Distribution
Exponential Distribution
Binomial Distribution
Poisson Distribution
Normal Distribution
17
Uniform Distribution
Uniform Distribution (Rectangular Distribution)
  –A distribution that has constant probability density over an interval
  –Mean?
  –Variance?
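For a uniform distribution on an interval $[a, b]$ (the endpoints $a$ and $b$ and the symbol $\eta$ for the random variable are notation assumed here), both questions can be worked out from the constant density $1/(b-a)$:

```latex
\begin{align*}
E(\eta)   &= \int_a^b \frac{y}{b-a}\,dy = \frac{a+b}{2},\\
E(\eta^2) &= \int_a^b \frac{y^2}{b-a}\,dy = \frac{a^2+ab+b^2}{3},\\
\operatorname{var}(\eta) &= E(\eta^2) - \{E(\eta)\}^2 = \frac{(b-a)^2}{12}.
\end{align*}
```

On the unit interval this gives mean 1/2 and variance 1/12.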
18
Exponential Distribution
  –Density $f(y)=\lambda e^{-\lambda y}$ for $y \ge 0$, with rate $\lambda > 0$
  –mean $1/\lambda$
  –variance $1/\lambda^2$
19
Binomial Distribution
  –Discrete probability distribution $P_p(n|N)$ of obtaining exactly n successes out of N Bernoulli trials
  –Each Bernoulli trial is true with probability p and false with probability $q=1-p$
  –$P_p(n|N) = \binom{N}{n} p^n q^{N-n} = \frac{N!}{n!\,(N-n)!}\, p^n (1-p)^{N-n}$
20
Poisson Distribution
  –The limit of the Binomial Distribution as $N \to \infty$ with $\nu = Np$ held fixed: $P_\nu(n) = \nu^n e^{-\nu} / n!$
  –Mean is $\nu$
  –Variance is $\nu$
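The limiting relationship can be checked numerically by fixing the mean and letting the number of trials grow. The sketch below is an illustration under assumed values (mean 2, trial counts 20, 200, and 2000); the function names are hypothetical.

```python
import math


def binomial_pmf(n, N, p):
    """Probability of exactly n successes in N Bernoulli trials."""
    return math.comb(N, n) * p**n * (1 - p) ** (N - n)


def poisson_pmf(n, nu):
    """Poisson probability of the count n when the mean is nu."""
    return nu**n * math.exp(-nu) / math.factorial(n)


nu = 2.0  # assumed mean, held fixed while N grows


def max_gap(N):
    """Largest pointwise difference between the two pmfs for n = 0..10."""
    p = nu / N
    return max(abs(binomial_pmf(n, N, p) - poisson_pmf(n, nu)) for n in range(11))


gaps = [max_gap(N) for N in (20, 200, 2000)]
print(gaps)
```

The gap shrinks roughly in proportion to 1/N, which is why the Poisson form is a convenient stand-in for a binomial with many trials and a small success probability.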
21
Normal Distribution
Normal Distribution (Gaussian Distribution)
  –Bell curve
  –De Moivre developed the normal distribution as an approximation to the binomial distribution
22
Normal Distribution in Data Analysis
For normally distributed data:
  –68.27% of the data will be found within one SD either side of the mean (±1SD)
  –95.45% of the data will be found within two SD either side of the mean (±2SD)
  –99.73% of the data will be found within three SD either side of the mean (±3SD)
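The coverage within k standard deviations of the mean follows from the normal CDF and can be written with the error function as erf(k/√2). The sketch below evaluates it with the standard library; the helper name `normal_coverage` is hypothetical.

```python
import math


def normal_coverage(k):
    """Probability that a normal variable lies within k SD of its mean."""
    return math.erf(k / math.sqrt(2))


coverage = {k: normal_coverage(k) for k in (1, 2, 3)}

for k, value in coverage.items():
    print(f"within {k} SD: {100 * value:.2f}%")
```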
23
Central Limit Theorem
  –The sum of n independent random variables has an approximately normal distribution when n is large
  –The random variables may follow an arbitrary distribution, provided its variance is finite
24
Central Limit Theorem in Practice
In practice
  –n = 10 is often a reasonably large number
  –n = 25 is rather large (effectively infinite)
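A quick simulation gives a feel for these rules of thumb. The sketch below standardizes sums of n draws and measures how often they fall within one standard deviation of the mean, which should be close to the normal value of about 0.683. The two parent distributions (uniform with n = 10, exponential with n = 25), the seed, and the trial count are assumptions chosen for illustration.

```python
import math
import random


def fraction_within_one_sd(draw, mean, var, n, trials, rng):
    """Fraction of standardized n-term sums that land in [-1, 1]."""
    hits = 0
    for _ in range(trials):
        total = sum(draw(rng) for _ in range(n))
        z = (total - n * mean) / math.sqrt(n * var)  # standardized sum
        if abs(z) <= 1.0:
            hits += 1
    return hits / trials


rng = random.Random(1)  # fixed seed, chosen arbitrarily

# Uniform on [0, 1): mean 1/2, variance 1/12.
frac_uniform = fraction_within_one_sd(
    lambda r: r.random(), 0.5, 1 / 12, n=10, trials=20_000, rng=rng
)

# Exponential with rate 1 (strongly skewed): mean 1, variance 1.
frac_expo = fraction_within_one_sd(
    lambda r: r.expovariate(1.0), 1.0, 1.0, n=25, trials=20_000, rng=rng
)

print(frac_uniform, frac_expo)
```

This checks only one feature of the distribution of the sum; the tails of a skewed parent distribution can converge more slowly than the central region.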
25
Estimation
Monte Carlo Computation
  –Goal: estimating the unknown numerical value of some parameter of some distribution
  –The parameter is called an estimand
Sample
  –The available data (may consist of a number of observed random variables)
  –The number of observations in the sample is called the sample size
Estimators
  –mean: $(\eta_1+\eta_2+\dots+\eta_n)/n$
  –weighted average: $(w_1\eta_1+w_2\eta_2+\dots+w_n\eta_n)/(w_1+w_2+\dots+w_n)$, which may be a better estimator
Connection between the sample and the estimand
  –The estimand is a parameter of the distribution of the random variables constituting the sample
26
Sampling Distribution
Parent Distribution
  –We can represent the sample by a vector $\boldsymbol{\eta}$ with coordinates $\eta_1, \eta_2, \eta_3, \dots, \eta_n$
  –The distribution of $\eta_1, \eta_2, \eta_3, \dots, \eta_n$ is called the Parent Distribution
  –To estimate the estimand $\theta$ (a parameter of the Parent Distribution), we use some function $t(\boldsymbol{\eta})$; t is an estimator
Sampling Distribution
  –$\boldsymbol{\eta}$ is a random variable, and so is $t(\boldsymbol{\eta})$: if we repeated the experiment, we should expect to get a different value of $\boldsymbol{\eta}$
  –Since $\boldsymbol{\eta}$ varies from experiment to experiment, $t(\boldsymbol{\eta})$ has a distribution, called the sampling distribution
  –If $t(\boldsymbol{\eta})$ is to be close to $\theta$, then the sampling distribution ought to be closely concentrated around $\theta$
27
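Measuring Sampling Distribution
The bias of t
  –The difference between $\theta$ and the average value of $t(\boldsymbol{\eta})$
  –$\beta = E\{t(\boldsymbol{\eta})-\theta\}$
  –t is an unbiased estimator if $\beta=0$
The sampling variance of t
  –$\sigma_t^2 = \mathrm{var}\{t(\boldsymbol{\eta})\} = E\{[t(\boldsymbol{\eta})-E\,t(\boldsymbol{\eta})]^2\} = E\{[t-\theta-\beta]^2\}$
If $\beta$ and $\sigma_t^2$ are small, t is a good estimator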
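Bias and sampling variance can both be estimated by repeating an experiment many times and looking at how the estimator's values scatter. The sketch below is an illustration under assumptions not taken from the slides: the parent distribution is uniform on [0, 1), the estimand is its variance 1/12, the sample size is 5, and the two estimators compared divide the sum of squared deviations by n and by n − 1. All names are hypothetical.

```python
import random


def var_divide_by_n(sample):
    """Variance estimator that divides by n (biased)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


def var_divide_by_n_minus_1(sample):
    """Variance estimator that divides by n - 1 (unbiased)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)


def simulate_estimator(t, theta, n, trials, rng):
    """Estimate the bias and sampling variance of t by repeated sampling."""
    values = [t([rng.random() for _ in range(n)]) for _ in range(trials)]
    mean_t = sum(values) / trials
    bias = mean_t - theta
    sampling_var = sum((v - mean_t) ** 2 for v in values) / trials
    return bias, sampling_var


theta = 1 / 12          # true variance of the uniform distribution on [0, 1)
rng = random.Random(2)  # fixed seed, chosen arbitrarily

bias_n, svar_n = simulate_estimator(var_divide_by_n, theta, 5, 20_000, rng)
bias_n1, svar_n1 = simulate_estimator(var_divide_by_n_minus_1, theta, 5, 20_000, rng)

print(bias_n, svar_n)
print(bias_n1, svar_n1)
```

With n = 5 the divide-by-n estimator underestimates the variance by about one fifth of its value (a bias near −1/60), while the divide-by-(n − 1) version has bias near zero at the cost of a somewhat larger sampling variance.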
28
Important Estimators
Mean of the parent distribution
  –standard error
Variance of the parent distribution
  –standard error
29
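Efficiency
Goal of Monte Carlo Work
  –Obtain a respectably small standard error in the final result
  –More random samples can lead to better accuracy, but this is not very rewarding: the standard error of a sample mean falls only as $1/\sqrt{n}$, so 100 times as many samples buy one extra decimal digit
  –Variance Reduction Methods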
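The slow payoff from extra samples shows up directly when the sample mean is reported together with its estimated standard error, s/√n. The sketch below assumes a uniform parent distribution on [0, 1) and two sample sizes that differ by a factor of 100; the helper name `mean_and_standard_error` is hypothetical.

```python
import math
import random


def mean_and_standard_error(sample):
    """Sample mean and its estimated standard error s / sqrt(n)."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)  # unbiased variance
    return mean, math.sqrt(s2 / n)


rng = random.Random(3)  # fixed seed, chosen arbitrarily

small = [rng.random() for _ in range(1_000)]
large = [rng.random() for _ in range(100_000)]

mean_small, se_small = mean_and_standard_error(small)
mean_large, se_large = mean_and_standard_error(large)

print(mean_small, se_small)
print(mean_large, se_large)
print(se_small / se_large)
```

A hundredfold increase in work shrinks the standard error only about tenfold, which is the motivation for looking at variance reduction rather than simply drawing more samples.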
30
Summary
Important Statistics Terms
  –Random Events
    Independence of Random Events
    Axioms on Random Events
  –Random Variables
    Independence of Random Variables
  –CDF
  –PDF
  –Expectation
    Characteristics of Expectation
  –Moments of a Distribution
    rth moment
    rth central moment
  –Mean
  –Variance
  –Standard Deviation
  –Covariance
    Characteristics of covariance
  –Correlation Coefficient
31
Summary (Cont.)
Important Distributions
  –Uniform Distribution
  –Exponential Distribution
  –Binomial Distribution
  –Poisson Distribution
  –Normal Distribution
Estimation
  –Sample
  –Estimand
  –Parent Distribution
  –Sampling Distribution
  –Estimator
    Important estimators
  –Buffon’s Needle
32
What do I want you to do?
Review Slides
Review basic probability/statistics concepts
Work on your Assignment 1