1 Statistics

2 Large Systems
Macroscopic systems involve large numbers of particles.
- Microscopic determinism → macroscopic phenomena
The basis is in the mechanics of individual molecules.
- Classical and quantum
Statistical thermodynamics provides the bridge between levels.
Consider 1 g of He as an ideal gas.
- N = 1.5 × 10^23 atoms
Use only position and momentum.
- 3 + 3 = 6 coordinates per atom
- Total 9 × 10^23 variables
- Requires about 4 × 10^9 PB of storage
Find the total kinetic energy.
- K = Σ (p_x^2 + p_y^2 + p_z^2) / 2m
- About 100 ops per collision
- At 100 GFlops, 9 × 10^14 s
- 1 set of collisions in 3 × 10^7 yr
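The slide's estimate can be reproduced with quick arithmetic. This is a sketch only; the 4-byte-per-variable storage and the 3.15 × 10^7 seconds-per-year conversion are assumptions not stated on the slide:

```python
# Back-of-envelope cost of simulating 1 g of He classically.
# Assumptions (not from the slide): 4-byte floats, 3.15e7 s/yr.
N_ATOMS = 1.5e23                      # atoms in 1 g of He
COORDS = 6                            # 3 position + 3 momentum

variables = N_ATOMS * COORDS          # ~9e23 state variables
storage_pb = variables * 4 / 1e15     # petabytes at 4 bytes each
seconds = 9e14                        # slide's estimate at 100 GFlops
years = seconds / 3.15e7

print(f"{variables:.1e} variables, {storage_pb:.1e} PB, {years:.1e} yr")
```

The point of the numbers is not precision but scale: direct microscopic simulation is hopeless, which motivates the statistical approach.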

3 Ensemble
Computing time averages for large systems is infeasible.
Imagine a large number of similar systems.
- Prepared identically
- Independent
This ensemble of systems can be used to derive theoretical properties of a single system.

4 Probability
Probability is often stated before the fact.
- A priori assertion (theoretical)
- 50% probability for heads on a coin
Probability can also reflect the statistics of many events.
- 25% probability that 10 coins have 5 heads
- Fluctuations where 50% are not heads
Probability can be used after the fact to describe a measurement.
- A posteriori assertion (experimental)
- Fraction of coins that were heads in a series of samples

5 Head Count
Take a set of experimental trials.
- N: number of trials
- n: number of values (bins)
- i: a specific trial (1 … N)
- j: a specific value (1 … n)
Use 10 coins and 20 trials.

trial  #heads   trial  #heads
1      5        11     5
2      8        12     1
3      6        13     5
4      5        14     5
5      6        15     6
6      6        16     6
7      1        17     2
8      5        18     4
9      7        19     6
10     4        20     6

6 Distribution
Sorting trials by value forms a distribution.
- Distribution function f counts occurrences in a bin
The mean is a measure of the center of the distribution.
- Mathematical average: coin distribution mean = 4.95
- Median (midway value): coin median = 5
- Mode (most frequent value): coin mode = 6
[Figure: histogram of f(x) for x = 0 … 10]
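A minimal check of the three measures of center, using the 20-trial coin data from the previous slide and Python's standard library:

```python
# Mean, median, and mode of the 20-trial coin experiment.
from statistics import mean, median, mode

heads = [5, 8, 6, 5, 6, 6, 1, 5, 7, 4,
         5, 1, 5, 5, 6, 6, 2, 4, 6, 6]

print(mean(heads))    # 4.95
print(median(heads))  # 5
print(mode(heads))    # 6
```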

7 Probability Distribution
The distribution function sums to the number of trials: Σ_j f(x_j) = N.
A probability distribution p normalizes the distribution function by N: p(x_j) = f(x_j) / N.
- Sum is 1
The mean can be expressed in terms of the probability: ⟨x⟩ = Σ_j x_j p(x_j).
[Figure: probability distribution P(x) for x = 0 … 10]
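These two relations can be sketched directly from the coin data: divide the bin counts by N, check normalization, and recover the mean as a probability-weighted sum.

```python
# p(x) = f(x)/N for the coin data; the mean as sum of x * p(x).
from collections import Counter

heads = [5, 8, 6, 5, 6, 6, 1, 5, 7, 4,
         5, 1, 5, 5, 6, 6, 2, 4, 6, 6]
N = len(heads)

f = Counter(heads)                    # distribution function f(x)
p = {x: c / N for x, c in f.items()}  # normalized probabilities

mean = sum(x * px for x, px in p.items())
print(sum(p.values()), mean)          # ≈ 1.0 and ≈ 4.95
```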

8 Subsample
Subsamples of the data may differ in their central value.
- First five trials: mean 6.0, median 6, mode not unique (5 and 6)
Experimental probability depends on the sample.
Theoretical probability predicts for an infinitely large sample.
(Same 20-trial coin table as slide 5.)

9 Deviation
Individual trials differ from the mean.
The deviation is the difference of a trial from the mean: Δx_i = x_i − ⟨x⟩.
- The mean deviation is zero: ⟨Δx⟩ = 0
The fluctuation is the mean of the squared deviations: ⟨(Δx)^2⟩ = ⟨x^2⟩ − ⟨x⟩^2.
- The fluctuation is the variance
- Standard deviation squared: σ^2 = ⟨(Δx)^2⟩
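A short sketch computing the deviations and the fluctuation for the coin data; the population variance reproduces the deck's value of 3.147 (rounded):

```python
# Deviations and fluctuation (population variance) of the coin data.
heads = [5, 8, 6, 5, 6, 6, 1, 5, 7, 4,
         5, 1, 5, 5, 6, 6, 2, 4, 6, 6]
N = len(heads)
mu = sum(heads) / N                    # 4.95

dev = [x - mu for x in heads]          # deviations from the mean
variance = sum(d * d for d in dev) / N # fluctuation <(dx)^2>

print(sum(dev), variance)              # ≈ 0 and 3.1475
```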

10 Correlation
Events may not be random, but related to other events.
- Time measured by trial number
The correlation function measures the mean of the product of related deviations: C_τ = ⟨Δx_i Δx_{i+τ}⟩.
- Autocorrelation at zero lag, C_0
Different variables can be correlated.

11 Independent Trials
Autocorrelation within a sample is the variance.
- Coin experiment: C_0 = 3.147
Nearest-neighbor correlation tests for randomness.
- Coin experiment: C_1 = −0.345
- Much less than C_0
- Ratio C_1 / C_0 = −0.11
Periodic systems have a peak in C_τ at some period τ.
(Same 20-trial coin table as slide 5.)
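A sketch of the lag autocorrelation for the coin data. C_0 reproduces the variance exactly; for C_1 the estimator convention matters (this sketch divides the lag sum by N, which is an assumption, and different conventions rescale the result), but any convention gives a value much smaller in magnitude than C_0, which is the slide's point.

```python
# Lag autocorrelation C_tau = mean of products of deviations
# separated by lag tau (one common estimator convention).
heads = [5, 8, 6, 5, 6, 6, 1, 5, 7, 4,
         5, 1, 5, 5, 6, 6, 2, 4, 6, 6]
N = len(heads)
mu = sum(heads) / N

def autocorr(x, tau):
    """Sum of deviation products at lag tau, divided by len(x)."""
    d = [v - mu for v in x]
    return sum(d[i] * d[i + tau] for i in range(len(x) - tau)) / len(x)

c0 = autocorr(heads, 0)   # the variance, 3.1475
c1 = autocorr(heads, 1)   # small and negative: no neighbor correlation
print(c0, c1, c1 / c0)
```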

12 Correlation Measure
Independent trials should peak strongly at τ = 0.
- No connection to subsequent events
- No periodic behavior
“This sample autocorrelation plot shows that the time series is not random, but rather has a high degree of autocorrelation between adjacent and near-adjacent observations.” (nist.gov)

13 Continuous Distribution
Data that are continuously distributed are treated with an integral.
- Probability still normalized to 1: ∫ p(x) dx = 1
The mean and variance are given as the moments.
- First moment, the mean: ⟨x⟩ = ∫ x p(x) dx
- Second moment, the variance: σ^2 = ∫ (x − ⟨x⟩)^2 p(x) dx
Correlation uses a time integral.

14 Joint Probability
The probabilities of two systems may be related.
The intersection A ∩ B indicates that both conditions are true.
- For independent events: P(A ∩ B) = P(A)P(B)
The union A ∪ B indicates that either condition is true.
- P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
- P(A ∪ B) = P(A) + P(B), if mutually exclusive
[Venn diagram: C = A ∩ B]

15 Joint Tosses
Define two classes from the coin toss experiment.
- A = { x < 5 }
- B = { 2 < x < 8 }
Individual probabilities are a union of discrete bins.
- P(A) = 0.25, P(B) = 0.80
- P(A ∪ B) = 0.95
Dependent sets don't follow the product rule.
- P(A ∩ B) = 0.10 ≠ P(A)P(B) = 0.20

x    P(x)
0    0
1    0.10
2    0.05
3    0
4    0.10
5    0.30
6    0.35
7    0.05
8    0.05
9    0
10   0
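The set operations above map directly onto Python sets; a minimal sketch using the P(x) table from this slide:

```python
# Joint probabilities for A = {x < 5}, B = {2 < x < 8} from the
# coin-toss probability table (x = heads in 10 coins, 20 trials).
p = {0: 0, 1: 0.10, 2: 0.05, 3: 0, 4: 0.10, 5: 0.30,
     6: 0.35, 7: 0.05, 8: 0.05, 9: 0, 10: 0}

A = {x for x in p if x < 5}
B = {x for x in p if 2 < x < 8}

def prob(S):
    """Probability of a class as a sum over its discrete bins."""
    return sum(p[x] for x in S)

print(prob(A), prob(B))      # ≈ 0.25, 0.80
print(prob(A | B))           # ≈ 0.95 (union)
print(prob(A & B))           # ≈ 0.10, not prob(A)*prob(B) = 0.20
```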

16 Conditional Probability
The probability of an occurrence on a subset is a conditional probability.
- Probability with respect to the subset
- P(A | B) = P(A ∩ B) / P(B)
Use the same subsets as the coin toss example.
- P(A | B) = 0.10 / 0.80 ≈ 0.13
[Venn diagram: C = A | B]
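The slide's arithmetic, written out as a one-step sketch:

```python
# Conditional probability for the coin-toss classes of slide 15:
# P(A|B) = P(A ∩ B) / P(B).
p_AB = 0.10                 # P(A ∩ B)
p_B = 0.80                  # P(B)
p_A_given_B = p_AB / p_B
print(p_A_given_B)          # 0.125, i.e. ≈ 0.13
```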

17 Combinatorics
The probability that n specific occurrences happen is the product of the individual probabilities.
- Other events don't matter
- Separate probability for negative events
An arbitrary choice of events requires permutations.
Exactly n specific events happen, each with probability p: p^n.
No events happen except the n specific events: p^n (1 − p)^(N − n).
Select n arbitrary events from a pool of N identical types: C(N, n) = N! / [n! (N − n)!].
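The three pieces above assemble into the binomial probability of the next slide; a sketch with the standard library (N = 10, n = 5, p = 0.5 chosen to match the earlier coin example):

```python
# Combinatoric pieces: one specific arrangement of n successes,
# times the number of such arrangements C(N, n).
from math import comb

N, n, p = 10, 5, 0.5

specific = p**n * (1 - p)**(N - n)  # one particular arrangement
ways = comb(N, n)                   # C(N, n) arrangements
print(ways, ways * specific)        # 252 and ≈ 0.246
```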

18 Binomial Distribution
Treat events as a Bernoulli process with discrete trials.
- N separate trials
- Trials independent
- Binary outcome for each trial
- Probability the same for all trials
The general form is the binomial distribution: P(n) = C(N, n) p^n (1 − p)^(N − n).
- Terms are the same as in the binomial expansion
- Probabilities normalized: Σ_n P(n) = 1
(mathworld.wolfram.com)
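A minimal sketch of the binomial pmf, checking the normalization claim over n = 0 … N:

```python
# Binomial distribution P(n) = C(N,n) p^n (1-p)^(N-n).
from math import comb

def binom_pmf(n, N, p):
    return comb(N, n) * p**n * (1 - p)**(N - n)

N, p = 10, 0.5
pmf = [binom_pmf(n, N, p) for n in range(N + 1)]

print(sum(pmf))              # ≈ 1.0: normalized
print(binom_pmf(5, 10, 0.5)) # ≈ 0.246: 5 heads in 10 coins
```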

19 Mean and Standard Deviation
The mean μ of the binomial distribution: μ = Σ_n n P(n) = Np.
- Consider an arbitrary x, differentiate, and set x = 1.
The standard deviation σ of the binomial distribution: σ^2 = ⟨n^2⟩ − ⟨n⟩^2 = Np(1 − p).
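The "differentiate and set x = 1" step can be written out; this is the standard generating-function derivation, reconstructing the equations the slide showed as images:

```latex
% Start from the binomial theorem with a bookkeeping variable x:
\sum_{n=0}^{N} \binom{N}{n} (px)^{n} (1-p)^{N-n} = (px + 1 - p)^{N}
% Differentiate both sides with respect to x:
\sum_{n=0}^{N} n \binom{N}{n} p^{n} x^{n-1} (1-p)^{N-n}
    = N p \,(px + 1 - p)^{N-1}
% Setting x = 1 leaves the mean on the left:
\mu = \sum_{n} n P(n) = N p
% Differentiating twice and setting x = 1 similarly gives
\sigma^{2} = \langle n^{2} \rangle - \mu^{2} = N p (1 - p)
```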

20 Poisson Distribution
Many processes are marked by rare occurrences.
- Large N, small n, small p
This is the Poisson distribution: P(n) = μ^n e^{−μ} / n!.
- Probability depends on only one parameter, μ = Np
- Normalized when summed from n = 0 to ∞
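A sketch of the Poisson pmf and its agreement with the binomial distribution in the rare-event limit; the values N = 1000, p = 0.002 are illustrative choices, not from the slide:

```python
# Poisson pmf P(n) = mu^n e^{-mu} / n!, compared with the exact
# binomial value for large N and small p (illustrative numbers).
from math import comb, exp, factorial

def poisson_pmf(n, mu):
    return mu**n * exp(-mu) / factorial(n)

N, p = 1000, 0.002
mu = N * p                  # 2.0
binom = comb(N, 3) * p**3 * (1 - p)**(N - 3)

print(poisson_pmf(3, mu), binom)   # nearly equal
```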

21 Poisson Properties
The mean and standard deviation are simply related.
- Mean μ = Np; variance σ^2 = μ
Unlike the binomial distribution, the Poisson function has values for n > N.

22 Poisson Away From Zero
The Poisson distribution is based on the mean μ = Np.
- Assumed N >> 1, N >> n
Now assume that n >> 1, μ is large, and P_n >> 0 only over a narrow range.
This generates a normal or Gaussian distribution.
- Let x = n − μ and use Stirling's formula.
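The limit sketched above can be filled in; the following is the standard outline of the steps the slide names (Stirling's formula, then expansion in x = n − μ):

```latex
% Apply Stirling's formula n! \approx \sqrt{2\pi n}\, n^{n} e^{-n}:
P(n) = \frac{\mu^{n} e^{-\mu}}{n!}
     \approx \frac{\mu^{n} e^{-\mu}}{\sqrt{2\pi n}\, n^{n} e^{-n}}
% With x = n - \mu, expand \ln P to second order in x/\mu:
\ln P \approx -\frac{x^{2}}{2\mu} + \text{const}
% which is a normal distribution with variance \sigma^{2} = \mu:
P(x) \approx \frac{1}{\sqrt{2\pi\mu}}\, e^{-x^{2}/2\mu}
```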

23 Normal Distribution
The full normal distribution separates the mean μ and standard deviation σ parameters: P(x) = (1 / √(2πσ^2)) e^{−(x−μ)^2 / 2σ^2}.
Tables provide the integral of the distribution function.
Useful benchmarks:
- P(|x − μ| < 1σ) = 0.683
- P(|x − μ| < 2σ) = 0.954
- P(|x − μ| < 3σ) = 0.997
[Figure: normal distribution P(x) centered at μ with width σ]
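The tabulated benchmarks can be reproduced from the error function, since P(|x − μ| < kσ) = erf(k/√2) for a normal distribution:

```python
# 1-, 2-, and 3-sigma coverage of the normal distribution
# via the error function: P(|x - mu| < k*sigma) = erf(k / sqrt(2)).
from math import erf, sqrt

for k in (1, 2, 3):
    print(k, round(erf(k / sqrt(2)), 3))  # 0.683, 0.954, 0.997
```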

