Presentation is loading. Please wait.

Presentation is loading. Please wait.

Basic statistics Usman Roshan.

Similar presentations


Presentation on theme: "Basic statistics Usman Roshan."— Presentation transcript:

1 Basic statistics Usman Roshan

2 Basic probability and stats
Random variable Probability of an event Coin toss example Independent random variables Mean and variance of a random variable Correlation between random variables Probability distributions Central limit theorem

3 Random variable A variable normally takes on different values
Random variable has values with different probabilities Coin toss example Dice example Probabilities must sum to 1

4 Probability of event Sample space: set of total possible outcomes
Event space: set of outcomes of interest Probability of an event is (size of event space)/(size of sample space) Counting: how many ways to pick k unique items from a set of n items? Probability and counting Bernoulli trials Coin tossing example R function: rbinom

5 Basic stats Independent events: coin toss example
Expected value of a random variable – example of Bernoulli and Binomal Variance of a random variable Correlation coefficient (same as Pearson correlation coefficient) Formulas: Covariance(X,Y) = E((X-μX)(Y-μY)) Correlation(X,Y)= Covariance(X,Y)/σXσY Pearson correlation

6 Probability distributions
Binomial distribution sum of Bernoulli trials converges to Gaussian as number of trials approaches infinity R functions Reference card rbinom Repeated executions of rbinom Lists for loops sum, length Gaussian distribution Chi-square distribution

7 Limit theorems Law of large numbers: empirical mean converges to true mean as we do more trials (follows from Chebyshev’s and Markov’s inequalities)

8 Law of large numbers Markov’s inequality Chebyshev’s inequality
Law of large numbers: sample mean of n i.i.d. random variables Xi converges to true one in probability Can be proved by applying Chebyshev’s inequality

9 Limit theorems Central limit theorem: average of sampling distribution converges to a normal distribution as we do more trials. Specifically, it is normally distributed with mean equal to the true mean μ and standard deviation equal to σ/sqrt(n) where n is number of trials and σ is true standard deviation How is this useful? Consider modeling the mean height of NJ residents. Can we assume it is normally distributed due to Central Limit Theorem?


Download ppt "Basic statistics Usman Roshan."

Similar presentations


Ads by Google