Teaching Basic Statistics with R: An Introduction to Interactive Packages Shuen-Lin Jeng National Cheng Kung University.



Outline. Teaching basic statistics: – Law of Large Numbers – Central Limit Theorem. The R interactive packages: – LargeSample – LargeSampleV2.1 – C. Joseph Lu, Associate Professor, National Cheng Kung University

A probability / statistics event seen in daily life

Questions. Could the past frequencies of the numbers help in winning the jackpot? If the lottery is “fair”, should the frequencies of the numbers get closer to each other over the years? ANS: By the Law of Large Numbers. Does the lottery favor or disfavor certain numbers? Is the lottery “fair”? ANS: By the Central Limit Theorem.

Simplify the question: Is the coin fair? Toss a coin 1 to 10 times and calculate the proportion of heads.

Keep tossing, up to 50 times

Keep tossing, up to 1000 times

The Law of Large Numbers. Bernoulli (1713), in Ars Conjectandi (“The Art of Guessing”), proved that for X1, …, Xn independent and Bernoulli distributed B(1, p), with sample mean X̄n = (X1 + … + Xn)/n, for all ε > 0, P(|X̄n − p| > ε) → 0 as n → ∞. The result actually holds for any independent, identically distributed random variables with finite expectation. Loosely speaking, for a sample collected by repeating the same experiment, the sample mean will be close to the population mean when the sample size is large.
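The coin-tossing slides above can be imitated with a few lines of code. This is a minimal Python sketch, not the LargeSample package itself; the function name `running_proportion`, the seed, and the checkpoints 10, 50 and 1000 are illustrative assumptions taken from the slide titles.

```python
import random

def running_proportion(n_tosses, p=0.5, seed=1):
    """Return the proportion of heads after each of n_tosses tosses."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    heads = 0
    proportions = []
    for i in range(1, n_tosses + 1):
        heads += rng.random() < p      # True counts as 1 head
        proportions.append(heads / i)
    return proportions

props = running_proportion(1000)
for n in (10, 50, 1000):
    print(f"after {n:4d} tosses: proportion of heads = {props[n - 1]:.3f}")
```

For a fair coin the printed proportions should wander at first and then settle near 0.5, which is what the law of large numbers describes.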

How large? Toss 30 times? Use simulations to see the effect of sample size.

50 simulations, each tossing 1000 times. We may conclude that it is not a fair coin.

For a fair coin, will the head count get closer to 0.5n? Simulate 100 times.

A closer look

Question: If the lottery is “fair”, should the frequencies of the numbers get closer to each other after years of games? Answer: not necessarily. The law of large numbers says that, for a fair experiment, the sample mean (the proportion of heads) gets close to the expected value (the population mean). It makes no such claim about the counts themselves, so the frequencies may or may not get closer.

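Actually, in the long run, the probability that the count lies farther than any fixed distance from its expected value tends to 1! The standard deviation of the head count grows like sqrt(n), even though the standard deviation of the proportion shrinks like 1/sqrt(n).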
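The contrast between counts and proportions can be checked by simulation. The sketch below is an illustration in Python under assumed settings (100 simulations per sample size, a fixed seed, and the hypothetical helper `mean_abs_deviations`); each bit returned by `getrandbits` is treated as one fair coin toss.

```python
import random

def mean_abs_deviations(n_tosses, n_sims=100, seed=1):
    """Average |heads - 0.5 n| over n_sims runs, for the count and the proportion."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        # n_tosses random bits = n_tosses fair coin tosses; count the 1s
        heads = bin(rng.getrandbits(n_tosses)).count("1")
        total += abs(heads - 0.5 * n_tosses)
    count_dev = total / n_sims
    return count_dev, count_dev / n_tosses

for n in (100, 10_000, 1_000_000):
    count_dev, prop_dev = mean_abs_deviations(n)
    print(f"n={n:>9,d}: mean |count - 0.5n| = {count_dev:8.1f}, "
          f"mean |proportion - 0.5| = {prop_dev:.5f}")
```

As n grows, the deviation of the proportion should shrink while the deviation of the count keeps growing, which is why the frequencies of a fair game need not get closer to each other.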

Mice under a certain dosage of a treatment: what is the average lifetime in weeks?

Increase the sample size to 30 mice

Increase the sample size to 100 mice (money?). What is the sampling distribution of the average lifetime?

Sampling dist. of the sample mean: 200 simulations. Suppose the population is exponential (rate = 0.1, mean = 10).

Look at the sampling distribution with sample size 5

Look at the sampling distribution with sample size 30

Look at the sampling distribution with sample size 50
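The three sample sizes above can be compared numerically as well as graphically. The following is a rough Python sketch, assuming the exponential(rate = 0.1) population and 200 simulations stated on the earlier slide; the helper name `simulate_sample_means` and the seed are made up for illustration. The column "CLT sd" is the value 10/sqrt(n) that the central limit theorem suggests for the spread of the sample mean.

```python
import random
import statistics

def simulate_sample_means(sample_size, n_sims=200, rate=0.1, seed=1):
    """Draw n_sims samples from Exp(rate) and return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(sample_size))
        for _ in range(n_sims)
    ]

for n in (5, 30, 50):
    means = simulate_sample_means(n)
    print(f"n={n:2d}: mean of sample means = {statistics.fmean(means):6.2f}, "
          f"sd = {statistics.stdev(means):5.2f}, "
          f"CLT sd = {10 / n ** 0.5:5.2f}")
```

A histogram of `means` for each n would show the skewness of the exponential population fading as the sample size increases.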

The Central Limit Theorem. Lindeberg Central Limit Theorem: If a sequence of independent random variables has zero means and finite variances (possibly different), and the distribution functions satisfy the Lindeberg condition, then the distribution functions of the normalized sums tend to the standard normal. (Probability Theory, Yuan Shih Chow, Henry Teicher, 1988) Lindeberg condition? A light-tail condition.

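The Central Limit Theorem. When the sample size n is large, the sample mean X̄ is approximately N(μ, σ²/n). That is, sqrt(n)(X̄ − μ)/σ is approximately N(0, 1). For the power ball number, μ = p = 1/39, σ = sqrt(p(1−p)), n = 231.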

Lottery Numbers. Does the lottery favor or disfavor certain numbers? Is the lottery “fair”? ANS: – By the CLT, under the assumption of a fair game, the reasonable range of a number's frequency can be approximated. – The range can also be calculated from the binomial distribution. – If a number's frequency falls far outside the reasonable range after a long period of games, we would suspect the fairness of the game.
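A worked version of this calculation, as a Python sketch with p = 1/39 and n = 231 from the slide above. The 95% level (z = 1.96) is an assumption, since the slides do not say what "reasonable" means, and so is the reading of n as the number of draws in which a given number either appears or not. The function name `fair_count_ranges` is hypothetical.

```python
import math

def fair_count_ranges(n=231, p=1 / 39, z=1.96, level=0.95):
    """Range of plausible counts for one number under a fair game."""
    expected = n * p
    sd = math.sqrt(n * p * (1 - p))
    # Normal (CLT) approximation: expected count +/- z standard deviations
    normal_range = (expected - z * sd, expected + z * sd)
    # Exact binomial quantiles: smallest k whose CDF reaches each tail level
    tail = (1 - level) / 2
    cdf = 0.0
    low = high = None
    for k in range(n + 1):
        cdf += math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        if low is None and cdf >= tail:
            low = k
        if high is None and cdf >= 1 - tail:
            high = k
            break
    return expected, sd, normal_range, (low, high)

expected, sd, normal_range, binom_range = fair_count_ranges()
print(f"expected count = {expected:.2f}, sd = {sd:.2f}")
print(f"CLT range      = ({normal_range[0]:.2f}, {normal_range[1]:.2f})")
print(f"binomial range = {binom_range}")
```

A number whose observed count falls well outside both ranges would be a reason to look more closely at the game; a different confidence level only changes `z` and `level`.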

Will the sampling dist. of the sample mean always go to normal? Population Cauchy(0,1), 200 simulations
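The Cauchy case can be explored with a short simulation. This Python sketch draws Cauchy(0,1) values by the inverse-CDF formula tan(π(U − 0.5)) and reports the interquartile range of the sample means; the sample sizes, the seed and the helper name `cauchy_sample_mean_iqr` are illustrative assumptions.

```python
import math
import random
import statistics

def cauchy_sample_mean_iqr(sample_size, n_sims=200, seed=1):
    """Interquartile range of n_sims sample means from Cauchy(0,1)."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_sims):
        # Inverse-CDF draw: tan(pi * (U - 0.5)) is standard Cauchy
        draws = [math.tan(math.pi * (rng.random() - 0.5))
                 for _ in range(sample_size)]
        means.append(statistics.fmean(draws))
    q1, _median, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

for n in (5, 30, 500):
    print(f"n={n:3d}: IQR of sample means = {cauchy_sample_mean_iqr(n):.2f}")
```

The mean of n independent Cauchy(0,1) variables is again Cauchy(0,1), whose interquartile range is 2, so the spread should not shrink as n grows. The Cauchy distribution has no finite mean or variance, so neither the law of large numbers nor the central limit theorem applies.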

Sampling dist. of sample variance Population U(0,1), Sample size 30

Sampling dist. of sample maximum Population U(0,1), Sample size 30
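The same simulation idea works for statistics other than the mean. Below is a Python sketch for the sample variance and sample maximum of U(0,1) samples of size 30; the 200 simulations, the seed and the helper name `simulate_variance_and_max` are assumptions. The reference values 1/12 (the variance of U(0,1)) and n/(n+1) (the expected maximum of n uniform draws) are standard results.

```python
import random
import statistics

def simulate_variance_and_max(sample_size=30, n_sims=200, seed=1):
    """Return lists of sample variances and sample maxima from U(0,1)."""
    rng = random.Random(seed)
    variances, maxima = [], []
    for _ in range(n_sims):
        sample = [rng.random() for _ in range(sample_size)]
        variances.append(statistics.variance(sample))  # divides by n - 1
        maxima.append(max(sample))
    return variances, maxima

variances, maxima = simulate_variance_and_max()
print(f"mean sample variance = {statistics.fmean(variances):.4f} "
      f"(population variance 1/12 = {1 / 12:.4f})")
print(f"mean sample maximum  = {statistics.fmean(maxima):.4f} "
      f"(expected 30/31 = {30 / 31:.4f})")
print(f"largest maximum seen = {max(maxima):.4f} (can never exceed 1)")
```

The maxima pile up just below 1 with a long tail to the left, so a histogram of `maxima` is a quick way to see a sampling distribution that is not close to normal.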

How about censored data? LargeSampleV2.1 – Single right censoring – Random right censoring – Estimation of the mean and median by the Kaplan-Meier estimator of the survival function: KMmean and KMmedian

50% right censoring from Exp(1): sampling distribution of the sample mean

50% right censoring from Exp(1): sampling distribution of the sample median

50% right censoring from Exp(1): sampling distribution of the mean from the Kaplan-Meier survival estimate

50% right censoring from Exp(1): sampling distribution of the median from the Kaplan-Meier survival estimate

Exp(1) with random right censoring from Exp(1): sampling distribution of the median from the Kaplan-Meier survival estimate
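The random-censoring case can be sketched without the package. The Python code below is a hand-written Kaplan-Meier estimator, not the KMmedian function of LargeSampleV2.1, whose interface is not shown in the slides. The helper names, the sample size of 30, the 200 simulations and the seed are all assumptions. The median is taken as the first time at which the estimated survival function drops to 0.5 or below.

```python
import random
import statistics

def km_survival(times, events):
    """Kaplan-Meier curve as (time, survival) pairs at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv, curve, i = 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:   # group tied times
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            surv *= (at_risk - deaths) / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def km_median(times, events):
    """First time the KM survival estimate is <= 0.5, or None if never."""
    for t, s in km_survival(times, events):
        if s <= 0.5:
            return t
    return None

def simulate_medians(sample_size=30, n_sims=200, seed=1):
    """Naive and KM medians for Exp(1) lifetimes censored by Exp(1) times."""
    rng = random.Random(seed)
    naive, km = [], []
    for _ in range(n_sims):
        life = [rng.expovariate(1.0) for _ in range(sample_size)]
        cens = [rng.expovariate(1.0) for _ in range(sample_size)]
        observed = [min(t, c) for t, c in zip(life, cens)]
        events = [t <= c for t, c in zip(life, cens)]  # True = death observed
        naive.append(statistics.median(observed))
        m = km_median(observed, events)
        if m is not None:                               # skip unreached medians
            km.append(m)
    return naive, km

naive, km = simulate_medians()
print(f"true median of Exp(1)      = 0.693")
print(f"mean naive sample median   = {statistics.fmean(naive):.3f}")
print(f"mean Kaplan-Meier median   = {statistics.fmean(km):.3f}")
```

The naive median treats censored times as if they were deaths and so should come out too small, while the Kaplan-Meier median should land much closer to ln 2 ≈ 0.693.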