BIOS 501 Lecture 3 Binomial and Normal Distribution


BIOS 501 Lecture 3 Binomial and Normal Distribution Roderick Little

The binomial and normal distributions
Density curves; binomial distribution for counts; normal distributions; the 68-95-99.7 rule; the standard normal distribution; normal distribution calculations; standardizing observations; normal quantile plots. (IPS Section 1.3) Biostat 501 Lecture 3

Inference for a population based on a sample
Statistical inference is the process of making inferences about parameters of a population based on sample data. The distribution of values of X in the population (assumed large) is called the sampling distribution of X. Two important sampling distributions are the binomial distribution and the normal distribution. Sample: mean x̄, SD s. Population: mean μ, SD σ.

Sampling distributions
A random variable is a variable whose value is a numerical outcome of a random phenomenon. Outcomes can be made into random variables by coding them as numerical values; e.g. in coin tossing, define the random variable X = 1 for a head and X = 0 for a tail. Then the mean of X is the proportion of heads.
A discrete random variable X has a finite (or countable) number of possible values. The probability distribution of X lists the values and their probabilities.
A continuous random variable X takes all the values in an interval of numbers. The probability distribution of X is described by a probability density function (pdf). The probability of any event is the area under the pdf for the set of values of X that make up the event.

Binomial Distribution
Consider a discrete random variable with just two outcomes (S, F), with Pr(S) = 1 - Pr(F) = p. Let the random variable x be the number of S's in n independent trials. The sample space of outcomes is {0, 1, 2, …, n}. x follows a binomial sampling distribution, and we write x ~ Bin(n, p).
Examples: (1) x = number of heads in 15 tosses of a fair coin, so x ~ Bin(15, 0.5); (2) in a crossover trial with 30 subjects, x = number of subjects for which new treatment A is better than control treatment B, so x ~ Bin(30, p) with p = Pr(A better than B).
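The slide does not state the binomial probabilities themselves. For reference, the standard closed form (not taken from the slides) is given below, together with one worked value for the coin-tossing example; the choice of k = 7 is only an illustration.

```latex
\Pr(x = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n

% Worked value for x ~ Bin(15, 0.5) and k = 7:
\Pr(x = 7) = \binom{15}{7} (0.5)^{15} = \frac{6435}{32768} \approx 0.196
```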

Two binomial distributions
[Figure: bar charts of Pr(x) against x = 0, 1, …, 6 for two binomial distributions, one with n=6, p=0.5 and one with n=6, p=0.7]
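The two distributions on this slide can be tabulated directly. The sketch below is not part of the lecture; `binomial_pmf` is a hypothetical helper name, and it uses the standard binomial probability formula with Python's `math.comb`.

```python
from math import comb


def binomial_pmf(n, p):
    """Return [Pr(x = 0), ..., Pr(x = n)] for x ~ Bin(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


pmf_fair = binomial_pmf(6, 0.5)    # symmetric about x = 3
pmf_skewed = binomial_pmf(6, 0.7)  # probability shifted toward larger x

# Print the two distributions side by side, one row per value of x.
for k, (a, b) in enumerate(zip(pmf_fair, pmf_skewed)):
    print(f"x={k}  p=0.5: {a:.3f}  p=0.7: {b:.3f}")
```

With p = 0.5 the largest probability is at x = 3; with p = 0.7 it moves to x = 4.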

Probability density function for a continuous random variable
As the sample size n increases, the histogram gets closer and closer to the density curve.

Probability density function
As the sample size increases, the histogram tends to a probability density function (density curve), reflecting the distribution in the population. The density curve lies on or above the horizontal axis and has area exactly 1 underneath it. The area under the curve over a given range is the probability of a random observation falling in that range.

Normal distribution
A good description of many real continuous variables (test scores, crop yields, height). Symmetric, unimodal, and bell-shaped. Characterized by its mean μ and s.d. σ: the mean (equal to the median) is the center, and the s.d. measures spread. Approximates many other distributions well.

Normal distributions

Normal distribution
We write X ~ N(µ, σ) if X follows a normal distribution with mean µ and standard deviation σ. N(0,1) is called the standard normal distribution (mean 0, s.d. 1). Formula for density (not important in this class): $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
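The density-curve properties (total area 1, area over a range equals probability) can be checked numerically for N(0,1). This is a rough sketch rather than anything from the lecture: `area_under_pdf` is a hypothetical helper using the trapezoid rule, and the range -10 to 10 stands in for the whole real line.

```python
from statistics import NormalDist


def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the area under dist.pdf between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width


standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Total area: -10 to 10 is an assumed stand-in for the whole axis.
total_area = area_under_pdf(standard_normal, -10.0, 10.0)

# Area over a range, compared with the probability from the cdf.
area_0_to_1 = area_under_pdf(standard_normal, 0.0, 1.0)
prob_0_to_1 = standard_normal.cdf(1.0) - standard_normal.cdf(0.0)

print(round(total_area, 6), round(area_0_to_1, 4), round(prob_0_to_1, 4))
```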

The 68-95-99.7 rule for normal distribution / data
Approximately 68% of the observations fall within σ of the mean μ. Approximately 95% of the observations fall within 2σ of the mean μ. Approximately 99.7% of the observations fall within 3σ of the mean μ. Suppose the mean is 0 and the standard deviation is 1: the three intervals are then (-1, 1), (-2, 2), and (-3, 3).
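The three percentages in the rule are rounded. As a quick check that is not part of the slides, the areas can be computed from the standard normal cdf with Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean, k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} s.d.: {100 * prob:.2f}%")
```

The computed values are close to 68%, 95%, and 99.7%, which is why the rule is stated as approximate.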


Example
Heights of young women aged 18 to 24: mean μ = 64.5 inches, s.d. σ = 2.5 inches.
μ + σ = 67, μ - σ = 62: 68% in (62, 67)
μ + 2σ = 69.5, μ - 2σ = 59.5: 95% in (59.5, 69.5)
μ + 3σ = 72, μ - 3σ = 57: 99.7% in (57, 72)


How short are the shortest 2.5% of women? Less than 59.5 inches. How tall are the tallest 2.5% of women? Taller than 69.5 inches. (By symmetry, the 5% of women outside (59.5, 69.5) split evenly between the two tails.) If data are normal, the full distribution is determined by the mean and s.d. How about this question: what percent of women are taller than 61 inches? Answering it needs more detailed tables.
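The lecture answers the last question with printed tables. As an alternative sketch, assuming the N(64.5, 2.5) model for heights, the same quantities can be computed in software; the variable names here are illustrative only.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)  # inches, from the heights example

# Standardized value of 61 inches and the share of women taller than that.
z_61 = (61 - heights.mean) / heights.stdev
share_taller_than_61 = 1 - heights.cdf(61)

# Exact 2.5% cutoffs, for comparison with the rule-of-thumb 59.5 and 69.5.
exact_lower_cutoff = heights.inv_cdf(0.025)
exact_upper_cutoff = heights.inv_cdf(0.975)

print(f"z = {z_61:.2f}, share taller than 61 in: {share_taller_than_61:.4f}")
print(f"2.5% cutoffs: {exact_lower_cutoff:.1f} and {exact_upper_cutoff:.1f}")
```

Under this model roughly 92% of women are taller than 61 inches. The exact cutoffs come out slightly inside the rule-of-thumb values because 95% corresponds to about 1.96σ rather than exactly 2σ.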

Finding probabilities for normal data
Tables for the standard normal distribution N(0,1) are available (see T-2 and T-3 at the back of the text). We will first learn how to find probabilities for standard normal data, then learn to find probabilities for a normal distribution with any mean and s.d.


Examples
What proportion of observations on a standard normal variable Z take values (a) less than 2.2? (b) greater than -2.05? Find the 25th percentile of the N(0,1) curve. We will work on many examples other than the ones posted here.
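These three questions are meant to be answered from the standard normal table. A software check, offered only as a sketch alongside the table lookup, might look like this.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)

# (a) Proportion of values less than 2.2: area to the left of 2.2.
p_less_than_2_2 = std_normal.cdf(2.2)

# (b) Proportion greater than -2.05: one minus the area to the left of -2.05.
p_greater_than_neg_2_05 = 1 - std_normal.cdf(-2.05)

# 25th percentile: the z value with area 0.25 to its left.
percentile_25 = std_normal.inv_cdf(0.25)

print(f"{p_less_than_2_2:.4f} {p_greater_than_neg_2_05:.4f} {percentile_25:.3f}")
```

The results are approximately 0.9861, 0.9798, and -0.674, which should agree with the table to its printed precision.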

Standardizing
The z-score is the standardized value of x (how many standard deviations x lies from the mean). Subtract the mean and divide by the standard deviation: z = (x - µ)/σ

Finding probabilities for normal data
The standardized values for any distribution always have mean 0 and standard deviation 1. If the original distribution is normal, the standardized values have a normal distribution with mean 0 and standard deviation 1. This is called the standard normal distribution. General normal: N(µ, σ); standard normal: N(0,1).
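The claim that standardized values always have mean 0 and standard deviation 1 can be illustrated on a small data set. The numbers below are made up for illustration, `standardize` is a hypothetical helper, and the sample mean and sample s.d. stand in for µ and σ.

```python
from statistics import mean, stdev


def standardize(values):
    """Subtract the sample mean and divide by the sample s.d."""
    center = mean(values)
    spread = stdev(values)
    return [(v - center) / spread for v in values]


# Hypothetical heights in inches, not data from the lecture.
data = [59.0, 62.5, 64.0, 64.5, 66.0, 68.5, 71.0]
z_scores = standardize(data)

print([round(z, 2) for z in z_scores])
print(round(mean(z_scores), 6), round(stdev(z_scores), 6))
```

Standardizing changes the center and scale only; it does not make non-normal data normal.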

Standardizing to find probabilities for normal data
X ~ N(µ, σ); what proportion of the population is less than x*? (1) Convert x* to the standardized value z* = (x* - µ)/σ. (2) Find P = Pr(Z < z*) from the N(0,1) table.

Example
In Y2K the scores of students taking SATs were approximately normal with mean 1019 and standard deviation 209. What percent of all students had SAT scores of at least 820 (the limit for Division I athletes to compete in their first college year)?
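The two-step recipe (standardize, then look up the N(0,1) area) can be sketched in code for this example. The cdf call replaces the table lookup, and the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 1019, 209  # SAT mean and s.d. from the example
cutoff = 820

# Step 1: convert the cutoff to a standardized value.
z_star = (cutoff - mu) / sigma

# Step 2: "at least 820" is the area to the right of z_star under N(0, 1).
share_at_least_820 = 1 - NormalDist().cdf(z_star)

print(f"z* = {z_star:.2f}, share scoring at least 820: {share_at_least_820:.3f}")
```

Under the stated normal model, about 83% of students scored at least 820.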

Standardizing to find percentiles for normal data

Example
In Y2K the scores of students taking SATs were approximately normal with mean 1019 and standard deviation 209. How high must a student score in order to place in the top 20% of all students taking the SAT? Answer: the required value x* is the 80th percentile of the distribution.
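Finding a percentile reverses the standardizing step: look up the z value for the required area, then convert back with x* = µ + z*σ. The sketch below assumes that standard inverse relation, which the transcript does not spell out, and uses `inv_cdf` in place of a reverse table lookup.

```python
from statistics import NormalDist

mu, sigma = 1019, 209  # SAT mean and s.d. from the example

# z value with 80% of the N(0, 1) area to its left.
z_star = NormalDist().inv_cdf(0.80)

# Undo the standardization to get back to the SAT scale.
x_star = mu + z_star * sigma

print(f"z* = {z_star:.4f}, x* = {x_star:.1f}")
```

A student needs a score of roughly 1195 to be in the top 20% under this model.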

Normal quantile plots
Arrange the data from smallest to largest and record the corresponding sample percentiles. Find z-scores for these percentiles (for example, the z-score for the 5th percentile is z = -1.645). Plot each data point (Y) against the corresponding z (X). If the data distribution is close to normal, the plotted points will lie close to a straight line. Deviations from a straight line are evidence that the data are not normal.
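The construction above can be sketched as a function that returns the (z, y) pairs to plot. Two things here are assumptions rather than the lecture's choices: the percentile for the i-th smallest of n values is taken as (i - 0.5)/n, one common convention among several, and the data values are made up. `normal_quantile_points` is a hypothetical name.

```python
from statistics import NormalDist


def normal_quantile_points(data):
    """Return (z, y) pairs: each sorted data value with its normal z-score."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, y in enumerate(ordered, start=1):
        percentile = (i - 0.5) / n  # assumed plotting-position convention
        points.append((std_normal.inv_cdf(percentile), y))
    return points


# Hypothetical heights in inches, not data from the lecture.
sample = [66.0, 59.0, 64.5, 62.5, 71.0]
points = normal_quantile_points(sample)

for z, y in points:
    print(f"z = {z:6.3f}  y = {y}")
```

Passing these pairs to any plotting tool, with z on the horizontal axis, gives the normal quantile plot; judging closeness to a straight line is then done by eye.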

Newcomb’s data

Newcomb’s data without outliers

IQ scores of seventh-grade students

Summary
Hypothesized mathematical models for distributions: probability density curves. Normal distribution: the 68-95-99.7 (empirical) rule. Evaluating probabilities for the standard normal distribution. Evaluating probabilities for a normal distribution with any mean and s.d. by using standardized z-scores. Normal quantile (probability) plot.