Modeling Continuous Variables

1 Modeling Continuous Variables
Lecture 19 Section Fri, Oct 6, 2006

2 Models Mathematical model – An abstraction and, therefore, a simplification of a real situation, one that retains the essential features. Real situations are usually much too complicated to deal with in all their details.

3 Example The “bell curve” is a model (an abstraction) of many populations. Real populations have all sorts of bumps and twists and irregularities. The bell curve is smooth and perfectly symmetric. In statistics, the bell curve is called the normal curve, or normal distribution.

4 Models Our models will be models of distributions, presented either as histograms or as continuous distributions.

5 Histograms and Area In a histogram, frequency is represented by area.
Consider the following distribution of test scores.
Grade     Frequency
60 – 69   3
70 – 79   8
80 – 89   9
90 – 99   5

6 Histograms and Area [Histogram of the test scores: Frequency (vertical axis, 0 to 10) against Grade (horizontal axis, 60 to 100).]

7 Histograms and Area What is the total area of this histogram?
We will rescale the vertical scale so that the total area equals 1, representing 100%.

8 Histograms and Area To achieve this, we divide the frequencies by the original area to get the density.
Grade     Frequency   Density
60 – 69   3           0.012
70 – 79   8           0.032
80 – 89   9           0.036
90 – 99   5           0.020
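
The rescaling described here can be checked numerically. The following is a minimal sketch, assuming each class is drawn as a bar of width 10 (60 to 70, 70 to 80, and so on, as on the histogram's axis); the variable names are illustrative and not part of the lecture.

```python
# Class boundaries and frequencies from the test-score table.
# Assumption: each bar spans a full width of 10 on the Grade axis.
classes = [(60, 70), (70, 80), (80, 90), (90, 100)]
frequencies = [3, 8, 9, 5]

# Area of each bar in the original histogram = width * frequency.
widths = [hi - lo for lo, hi in classes]
total_area = sum(w * f for w, f in zip(widths, frequencies))

# Dividing each frequency by the original total area gives the density.
densities = [f / total_area for f in frequencies]

# After rescaling, the bar areas add up to 1 (that is, 100%).
rescaled_area = sum(w * d for w, d in zip(widths, densities))

print(total_area)               # 250
print(densities)                # [0.012, 0.032, 0.036, 0.02]
print(round(rescaled_area, 6))  # 1.0
```

The original area is 250, so each frequency is divided by 250; for example, 3 / 250 = 0.012.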

9 Histograms and Area [Density histogram: Density (vertical axis, 0 to 0.040) against Grade (horizontal axis, 60 to 100).]

10 Histograms and Area [Density histogram: Density (vertical axis, 0 to 0.040) against Grade (horizontal axis, 60 to 100), annotated "Total area = 1".]

11 Histograms and Area This histogram has the special property that a proportion can be found by computing the area of the corresponding rectangles. For example, what proportion of the grades are less than 80? Compute: (10 × 0.012) + (10 × 0.032) = 0.12 + 0.32 = 0.44 = 44%.
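
The same area calculation can be written as a small function. This is a sketch only: the function name `proportion_between` is hypothetical, and for cut points that fall inside a class it assumes the grades are spread evenly across that class.

```python
# Density histogram from the table: (lower edge, upper edge, density).
bars = [(60, 70, 0.012), (70, 80, 0.032), (80, 90, 0.036), (90, 100, 0.020)]

def proportion_between(a, b):
    """Area of the density histogram that lies between a and b."""
    area = 0.0
    for lo, hi, density in bars:
        # Width of the part of this bar that falls inside [a, b].
        overlap = max(0.0, min(b, hi) - max(a, lo))
        area += overlap * density
    return area

below_80 = proportion_between(60, 80)
print(round(below_80, 2))  # 0.44
```

Calling `proportion_between(60, 100)` returns the total area, which is 1 up to floating-point rounding.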

12 Density Functions This is the fundamental property that connects the graph of a continuous model to the population that it represents, namely: The area under the graph between two numbers a and b on the x-axis represents the proportion of the population that lies between a and b. AREA = PROPORTION

15 Density Functions Now consider an arbitrary distribution.
The area under the curve between a and b is the proportion of the values of x that lie between a and b. [Figure: density curve with a and b marked on the x-axis, labeled Area = Proportion.]
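
For a smooth curve, the area has to be found by integration rather than by adding rectangles exactly. Below is a minimal sketch using a hypothetical density, f(x) = 2x on the interval from 0 to 1, chosen only because its areas are easy to check by hand; the midpoint rule stands in for whatever method one prefers.

```python
def density(x):
    """A hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

total = area_under(density, 0.0, 1.0)       # whole population
upper_half = area_under(density, 0.5, 1.0)  # proportion with 0.5 <= x <= 1

print(round(total, 6))       # 1.0
print(round(upper_half, 6))  # 0.75
```

Under this illustrative density, 75% of the population lies between 0.5 and 1, and the total area is 1, as any density requires.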

17 Density Functions Again, the total area under the curve must be 1, representing a proportion of 100%. [Figure: density curve with a and b marked on the x-axis; the whole area is labeled 100%.]

18 The Normal Distribution
Normal distribution – The statistician’s name for the bell curve. It is a density function in the shape of a “bell.” Symmetric. Unimodal. Extends over the entire real line (no endpoints). “Main part” lies within 3σ of the mean.

19 The Normal Distribution
The curve has a bell shape, with infinitely long tails in both directions.

20 The Normal Distribution
The mean μ is located in the center, at the peak.

21 The Normal Distribution
The “main” part of the curve is 6 standard deviations wide (3 standard deviations each way from the mean). [Figure: normal curve with μ – 3σ and μ + 3σ marked on the axis.]

22 The Normal Distribution
The area under the entire curve is 1. (The area outside of 3 standard deviations is approximately 0.0027.) [Figure: normal curve labeled Area = 1, with μ – 3σ and μ + 3σ marked on the axis.]
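
The area inside and outside 3 standard deviations can be computed directly. This sketch uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later) and works with the standard normal curve, since the answer is the same for any mean and standard deviation.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

# Area between 3 standard deviations below and above the mean.
within_3 = z.cdf(3) - z.cdf(-3)
# The total area is 1, so the two tails together hold the rest.
outside_3 = 1 - within_3

print(round(within_3, 4))   # 0.9973
print(round(outside_3, 4))  # 0.0027
```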

23 The Normal Distribution
The normal distribution with mean μ and standard deviation σ is denoted N(μ, σ). For example, if X is a variable whose distribution is normal with mean 30 and standard deviation 5, then we say that “X is N(30, 5).”

24 The Normal Distribution
If X is N(30, 5), then the distribution of X looks like this: [Figure: normal curve centered at 30, with 15, 30, and 45 marked on the axis.]
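
A short sketch of what the N(30, 5) notation implies, again using `statistics.NormalDist`. Note that the second parameter here is the standard deviation, as in the lecture's notation; some textbooks write the variance in that position instead.

```python
from statistics import NormalDist

# X is N(30, 5): mean 30, standard deviation 5.
x = NormalDist(mu=30, sigma=5)

# Proportion of values between 15 and 45 (3 standard deviations each way).
within_3_sd = x.cdf(45) - x.cdf(15)
# Proportion of values between 25 and 35 (1 standard deviation each way).
within_1_sd = x.cdf(35) - x.cdf(25)
# Height of the density curve at its peak, which sits at the mean.
peak_height = x.pdf(30)

print(round(within_3_sd, 4))  # 0.9973
print(round(within_1_sd, 4))  # 0.6827
print(round(peak_height, 4))  # 0.0798
```

The marks at 15 and 45 on the axis are μ – 3σ and μ + 3σ, so almost all of the area lies between them.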

25 Some Normal Distributions
[Figure: normal curves plotted on a horizontal axis marked from 1 to 8.]

29 Bag A vs. Bag B Suppose we have two bags, Bag A and Bag B.
Each bag contains millions of vouchers. In Bag A, the values of the vouchers have distribution N(50, 10). Normal with μ = $50 and σ = $10. In Bag B, the values of the vouchers have distribution N(80, 15). Normal with μ = $80 and σ = $15.

30 Bag A vs. Bag B [Figure: two density curves, H0: Bag A and H1: Bag B, on a horizontal axis from 30 to 110.]

31 Bag A vs. Bag B We are presented with one of the bags.
We select one voucher at random from that bag. [Figure: two density curves, H0: Bag A and H1: Bag B, on a horizontal axis from 30 to 110.]

34 Bag A vs. Bag B If its value is less than or equal to $65, then we will decide that it was from Bag A. [Figure: the two density curves, H0: Bag A and H1: Bag B, on a horizontal axis from 30 to 110 with the cutoff at 65; the Acceptance Region and Rejection Region are marked.]

35 Bag A vs. Bag B What is α? [Figure: the two density curves, H0: Bag A and H1: Bag B, with the cutoff at 65 and α marked.]

37 Bag A vs. Bag B What is β? [Figure: the two density curves, H0: Bag A and H1: Bag B, with the cutoff at 65 and β marked.]
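
Both error probabilities can be computed as areas under the two curves. This sketch assumes the standard definitions, which the transcript does not spell out: α is the probability of rejecting H0 (Bag A) when it is true, and β is the probability of accepting H0 when H1 (Bag B) is true. The variable names are illustrative.

```python
from statistics import NormalDist

bag_a = NormalDist(mu=50, sigma=10)  # H0: voucher values in Bag A
bag_b = NormalDist(mu=80, sigma=15)  # H1: voucher values in Bag B
cutoff = 65                          # decide "Bag A" if value <= 65

# alpha: area under the Bag A curve to the right of the cutoff
# (we decide "Bag B" although the voucher came from Bag A).
alpha = 1 - bag_a.cdf(cutoff)

# beta: area under the Bag B curve at or to the left of the cutoff
# (we decide "Bag A" although the voucher came from Bag B).
beta = bag_b.cdf(cutoff)

print(round(alpha, 4))  # 0.0668
print(round(beta, 4))   # 0.1587
```

The cutoff of 65 is 1.5 standard deviations above the mean of Bag A and 1 standard deviation below the mean of Bag B, which is why the two areas differ.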

39 Bag A vs. Bag B If the distributions are very close together (very similar), then α and β will be large. [Figure: the two density curves, H0: Bag A and H1: Bag B, on a horizontal axis from 30 to 110 with the cutoff at 65.]

42 Bag A vs. Bag B Similarly, if the distributions are far apart, then α and β will both be very small. [Figure: the two density curves, H0: Bag A and H1: Bag B, on a horizontal axis from 30 to 110 with the cutoff at 65.]
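
The effect of separation can be illustrated with two made-up alternatives. This is a sketch under stated assumptions: the means 55 and 110, the common standard deviation 10, and the cutoffs placed midway between the means are all hypothetical choices, not values from the lecture, and `error_rates` is an illustrative helper.

```python
from statistics import NormalDist

def error_rates(null, alt, cutoff):
    """Return (alpha, beta) for the rule: accept H0 if the value is <= cutoff."""
    alpha = 1 - null.cdf(cutoff)  # reject H0 although H0 is true
    beta = alt.cdf(cutoff)        # accept H0 although H1 is true
    return alpha, beta

null = NormalDist(50, 10)

# Hypothetical alternative close to H0, cutoff midway between the means.
close_alpha, close_beta = error_rates(null, NormalDist(55, 10), cutoff=52.5)

# Hypothetical alternative far from H0, cutoff midway between the means.
far_alpha, far_beta = error_rates(null, NormalDist(110, 10), cutoff=80)

print(round(close_alpha, 4), round(close_beta, 4))  # 0.4013 0.4013
print(round(far_alpha, 4), round(far_beta, 4))      # 0.0013 0.0013
```

With the curves nearly on top of each other, each error occurs about 40% of the time; with the curves far apart, each occurs about 0.1% of the time.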
