1 Picturing Distributions with Graphs Stat 1510 Statistical Thinking & Concepts.

1 1 Picturing Distributions with Graphs Stat 1510 Statistical Thinking & Concepts

2 2 Statistics Statistics is a science that involves the extraction of information from numerical data obtained during an experiment or from a sample. It involves the design of the experiment or sampling procedure, the collection and analysis of the data, and making inferences (statements) about the population based upon information in a sample.

3 3 Individuals and Variables
• Individuals
  – the objects described by a set of data
  – may be people, animals, or things
• Variable
  – any characteristic of an individual
  – can take different values for different individuals
Examples: Temperature, Pressure, Weight, Height, Sex, Major Course, etc.

4 4 Variables
• Categorical
  – places an individual into one of several groups or categories
  Examples: Sex, Grade (A, B, C, ...), Type of Defects, Status of application
• Quantitative (Numerical)
  – takes numerical values for which arithmetic operations such as adding and averaging make sense
  Examples: Height, Weight, Pressure, Number of Defects, etc.

5 5 Case Study The Effect of Hypnosis on the Immune System reported in Science News, Sept. 4, 1993, p. 153

6 6 Case Study Weight Gain Spells Heart Risk for Women “Weight, weight change, and coronary heart disease in women.” W.C. Willett, et al., Journal of the American Medical Association, vol. 273(6), Feb. 8, 1995. (Reported in Science News, Feb. 4, 1995, p. 108)

7 7 Case Study Weight Gain Spells Heart Risk for Women Objective: To recommend a range of body mass index (a function of weight and height) in terms of coronary heart disease (CHD) risk in women.

8 8 Case Study
• Study started in 1976 with 115,818 women aged 30 to 55 years and without a history of previous CHD.
• Each woman’s weight (body mass) was determined.
• Each woman was asked her weight at age 18.

9 9 Case Study
• The cohort of women was followed for 14 years.
• The number of CHD cases (fatal and nonfatal) was counted (1292 cases).

10 10 Case Study: Variables measured
• Age (in 1976): quantitative
• Weight in 1976: quantitative
• Weight at age 18: quantitative
• Incidence of coronary heart disease: categorical
• Smoker or nonsmoker: categorical
• Family history of heart disease: categorical

11 11 Study on Laptop
• The objective is to identify the type of laptop computers used by university students.
• A random sample of 1000 university students is selected for this study.
• Each student is asked whether s/he has a laptop and, if yes, the type of laptop (brand name).
• Variables?

12 12 Distribution
• Tells what values a variable takes and how often it takes these values
• Can be a table, graph, or function

13 13 Displaying Distributions
• Categorical variables
  – Pie charts
  – Bar graphs
• Quantitative variables
  – Histograms
  – Stemplots (stem-and-leaf plots)

14 14 Data Table: Class Make-up on First Day
Year        Count   Percent
Freshman       18     41.9%
Sophomore      10     23.3%
Junior          6     14.0%
Senior          9     20.9%
Total          43    100.1%
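The Percent column can be recomputed from the Count column. The following is a minimal Python sketch, assuming only the four counts shown in the table; the variable names are illustrative and not part of the original slides.

```python
# Counts taken from the data table on slide 14.
counts = {"Freshman": 18, "Sophomore": 10, "Junior": 6, "Senior": 9}

total = sum(counts.values())

# Percent of the class in each year, rounded to one decimal place.
percents = {year: round(100 * n / total, 1) for year, n in counts.items()}

for year, n in counts.items():
    print(f"{year:<10}{n:>4}{percents[year]:>7.1f}%")
print(f"{'Total':<10}{total:>4}{sum(percents.values()):>7.1f}%")
```

Because each percent is rounded separately, the rounded values add up to 100.1% rather than exactly 100%, which is why the Total row of the table shows 100.1%.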

15 15 Pie Chart Class Make-up on First Day

16 16 Class Make-up on First Day Bar Graph

17 17 Example: U.S. Solid Waste (2000) Data Table
Material                     Weight (million tons)   Percent of total
Food scraps                                   25.9             11.2 %
Glass                                         12.8              5.5 %
Metals                                        18.0              7.8 %
Paper, paperboard                             86.7             37.4 %
Plastics                                      24.7             10.7 %
Rubber, leather, textiles                     15.8              6.8 %
Wood                                          12.7              5.5 %
Yard trimmings                                27.7             11.9 %
Other                                          7.5              3.2 %
Total                                        231.9            100.0 %

18 18 Example: U.S. Solid Waste (2000) Pie Chart

19 19 Example: U.S. Solid Waste (2000) Bar Graph
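The bar graph itself is not reproduced in this transcript. As a rough stand-in, here is a small Python sketch that draws a text bar graph from the weights in the solid waste data table. The function name `bar_lengths`, the scale of one mark per 2 million tons, and the choice to sort bars from tallest to shortest are all assumptions made for illustration, not details taken from the slide.

```python
# Weights in million tons, from the U.S. solid waste (2000) data table.
waste = {
    "Food scraps": 25.9, "Glass": 12.8, "Metals": 18.0,
    "Paper, paperboard": 86.7, "Plastics": 24.7,
    "Rubber, leather, textiles": 15.8, "Wood": 12.7,
    "Yard trimmings": 27.7, "Other": 7.5,
}

def bar_lengths(data, tons_per_mark=2.0):
    """Return {label: number of '#' marks}, one mark per tons_per_mark."""
    return {label: round(value / tons_per_mark) for label, value in data.items()}

lengths = bar_lengths(waste)

# Print one bar per category, tallest first (ordering is an illustrative choice).
for label, value in sorted(waste.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<26}{'#' * lengths[label]} {value}")
```

Unlike a pie chart, a bar graph does not require the categories to add up to a whole, so the same sketch would work for any set of category totals.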

20 20 Time Plots
• A time plot shows behavior over time.
• Time is always on the horizontal axis, and the variable being measured is on the vertical axis.
• Look for an overall pattern (trend), and deviations from this trend. Connecting the data points by lines may emphasize this trend.
• Look for patterns that repeat at known regular intervals (seasonal variations).

21 21 Class Make-up on First Day (Fall Semesters: 1985-1993)

22 22 Average Tuition (Public vs. Private)

23 23 Examining the Distribution of Quantitative Data
• Observe overall pattern
• Deviations from overall pattern
• Shape of the data
• Center of the data
• Spread of the data (Variation)
• Outliers

24 24 Shape of the Data
• Symmetric
  – bell shaped
  – other symmetric shapes
• Asymmetric
  – right skewed
  – left skewed
• Unimodal, bimodal

25 25 Symmetric Bell-Shaped

26 26 Symmetric Mound-Shaped

27 27 Symmetric Uniform

28 28 Asymmetric Skewed to the Left

29 29 Asymmetric Skewed to the Right

30 30 Color Density of SONY TV

31 31 Outliers
• Extreme values that fall outside the overall pattern
  – May occur naturally
  – May occur due to error in recording
  – May occur due to error in measuring
  – Observational unit may be fundamentally different

32 32 Histograms
• For quantitative variables that take many values
• Divide the possible values into class intervals (we will only consider equal widths)
• Count how many observations fall in each interval (may change to percents)
• Draw picture representing distribution

33 33 Histograms: Class Intervals
• How many intervals?
  – One idea: square root of the sample size (round the value)
• Size of intervals?
  – Divide the range of the data (max − min) by the number of intervals desired, and round to a convenient number
• Pick intervals so each observation falls in exactly one interval (no overlap)

34 34 Usefulness of Histograms
• To know the central value of the group
• To know the extent of variation in the group
• To estimate the percentage of nonconformance, if specification limits are available
• To see whether nonconformance is due to a shift in the mean or to large variability

35 35 Case Study Weight Data Introductory Statistics class Spring, 1997 Virginia Commonwealth University

36 36 Weight Data

37 37 Weight Data: Frequency Table sqrt(53) ≈ 7.3, rounded up to 8 intervals; range (260 − 100 = 160) / 8 = 20 = class width
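The arithmetic on this slide can be checked with a short Python sketch. The function name `class_intervals` is hypothetical, and rounding the square root up (rather than to the nearest integer) is an assumption chosen to match the 8 intervals used here. The sketch does not attempt the final "round to a convenient number" step, which remains a judgment call.

```python
import math

def class_intervals(n, lowest, highest):
    """Suggest a number of equal-width classes and a class width.

    Assumption: the square root of n is rounded up, which matches the
    weight example (sqrt(53) is about 7.3, and 8 intervals are used).
    """
    k = math.ceil(math.sqrt(n))
    width = (highest - lowest) / k
    return k, width

# 53 students, lightest 100 pounds, heaviest 260 pounds.
k, width = class_intervals(53, 100, 260)
print(k, width)
```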

38 38 Weight Data: Histogram
[Figure: histogram of the weight data. Horizontal axis: Weight, marked from 100 to 280 in steps of 20. Vertical axis: Number of students.]
* Left endpoint is included in the group, right endpoint is not.
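Since the histogram image is not part of this transcript, the class counts can be recovered from the completed stemplot on slide 46. The Python sketch below decodes the stem-and-leaf rows back into the 53 weights and counts them in classes of width 20 starting at 100, with the left endpoint included and the right endpoint excluded. The variable names are illustrative, and the text rendering is only an approximation of the original figure.

```python
# Stem-and-leaf rows for the 53 weights (stems are tens, leaves are ones),
# copied from the completed stemplot on slide 46. Empty stems are omitted.
rows = {
    10: "0166", 11: "009", 12: "0034578", 13: "00359", 14: "08",
    15: "00257", 16: "555", 17: "000255", 18: "000055567", 19: "245",
    20: "3", 21: "025", 22: "0", 26: "0",
}
weights = [stem * 10 + int(leaf) for stem, leaves in rows.items() for leaf in leaves]

start, width = 100, 20
counts = {}
for w in weights:
    # Left endpoint included, right endpoint excluded: [left, left + width).
    left = start + ((w - start) // width) * width
    counts[left] = counts.get(left, 0) + 1

# One text bar per class, including classes with no observations.
for left in range(start, max(weights) + 1, width):
    n = counts.get(left, 0)
    print(f"{left}-<{left + width}: {'*' * n} ({n})")
```

Because the right endpoint is excluded, the heaviest student (260 pounds) falls in a ninth class, 260 to under 280. That appears to be why the horizontal axis runs to 280 even though the width calculation used 8 intervals.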

41 41 Histogram of Soft Drink Weight

42 42 Histogram of Soft Drink Weight

43 43 Stemplots (Stem-and-Leaf Plots)
• For quantitative variables
• Separate each observation into a stem (first part of the number) and a leaf (the remaining part of the number)
• Write the stems in a vertical column; draw a vertical line to the right of the stems
• Write each leaf in the row to the right of its stem; order leaves if desired

44 44 Weight Data

45 45 Weight Data: Stemplot (Stem & Leaf Plot)
Stems = 10’s, Leaves = 1’s; the stems run from 10 to 26.
Key: 20 | 3 means 203 pounds
First observations placed: 192 (stem 19, leaf 2), 152 (stem 15, leaf 2), 135 (stem 13, leaf 5)

46 46 Weight Data: Stemplot (Stem & Leaf Plot)
10 | 0166
11 | 009
12 | 0034578
13 | 00359
14 | 08
15 | 00257
16 | 555
17 | 000255
18 | 000055567
19 | 245
20 | 3
21 | 025
22 | 0
23 |
24 |
25 |
26 | 0
Key: 20 | 3 means 203 pounds
Stems = 10’s, Leaves = 1’s
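The construction steps from slide 43 can be sketched in Python as follows. The function name `stemplot` is hypothetical, and the list of weights is read back off the stemplot above rather than taken from the original data sheet, which is not included in this transcript.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows as strings: stem = tens, leaf = ones digit."""
    leaves = defaultdict(list)
    for v in sorted(values):          # sorting puts the leaves in order
        stem, leaf = divmod(v, 10)
        leaves[stem].append(str(leaf))
    # Include empty stems so gaps in the data stay visible.
    return [f"{stem} | {''.join(leaves[stem])}".rstrip()
            for stem in range(min(leaves), max(leaves) + 1)]

# The 53 weights (pounds), as read from the stemplot on this slide.
weights = [
    100, 101, 106, 106, 110, 110, 119, 120, 120, 123, 124, 125, 127, 128,
    130, 130, 133, 135, 139, 140, 148, 150, 150, 152, 155, 157, 165, 165,
    165, 170, 170, 170, 172, 175, 175, 180, 180, 180, 180, 185, 185, 185,
    186, 187, 192, 194, 195, 203, 210, 212, 215, 220, 260,
]

plot_rows = stemplot(weights)
print("\n".join(plot_rows))
```

Keeping the empty stems 23, 24 and 25 in the output is deliberate: the gap they leave is what makes the 260-pound observation stand out as a possible outlier.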

47 47 Extended Stem-and-Leaf Plots If there are very few stems (when the data cover only a very small range of values), then we may want to create more stems by splitting the original stems.

48 48 Extended Stem-and-Leaf Plots
Example: if all of the data values were between 150 and 179, then we may choose to use the following stems:
15
15
16
16
17
17
Leaves 0-4 would go on each upper stem (first “15”), and leaves 5-9 would go on each lower stem (second “15”).
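A minimal Python sketch of the split-stem idea is shown below. The function name `split_stemplot` is hypothetical, and the data are simply the weights between 150 and 179 from the weight example, used here for illustration only.

```python
def split_stemplot(values):
    """Stemplot in which every stem appears twice:
    the upper row holds leaves 0-4, the lower row holds leaves 5-9."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1   # 0 = upper row, 1 = lower row
        rows.setdefault((stem, half), []).append(str(leaf))
    lo, hi = min(values) // 10, max(values) // 10
    # Emit both rows for every stem, even when a row has no leaves.
    return [f"{stem} | {''.join(rows.get((stem, half), []))}".rstrip()
            for stem in range(lo, hi + 1) for half in (0, 1)]

# Weights between 150 and 179 from the weight example.
data = [150, 150, 152, 155, 157, 165, 165, 165, 170, 170, 170, 172, 175, 175]

split_rows = split_stemplot(data)
print("\n".join(split_rows))
```

With these values the upper "16" row comes out empty, since every weight in the 160s is 165.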

