Published by Lesley Shepherd. Modified over 8 years ago.
1
ISE 390, Spring 2005
Chapter 1: Introduction
Probability and Statistics for Modern Engineering, Second Edition, Lawrence L. Lapin
2
Class Objectives
Develop an understanding of the terms used in the study of statistics.
Understand the difference between populations and samples.
Demonstrate the capability to collect, organize, present, analyze, and interpret numerical data.
Understand the statistical measures used to describe populations and samples, including:
  Location
    Central Tendency: Mean, Median, Mode
    Position: Percentiles, Fractiles, and Quartiles
  Variability: Range, Variance, Standard Deviation, Coefficient of Variation
3
Definitions
Statistic (Singular) – A single data point. Example: Boeing closed at $45.00.
Statistics (Plural) – An aggregate of data points. Example: The Dow was down 25 points today (an aggregate of 30 stock prices).
Statistics (Discipline) – The collection, organization, presentation, analysis, and interpretation of numerical data for the purpose of reaching a decision or communicating information in the face of uncertainty.
Statistical Population – The collection of ALL possible members of a specific group of interest.
Sample – A subset of the statistical population, used to represent the population and to infer information about it.
Quantitative Population – Observations are numerical values.
Qualitative Population – Observations are attributes described in terms of color, shape, sex, etc.
4
Types of Quantitative Data
Ratio Data – Addition, subtraction, multiplication, and division are all applicable. Examples include time and physical measurements such as minutes, height, and weight.
Interval Data – Only addition and subtraction are applicable. Temperature scales are examples.
Ordinal Data – No arithmetic operations apply, but the scale carries a relative ordering. Examples include hurricane and tornado ratings and the Beaufort wind scale.
Nominal Data – Numbers that serve as arbitrary codes with no relative meaning. Codes for offices or colors are examples.
5
Analysis of Data
Sample Times to Inspect Test Devices for Calibration
[Figures: Frequency Distribution, Histogram, Frequency Polygon, Frequency Curve; vertical axis: Number of Inspections]
6
Stem and Leaf Plot Advantage Over Other Displays of Data: No data is lost.
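The claim that no data is lost can be illustrated with a minimal sketch. It assumes one-decimal observations, uses the integer part as the stem and the tenths digit as the leaf, and borrows the ten sorted values from the ungrouped-percentile slide later in this deck. The function name `stem_and_leaf` is made up for illustration.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group one-decimal values by integer part (stem) and tenths digit (leaf)."""
    plot = defaultdict(list)
    for x in sorted(values):
        tenths = round(x * 10)           # integer tenths avoid float noise
        stem, leaf = divmod(tenths, 10)  # e.g. 57 -> stem 5, leaf 7
        plot[stem].append(leaf)
    return dict(plot)


data = [5.3, 5.4, 5.7, 6.0, 6.1, 6.1, 6.2, 6.4, 6.5, 6.6]
plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

# Every original value can be rebuilt from its stem and leaf.
rebuilt = [stem + leaf / 10 for stem, leaves in plot.items() for leaf in leaves]
```

Unlike a histogram, which keeps only class counts, each observation remains readable from the plot.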
7
Relative and Cumulative Frequency Distributions
Music preferences in 200 young adults aged 14 to 19

  Music type       Count   Percent
  Rap              100     50%
  Alternative      50      25%
  Rock and Roll    26      13%
  Country          20      10%
  Classical        4       2%

The pie chart quickly tells you that half of the students like rap best (50%), and the remaining students prefer alternative (25%), rock and roll (13%), country (10%) and classical (2%).
Tip! When drawing a pie chart, ensure that the segments are ordered by size (largest to smallest) and in a clockwise direction.
Source (or "Adapted from", if appropriate): Statistics Canada's Internet Site, full URL of source pages, and date of extraction. Statistics Canada information is used with the permission of Statistics Canada. Users are forbidden to copy the data and redisseminate them, in an original or modified form, for commercial purposes, without the expressed permission of Statistics Canada. Information on the availability of the wide range of data from Statistics Canada can be obtained from Statistics Canada's Regional Offices, its World Wide Web site at http://www.statcan.ca, and its toll-free access number 1-800-263-1136.
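As a rough sketch, the relative and cumulative frequency distributions for the music-preference counts on this slide can be computed as follows. The variable names are illustrative only.

```python
from itertools import accumulate

# Counts taken from the slide (200 respondents in total).
counts = {
    "Rap": 100,
    "Alternative": 50,
    "Rock and Roll": 26,
    "Country": 20,
    "Classical": 4,
}

total = sum(counts.values())

# Relative frequency: each count divided by the total.
relative = {name: count / total for name, count in counts.items()}

# Cumulative frequency: running total of the counts, in the listed order.
cumulative = list(accumulate(counts.values()))

# Cumulative relative frequency: running total as a share of all observations.
cumulative_relative = [c / total for c in cumulative]

for name, cum in zip(counts, cumulative):
    print(f"{name:15s} {counts[name]:4d} {relative[name]:6.0%} {cum:4d}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the tally.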
8
Ogive for Cumulative Frequency Distribution
9
Normal Probability Distributions
[Figures: Probability Density Function and Cumulative Probability Density Function; horizontal axis x, with points a and b marked]
11
SPSS Scatter Diagram for Concrete Strength vs. Pulse Velocity
12
Scatter Diagram
13
Scatter Diagrams
14
Statistical Measures
Population descriptors are called Parameters.
Sample descriptors are called Sample Statistics.
  Location
    Central Tendency: Mean, Median, Mode
    Position: Percentiles, Fractiles, and Quartiles
  Variability: Range, Variance, Standard Deviation, Coefficient of Variation
15
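Central Tendency – Mean
Population Mean ($\mu$, the lower case Greek letter mu): $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$, where N = number of members in the population
Sample Mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$, where n = number of members in the sample
Sample Mean Approximated from Grouped Data: $\bar{X} \approx \frac{\sum_k f_k X_k}{n}$, where $X_k$ is the midpoint of class interval k and $f_k$ is its frequency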
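A minimal sketch of the sample mean (sum of the observations divided by n) and of its grouped-data approximation (sum of frequency times class midpoint, divided by n). The ten values are the ones listed on the ungrouped-percentile slide later in this deck. The class intervals of width 0.5 starting at 5.0 are an assumption made only for this illustration.

```python
data = [5.3, 5.4, 5.7, 6.0, 6.1, 6.1, 6.2, 6.4, 6.5, 6.6]
n = len(data)

# Sample mean from the raw observations.
sample_mean = sum(data) / n

# Group into assumed classes [5.0, 5.5), [5.5, 6.0), [6.0, 6.5), [6.5, 7.0).
# Work in integer tenths so class boundaries are not affected by float noise.
START_TENTHS, WIDTH_TENTHS = 50, 5
freq_by_midpoint = {}
for x in data:
    idx = (round(x * 10) - START_TENTHS) // WIDTH_TENTHS
    midpoint = (START_TENTHS + WIDTH_TENTHS * idx + WIDTH_TENTHS / 2) / 10
    freq_by_midpoint[midpoint] = freq_by_midpoint.get(midpoint, 0) + 1

# Grouped approximation: every observation is treated as its class midpoint.
grouped_mean = sum(f * m for m, f in freq_by_midpoint.items()) / n

print(sample_mean, grouped_mean)
```

The two results differ slightly because grouping discards the exact position of each observation inside its class.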
16
Central Tendency – Median and Mode
Median – The value above and below which an equal number of observations lie. Denoted by m.
  584 633 663 693 755
  The Median is 663.
When there is an even number of observations, the Median is the average of the middle two.
  584 633 640 680 693 755
  The Median is (640 + 680)/2 = 660.
Mode – The most frequently occurring value.
  0 0 3 7 1 0 2 10 27 15 0 0 1 2
  0 is the Mode.
Where the data is taken from a continuum, the Mode is taken to be the midpoint of the class interval with the highest frequency.
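The even-count median and the mode from this slide can be checked with Python's standard `statistics` module. This is a small sketch rather than part of the original deck.

```python
import statistics

# Even number of observations: the median is the mean of the middle two.
even_sample = [584, 633, 640, 680, 693, 755]
median_even = statistics.median(even_sample)

# Mode: the most frequently occurring value.
counts_sample = [0, 0, 3, 7, 1, 0, 2, 10, 27, 15, 0, 0, 1, 2]
mode_value = statistics.mode(counts_sample)

print(median_even, mode_value)
```

`statistics.median` sorts the data itself, so the input does not need to be ordered first.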
17
Frequency Distribution Forms and Summary Measures
[Figures: Positional Comparisons of the Three Measures of Central Tendency for Symmetrical and Skewed Distributions; Bimodal Frequency Distributions]
18
Using Stem and Leaf Plot to Find Median and Mode
Median – With 100 observations, the median lies between the 50th and 51st observations. Adding the leaf counts of the first three stems, 2 + 16 + 29 = 47, we know that the Median will be on the stem for 14. Counting in on the 14 stem, we find that the 50th and 51st observations are both 14.2. The Median is (14.2 + 14.2)/2 = 14.2.
Mode – Look for the leaf (the decimal digit) that occurs most frequently on one stem. In this data, that is the 2 on the 14 stem. The Mode is 14.2.
19
Finding Percentiles from Ungrouped Data
  Position   1    2    3    4    5    6    7    8    9    10
  Value      5.3  5.4  5.7  6.0  6.1  6.1  6.2  6.4  6.5  6.6
1. Sort the data in ascending order.
2. Establish the allowable decimal range: .1 < d < .9 (d = decimal equivalent of the desired percentile; n = sample size).
3. Find the relative position of the desired percentile, (n + 1)d. For d = .1: (10 + 1)(.1) = 1.1. Let k be the largest integer such that k < (n + 1)d. Here 1 < 1.1, so k = 1, and the desired percentile lies between $X_k$ and $X_{k+1}$, that is, between $X_1$ and $X_2$.
4. Compute the percentile value: $Q_d = X_k + [(n + 1)d - k](X_{k+1} - X_k)$
  Q.1: k < (11)(.1) = 1.1, so k = 1. Q.1 = 5.3 + [(11)(.1) – 1](5.4 – 5.3) = 5.3 + (.1)(.1) = 5.31
  Q.25: k < (11)(.25) = 2.75, so k = 2. Q.25 = 5.4 + [(11)(.25) – 2](5.7 – 5.4) = 5.4 + (.75)(.3) = 5.4 + .225 = 5.625, reported as 5.63
  Q.50: k < (11)(.5) = 5.5, so k = 5. Q.50 = 6.1 + [(11)(.5) – 5](6.1 – 6.1) = 6.1 + (.5)(0) = 6.10
  Q.75: k < (11)(.75) = 8.25, so k = 8. Q.75 = 6.4 + [(11)(.75) – 8](6.5 – 6.4) = 6.4 + (.25)(.1) = 6.4 + .025 = 6.425, reported as 6.43
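The four steps on this slide can be sketched as a short function that locates position (n + 1)d and interpolates linearly between the two neighboring sorted values. The function name `percentile_n_plus_1` is made up for illustration. Taking k as the floor of (n + 1)d is an assumption, and it gives the same result as the slide's strict inequality because the interpolation fraction is zero whenever the position is a whole number.

```python
import math


def percentile_n_plus_1(data, d):
    """Percentile by the (n + 1)d position rule with linear interpolation."""
    xs = sorted(data)                 # step 1: ascending order
    n = len(xs)
    position = (n + 1) * d            # step 3: relative position
    k = math.floor(position)
    if k < 1 or k >= n:
        raise ValueError("d is outside the range this method can interpolate")
    lower, upper = xs[k - 1], xs[k]   # X_k and X_(k+1), 1-based on the slide
    return lower + (position - k) * (upper - lower)   # step 4


data = [5.3, 5.4, 5.7, 6.0, 6.1, 6.1, 6.2, 6.4, 6.5, 6.6]
for d in (0.10, 0.25, 0.50, 0.75):
    print(d, round(percentile_n_plus_1(data, d), 3))
```

The unrounded quartiles are 5.625 and 6.425; the slide reports them as 5.63 and 6.43.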
20
Percentiles, Fractiles, and Quartiles
Percentile – The point below which the stated percentage lies. Example: 17 of 20 observations fall below 23. What percentile is 23? 17/20 = .85, so 23 is the 85th percentile.
Fractile – The point below which the stated fraction lies. Example: 17 of 20 observations fall below 23. What fractile is 23? 17/20 = .85, so 23 is the .85-fractile.
Quartile – Divides data into four groups of equal frequency.
  First Quartile – Same as the 25th percentile (.25-fractile).
  Second Quartile – Same as the 50th percentile (.50-fractile) and also the Median.
  Third Quartile – Same as the 75th percentile (.75-fractile).
21
Finding Percentiles from Ungrouped Data
[Figure: data display divided into First Quartile, Second Quartile, Third Quartile, and Fourth Quartile]
  (100 + 1)(.25) = 25.25, so Q.25 = 13.0 + (25.25 – 25.0)(13.1 – 13.0) = 13.025
  (100 + 1)(.5) = 50.5, so Q.5 = 14.2 + (50.5 – 50.0)(14.2 – 14.2) = 14.2
  (100 + 1)(.75) = 75.75, so Q.75 = 15.0 + (75.75 – 75.0)(15.2 – 15.0) = 15.15
22
Finding Percentiles Graphically
[Figure: Q.50 = 14.2 read from the graph]
23
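Finding Percentiles from Grouped Data
(Raw data unavailable, but grouped into a frequency distribution)
$Q_d = X_k + [(n + 1)d - k] \left( \frac{X_{k+1} - X_k}{h - k} \right)$
As the example shows, k is the greatest cumulative frequency not exceeding (n + 1)d, h is the cumulative frequency of the next interval, and $X_k$ and $X_{k+1}$ are the lower and upper limits of that next interval.
Example: (73 + 1)(.25) = 18.5. The greatest cumulative frequency not exceeding 18.5 is 7 for the first interval, so k = 7. Q.25 will fall in the second interval (100.0 to 105.0), and its cumulative frequency is h = 30.
$Q_{.25} = 100.0 + (18.5 - 7) \left( \frac{105.0 - 100.0}{30 - 7} \right) = 102.5$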
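A minimal sketch of the grouped-data interpolation, using only the numbers given on this slide (n = 73, cumulative frequency 7 before the interval, 30 through it, interval limits 100.0 and 105.0). The function name and parameter names are hypothetical, and the caller is assumed to have already identified the interval that contains position (n + 1)d.

```python
def grouped_percentile(n, d, lower, upper, cum_before, cum_through):
    """Interpolate a percentile inside one class interval of grouped data.

    lower, upper  -- class limits of the interval containing the percentile
    cum_before    -- cumulative frequency up to the preceding interval (k)
    cum_through   -- cumulative frequency through this interval (h)
    """
    position = (n + 1) * d
    if not cum_before <= position <= cum_through:
        raise ValueError("position (n + 1)d does not fall in this interval")
    # Share of this interval's observations lying below the position.
    share = (position - cum_before) / (cum_through - cum_before)
    return lower + share * (upper - lower)


q25 = grouped_percentile(n=73, d=0.25, lower=100.0, upper=105.0,
                         cum_before=7, cum_through=30)
print(q25)
```

The interpolation treats the observations as evenly spread across the interval, which is why the result is only an approximation of the raw-data percentile.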
24
Variability* *The term “Dispersion” is also used.
25
Variability – Interquartile Range
(Difference between the third quartile, Q.75, and the first quartile, Q.25)
  Position   1    2    3    4    5    6    7    8    9    10
  Value      5.3  5.4  5.7  6.0  6.1  6.1  6.2  6.4  6.5  6.6
From the ungrouped-data percentile calculation: Q.25 = 5.63, Q.50 = 6.10, Q.75 = 6.43
Interquartile Range = 6.43 – 5.63 = 0.80
[Figure: Boxplot showing the Range and the Interquartile Range]
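The range and interquartile range for these ten values can be checked with Python's standard library. The sketch relies on `statistics.quantiles` with `method="exclusive"`, which places quartile i at position i(n + 1)/4 and interpolates linearly, so it should agree with the (n + 1)d rule used in this deck.

```python
import statistics

data = [5.3, 5.4, 5.7, 6.0, 6.1, 6.1, 6.2, 6.4, 6.5, 6.6]

# Three cut points that divide the data into four groups of equal frequency.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

iqr = q3 - q1                        # interquartile range
data_range = max(data) - min(data)   # range

print(q1, q2, q3, iqr, data_range)
```

The unrounded quartiles are 5.625 and 6.425, so the interquartile range is 0.80 whether or not the quartiles are rounded first.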
26
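Variance and Standard Deviation
(Measure the deviation of individual observations around the mean.)
Population Variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$ ($\sigma$ is the lower case Greek letter sigma)
Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$
Sample Variance: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Sample Standard Deviation: $s = \sqrt{s^2}$
Calculation by Hand: $s^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n - 1}$
Approximated from Grouped Data: $s^2 \approx \frac{\sum_k f_k X_k^2 - n\bar{X}^2}{n - 1}$, where $X_k$ is the midpoint of class interval k and $f_k$ is its frequency
The sum of the deviations of the samples about the sample mean = 0.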
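A short sketch of the sample variance and standard deviation. It assumes the usual n - 1 divisor for a sample and the common hand-calculation shortcut (sum of squares minus n times the squared mean, divided by n - 1), and it reuses the ten values from the percentile slides purely as example input.

```python
import math
import statistics

data = [5.3, 5.4, 5.7, 6.0, 6.1, 6.1, 6.2, 6.4, 6.5, 6.6]
n = len(data)
mean = sum(data) / n

# Deviations about the sample mean always sum to zero (up to rounding).
deviations = [x - mean for x in data]
sum_of_deviations = sum(deviations)

# Sample variance from the definition.
sample_var = sum(dev ** 2 for dev in deviations) / (n - 1)

# Shortcut form often used for hand calculation.
shortcut_var = (sum(x ** 2 for x in data) - n * mean ** 2) / (n - 1)

sample_sd = math.sqrt(sample_var)

# Treating the same values as a whole population divides by N instead.
population_var = statistics.pvariance(data)

print(sample_var, sample_sd, population_var)
```

Because the deviations cancel, their plain sum carries no information about spread, which is why the variance squares them first.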
27
Composite Summary Measures
Coefficient of Variation
  Population: $\sigma / \mu$
  Sample: $s / \bar{X}$
  [Figure: Which has the greatest Coefficient of Variation?]
Coefficient of Skewness – Provides the direction and degree of skewness of a frequency distribution.
  $SK = \frac{3(\bar{X} - m)}{s}$, where m = Median
  SK > 0, Positively Skewed
  SK < 0, Negatively Skewed
  SK = 0, No Skew
Kurtosis – Measures the "peakedness" of a distribution.
  Mesokurtic – Meso means intermediate.
  Leptokurtic – Lepto means slender.
  Platykurtic – Platy means flat.
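A minimal sketch of the two composite measures for a sample. The coefficient of variation is taken as s divided by the sample mean. The skewness formula is an assumption: the "m = Median" note on this slide suggests Pearson's median-based coefficient, 3(mean - median)/s, and that is what the code uses. The data are the ten values from the percentile slides, reused only as example input.

```python
import statistics

data = [5.3, 5.4, 5.7, 6.0, 6.1, 6.1, 6.2, 6.4, 6.5, 6.6]

mean = statistics.mean(data)
median = statistics.median(data)
s = statistics.stdev(data)        # sample standard deviation (n - 1 divisor)

# Coefficient of variation: spread relative to the size of the mean.
cv = s / mean

# Assumed formula: Pearson's median-based coefficient of skewness.
sk = 3 * (mean - median) / s

if sk > 0:
    direction = "positively skewed"
elif sk < 0:
    direction = "negatively skewed"
else:
    direction = "no skew"

print(cv, sk, direction)
```

Here the mean lies below the median, so the sign of SK comes out negative, matching the slide's rule for a negatively skewed distribution.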
28
Proportion Population Proportion Sample Proportion