Fundamental Sampling Distributions and Data Descriptions

Presentation transcript:

Fundamental Sampling Distributions and Data Descriptions ENGSTAT Notes of AM Fillone, De La Salle University-Manila

Population: the totality of observations with which we are concerned, whether their number is finite or infinite. Statisticians use the term for observations relevant to anything of interest, whether groups of people, animals, or all possible outcomes of some complicated biological or engineering system.

Definition 8.1: A population consists of the totality of the observations with which we are concerned.

Definition 8.2: A sample is a subset of a population.

Definition 8.3: Let $X_1, X_2, \ldots, X_n$ be $n$ independent random variables, each having the same probability distribution $f(x)$. Define $X_1, X_2, \ldots, X_n$ to be a random sample of size $n$ from the population $f(x)$ and write its joint probability distribution as

$$f(x_1, x_2, \ldots, x_n) = f(x_1)\, f(x_2) \cdots f(x_n).$$

Some Important Statistics

Definition 8.4: Any function of the random variables constituting a random sample is called a statistic.

Definition 8.5: If $X_1, X_2, \ldots, X_n$ represent a random sample of size $n$, then the sample mean is defined by the statistic

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$

Definition 8.6: If $X_1, X_2, \ldots, X_n$ represent a random sample of size $n$, then the sample variance is defined by the statistic

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2.$$

Theorem 8.1: If $S^2$ is the variance of a random sample of size $n$, we may write

$$S^2 = \frac{1}{n(n-1)}\left[\, n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2 \right].$$

Definition 8.7: The sample standard deviation, denoted by $S$, is the positive square root of the sample variance.
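Theorem 8.1 follows by expanding the square in the sample variance. The derivation below is a short sketch that assumes the usual definitions $\bar{X} = \frac{1}{n}\sum X_i$ and $S^2 = \frac{1}{n-1}\sum (X_i - \bar{X})^2$, with all sums running over $i = 1, \ldots, n$.

```latex
\begin{aligned}
(n-1)\,S^2 &= \sum (X_i - \bar{X})^2 \\
           &= \sum X_i^2 - 2\bar{X}\sum X_i + n\bar{X}^2 \\
           &= \sum X_i^2 - n\bar{X}^2
            \qquad \text{since } \sum X_i = n\bar{X} \\
           &= \sum X_i^2 - \frac{1}{n}\Bigl(\sum X_i\Bigr)^2 ,
\end{aligned}
\qquad\Longrightarrow\qquad
S^2 = \frac{n\sum X_i^2 - \bigl(\sum X_i\bigr)^2}{n(n-1)} .
```

The shortcut form needs only the running totals $\sum X_i$ and $\sum X_i^2$, which is why it is convenient for hand calculation.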

Other Statistics

The sample median reflects the central tendency of the sample in such a way that it is uninfluenced by extreme values or outliers. Given that the observations in a sample are $x_1, x_2, \ldots, x_n$, arranged in increasing order of magnitude, the sample median is

$$\tilde{x} = \begin{cases} x_{(n+1)/2}, & \text{if } n \text{ is odd}, \\[4pt] \tfrac{1}{2}\left(x_{n/2} + x_{n/2+1}\right), & \text{if } n \text{ is even}. \end{cases}$$

Example: Mean, median, mode, and standard deviation

According to ecology writer Jacqueline Killeen, phosphates contained in household detergents pass right through our sewer systems, causing lakes to turn into swamps that eventually dry up into deserts. The following data show the amount of phosphates per load of laundry, in grams, for a random sample of various types of detergents used according to the prescribed directions:

Laundry Detergent     Phosphates per Load (grams)
A & P Blue Sail       48
Dash                  47
Concentrated All      42
Cold Water All        42
Breeze                41
Oxydol                34
Ajax                  31
Sears                 30
Fab                   29
Cold Power            29
Bold                  29
Rinso                 26

For the given phosphate data, find: (a) the mean; (b) the median; (c) the mode; and (d) the standard deviation.

Solution:

(a) Mean: $\bar{x} = 428/12 \approx 35.7$ grams.

(b) Arrange the data in increasing order: 26, 29, 29, 29, 30, 31, 34, 41, 42, 42, 47, 48. With $n = 12$ (even), the median is the average of the 6th and 7th values: $\tilde{x} = \tfrac{1}{2}(31 + 34) = 32.5$ grams.

(c) Mode = 29 grams.

(d) Standard deviation: $s = \sqrt{\dfrac{(12)(15{,}938) - (428)^2}{(12)(11)}} = \sqrt{61.15} \approx 7.82$ grams.
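The four summaries can be checked with Python's standard `statistics` module. This is only a sketch: the variable names are illustrative, and the data are the twelve sorted values listed in part (b).

```python
import statistics

# Phosphates per load (grams), as listed in increasing order in part (b).
phosphates = [26, 29, 29, 29, 30, 31, 34, 41, 42, 42, 47, 48]

mean = statistics.mean(phosphates)       # sum of values divided by n
median = statistics.median(phosphates)   # n is even: average of the two middle values
mode = statistics.mode(phosphates)       # most frequent value
stdev = statistics.stdev(phosphates)     # sample standard deviation (divisor n - 1)

# Cross-check the variance with the shortcut form of Theorem 8.1.
n = len(phosphates)
sum_x = sum(phosphates)
sum_x2 = sum(x * x for x in phosphates)
s2_shortcut = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

print(f"mean   = {mean:.2f} g")
print(f"median = {median} g")
print(f"mode   = {mode} g")
print(f"stdev  = {stdev:.2f} g (shortcut variance = {s2_shortcut:.2f})")
```

`statistics.stdev` divides by n - 1, matching Definition 8.6; `statistics.pstdev` would divide by n and give a slightly smaller value.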

Data Displays and Graphical Methods

Box-and-Whisker Plot or Box Plot

This plot encloses the interquartile range of the data in a box that has the median displayed within. The interquartile range has as its extremes the 75th percentile (upper quartile) and the 25th percentile (lower quartile). "Whiskers" extend from the box to show the extreme observations in the sample. A variation called a box plot can give the viewer information about which observations may be outliers. Outliers are observations that are considered to be unusually far from the bulk of the data.
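The numbers behind a box plot can be sketched in a few lines of Python. Two things here are assumptions rather than part of the notes: the quartile interpolation rule (`method="inclusive"`; textbooks differ slightly) and the common convention of flagging points more than 1.5 interquartile ranges beyond the box as possible outliers. The data are the phosphate values from the earlier example.

```python
import statistics

# Phosphate data (grams) from the detergent example, sorted.
data = sorted([26, 29, 29, 29, 30, 31, 34, 41, 42, 42, 47, 48])

# Lower quartile, median, upper quartile (interpolation rule is an assumption).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: fences at 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Whiskers reach the most extreme observations that are not flagged.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print(f"box: {q1} to {q3}, median {q2}, IQR {iqr}")
print(f"whiskers: {whisker_low} to {whisker_high}")
print(f"possible outliers: {outliers}")
```

For these data no value falls outside the fences, so the whiskers simply run to the minimum and maximum.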

Example: Consider the data in Table 8.1 about the nicotine content in a random sample of 40 cigarettes. Develop a box-and-whisker plot of the data.

Example: Constructing a Stem-and-Leaf Plot

Consider the data of Table 1.4, which specifies the "life" of 40 similar car batteries recorded to the nearest tenth of a year. The batteries are guaranteed to last 3 years.

Table 1.4: Car Battery Life
2.2  4.1  3.5  4.5  3.2  3.7  3.0  2.6
3.4  1.6  3.1  3.3  3.8  3.1  4.7  3.7
2.5  4.3  3.4  3.6  2.9  3.3  3.9  3.1
3.3  3.1  3.7  4.4  3.2  4.1  1.9  3.4
4.7  3.8  3.2  2.6  3.9  3.0  4.2  3.5

Process: Split each observation into two parts, a stem and a leaf, such that the stem represents the digit preceding the decimal point and the leaf corresponds to the decimal part of the number. For example, for the number 3.7, the digit 3 is designated the stem and the digit 7 is the leaf. The four stems 1, 2, 3, and 4 are listed vertically on the left side in Table 1.5; the leaves are recorded on the right side opposite the appropriate stem value.

Table 1.5: Stem-and-Leaf Plot
Stem   Leaf                         Frequency
1      69                           2
2      25669                        5
3      0011112223334445567778899    25
4      11234577                     8

Stem-and-Leaf Plot

The stem-and-leaf plot of Table 1.5 contains only four stems and consequently does not provide an adequate picture of the distribution. To remedy the problem, the number of stems could be increased. One way to accomplish this is to write each stem value twice, recording the leaves 0, 1, 2, 3, and 4 opposite the stem value where it appears for the first time (marked *), and the leaves 5, 6, 7, 8, and 9 opposite the same stem value where it appears for the second time.

Table 1.6: Double-Stem-and-Leaf Plot of Battery Life
Stem   Leaf               Frequency
1      69                 2
2*     2                  1
2      5669               4
3*     001111222333444    15
3      5567778899         10
4*     11234              5
4      577                3
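The splitting procedure can be sketched in Python as follows. The function name `stem_and_leaf` and its `double` flag are illustrative, and the 40 battery lives are listed in sorted order as read back from the stems and leaves of Table 1.5.

```python
from collections import defaultdict

# Battery lives (years), read back from the stems and leaves of Table 1.5.
battery_life = [
    1.6, 1.9,
    2.2, 2.5, 2.6, 2.6, 2.9,
    3.0, 3.0, 3.1, 3.1, 3.1, 3.1, 3.2, 3.2, 3.2, 3.3, 3.3, 3.3,
    3.4, 3.4, 3.4, 3.5, 3.5, 3.6, 3.7, 3.7, 3.7, 3.8, 3.8, 3.9, 3.9,
    4.1, 4.1, 4.2, 4.3, 4.4, 4.5, 4.7, 4.7,
]

def stem_and_leaf(values, double=False):
    """Return rows of (stem label, leaves, frequency).

    With double=True each stem is written twice: leaves 0-4 go on the
    starred row, leaves 5-9 on the unstarred row.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)            # integer tenths avoid float surprises
        stem, leaf = divmod(tenths, 10)
        half = (0 if leaf <= 4 else 1) if double else 0
        rows[(stem, half)].append(str(leaf))
    table = []
    for (stem, half), leaves in sorted(rows.items()):
        label = f"{stem}*" if double and half == 0 else str(stem)
        table.append((label, "".join(leaves), len(leaves)))
    return table

single = stem_and_leaf(battery_life)
doubled = stem_and_leaf(battery_life, double=True)

for label, leaves, freq in doubled:
    print(f"{label:<3} {leaves:<16} {freq}")
```

Working in integer tenths keeps a value such as 2.9 from being misplaced by floating-point rounding when it is split into stem and leaf.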

Frequency Distribution

In a frequency distribution the data are grouped into classes or intervals. It can be constructed by counting the leaves belonging to each stem and noting that each stem defines a class interval. A table listing relative frequencies is called a relative frequency distribution. The relative frequency distribution of battery life is given in Table 1.7 below.

Table 1.7: Relative Frequency Distribution of Battery Life
Class Interval   Class Midpoint   Frequency, f   Relative Frequency
1.5-1.9          1.7              2              0.050
2.0-2.4          2.2              1              0.025
2.5-2.9          2.7              4              0.100
3.0-3.4          3.2              15             0.375
3.5-3.9          3.7              10             0.250
4.0-4.4          4.2              5              0.125
4.5-4.9          4.7              3              0.075
Total                             40             1.000

Figure 1.6: Relative frequency histogram
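A table like Table 1.7 can be rebuilt by counting observations per class. The sketch below assumes the class layout of the table (seven classes of width 0.5 year starting at 1.5) and recovers the 40 observations from the stems and leaves of Table 1.5; the variable names are illustrative.

```python
# Battery lives in tenths of a year, rebuilt from the stems and leaves of Table 1.5.
leaves_by_stem = {1: "69", 2: "25669", 3: "0011112223334445567778899", 4: "11234577"}
tenths = [10 * stem + int(leaf)
          for stem, leaves in leaves_by_stem.items()
          for leaf in leaves]

start = 15        # lower limit of the first class (1.5 years), in tenths
width = 5         # class width (0.5 year), in tenths
n_classes = 7

# Count how many observations fall in each class interval.
counts = [0] * n_classes
for t in tenths:
    counts[(t - start) // width] += 1

n = len(tenths)
rows = []
for k, f in enumerate(counts):
    lo = start + k * width
    hi = lo + width - 1
    midpoint = (lo + hi) / 2 / 10
    rows.append((f"{lo / 10:.1f}-{hi / 10:.1f}", midpoint, f, f / n))

for interval, midpoint, f, rel in rows:
    print(f"{interval}  {midpoint:.1f}  {f:>2}  {rel:.3f}")
```

The relative frequencies are the class counts divided by n = 40, so they sum to 1.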

Quantile Plot

Definition 8.8: A quantile of a sample, $q(f)$, is a value for which a specified fraction $f$ of the data values is less than or equal to $q(f)$.

Detection of Deviations from Normality: Normal Quantile-Quantile Plot

Definition 8.9: The normal quantile-quantile plot is a plot of $y_{(i)}$ (the ordered observations) against $q_{0,1}(f_i)$, where $f_i = \dfrac{i - 3/8}{n + 1/4}$, and where a good approximation of the quantile for the $N(0,1)$ random variable is

$$q_{0,1}(f) = 4.91\left[f^{0.14} - (1 - f)^{0.14}\right].$$
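The coordinates of a normal quantile-quantile plot can be computed directly from Definition 8.9. This sketch assumes the approximation $q_{0,1}(f) = 4.91[f^{0.14} - (1-f)^{0.14}]$ for the standard normal quantile and compares it with the exact inverse CDF from Python's `statistics.NormalDist`. The function names are illustrative, and the sample is the phosphate data from the earlier example, used only for demonstration.

```python
from statistics import NormalDist

def plotting_fraction(i, n):
    """f_i = (i - 3/8) / (n + 1/4) for the i-th ordered observation, i = 1..n."""
    return (i - 3 / 8) / (n + 1 / 4)

def approx_normal_quantile(f):
    """Approximate N(0,1) quantile: 4.91 * [f**0.14 - (1 - f)**0.14]."""
    return 4.91 * (f ** 0.14 - (1 - f) ** 0.14)

def normal_qq_points(sample):
    """Pairs (q_{0,1}(f_i), y_(i)) to plot; near-normal data lie close to a line."""
    ordered = sorted(sample)
    n = len(ordered)
    return [(approx_normal_quantile(plotting_fraction(i, n)), y)
            for i, y in enumerate(ordered, start=1)]

sample = [26, 29, 29, 29, 30, 31, 34, 41, 42, 42, 47, 48]
points = normal_qq_points(sample)
for q, y in points:
    print(f"{q:7.3f}  {y}")

# How close is the approximation to the exact standard normal quantile?
exact = NormalDist()
check_fractions = [0.05, 0.25, 0.5, 0.75, 0.95]
max_error = max(abs(approx_normal_quantile(f) - exact.inv_cdf(f))
                for f in check_fractions)
print(f"largest error at the checked fractions: {max_error:.4f}")
```

Plotting the pairs with any charting tool gives the quantile-quantile plot; marked curvature suggests a departure from normality.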

Sampling Distributions

Definition 8.10: The probability distribution of a statistic is called a sampling distribution.

Sampling Distribution of $\bar{X}$

Theorem 8.2 (Central Limit Theorem): If $\bar{X}$ is the mean of a random sample of size $n$ taken from a population with mean $\mu$ and finite variance $\sigma^2$, then the limiting form of the distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}},$$

as $n \to \infty$, is the standard normal distribution $n(z; 0, 1)$.

Sampling Distribution of the Difference between Two Averages

Theorem 8.3: If independent samples of size $n_1$ and $n_2$ are drawn at random from two populations, discrete or continuous, with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then the sampling distribution of the difference of means, $\bar{X}_1 - \bar{X}_2$, is approximately normal with mean and variance given by

$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2 \quad \text{and} \quad \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$
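The Central Limit Theorem can be illustrated by simulation. The sketch below is an assumption-laden demonstration, not part of the notes: it draws repeated samples from an exponential population (mean 1, variance 1, clearly non-normal), standardizes each sample mean as $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$, and checks that the resulting values behave roughly like a standard normal variable. The sample size, number of repetitions, and seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)

n = 30              # size of each sample
reps = 5000         # number of repeated samples
mu, sigma = 1.0, 1.0  # mean and standard deviation of the exponential(1) population

z_values = []
for _ in range(reps):
    sample = [random.expovariate(1.0) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    z_values.append((x_bar - mu) / (sigma / n ** 0.5))

z_mean = statistics.fmean(z_values)        # should be near 0
z_var = statistics.variance(z_values)      # should be near 1
share_within_196 = sum(abs(z) <= 1.96 for z in z_values) / reps  # near 0.95

print(f"mean of Z      = {z_mean:.3f}")
print(f"variance of Z  = {z_var:.3f}")
print(f"P(|Z| <= 1.96) = {share_within_196:.3f}")
```

Increasing `n` pulls the simulated distribution of Z closer to the standard normal; with a strongly skewed population such as this one, small samples still show visible skewness.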

Hence,

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

is approximately a standard normal variable.

Sampling Distribution of $S^2$

Theorem 8.4: If $S^2$ is the variance of a random sample of size $n$ taken from a normal population having the variance $\sigma^2$, then the statistic

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared distribution with $\nu = n - 1$ degrees of freedom.
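Theorem 8.4 can be checked numerically. The sketch below simulates the statistic $(n-1)S^2/\sigma^2$ for samples from a normal population and compares its mean and variance with those of a chi-squared variable, which are $\nu$ and $2\nu$ (a standard fact not stated in the notes). The population parameters, sample size, and seed are hypothetical choices.

```python
import random
import statistics

random.seed(1)

n = 10                   # sample size, so nu = n - 1 = 9
mu, sigma = 50.0, 4.0    # hypothetical normal population
reps = 5000

chi2_values = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)               # sample variance, divisor n - 1
    chi2_values.append((n - 1) * s2 / sigma ** 2)

nu = n - 1
sim_mean = statistics.fmean(chi2_values)      # chi-squared mean is nu
sim_var = statistics.variance(chi2_values)    # chi-squared variance is 2 * nu

print(f"simulated mean     = {sim_mean:.2f} (theory {nu})")
print(f"simulated variance = {sim_var:.2f} (theory {2 * nu})")
```

The simulated mean sits near n - 1 rather than n, which is the loss of one degree of freedom from estimating the population mean by the sample mean.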

Degrees of Freedom

When $\mu$ is not known and one considers the distribution of

$$\sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2},$$

there is 1 less degree of freedom; that is, a degree of freedom is lost in the estimation of $\mu$ (i.e., when $\mu$ is replaced by $\bar{x}$). In other words, there are $n$ degrees of freedom, or independent pieces of information, in the random sample from the normal distribution. When the data (the values in the sample) are used to compute the mean, there is 1 less degree of freedom in the information used to estimate $\sigma^2$.

Examples: Chi-squared Distribution

Ex. For the chi-squared distribution, find:

1. $\chi^2_{0.025}$ when $\nu = 15$. Answer: 27.488 (Table A.5)
2. $\chi^2_{0.01}$ when $\nu = 7$. Answer: 18.475
3. $\chi^2_{0.05}$ when $\nu = 24$. Answer: 36.415

t-Distribution

Theorem 8.5: Let $Z$ be a standard normal random variable and $V$ a chi-squared random variable with $\nu$ degrees of freedom. If $Z$ and $V$ are independent, then the distribution of the random variable $T$, where

$$T = \frac{Z}{\sqrt{V/\nu}},$$

is given by the density function

$$h(t) = \frac{\Gamma[(\nu+1)/2]}{\Gamma(\nu/2)\sqrt{\pi\nu}} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}, \quad -\infty < t < \infty.$$

This is known as the t-distribution with $\nu$ degrees of freedom.

Shape of t-Distribution

The distribution of $T$ is similar to the distribution of $Z$ in that both are symmetric about a mean of zero. Both distributions are bell shaped, but the t-distribution is more variable, owing to the fact that the $T$-values depend on the fluctuations of two quantities, $\bar{X}$ and $S^2$, whereas the $Z$-values depend only on the changes of $\bar{X}$ from sample to sample.

The distribution of $T$ differs from that of $Z$ in that the variance of $T$ depends on the sample size $n$ and is always greater than 1. Only when the sample size $n \to \infty$ will the two distributions become the same.

Figure 8.14: Symmetry property of the t-distribution
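The shape comparison can be made concrete by evaluating the two densities. The sketch below assumes the standard t density $h(t) = \frac{\Gamma[(\nu+1)/2]}{\Gamma(\nu/2)\sqrt{\pi\nu}}(1 + t^2/\nu)^{-(\nu+1)/2}$ and uses only the Python standard library; the function name `t_pdf` and the chosen degrees of freedom are illustrative.

```python
import math
from statistics import NormalDist

def t_pdf(t, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    # Work on the log scale so large nu does not overflow the gamma function.
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(math.pi * nu))
    return math.exp(log_const - (nu + 1) / 2 * math.log(1 + t * t / nu))

z_pdf = NormalDist().pdf   # standard normal density for comparison

print(" nu    h(0)     h(3)     (normal: "
      f"{z_pdf(0):.4f}, {z_pdf(3):.4f})")
for nu in (2, 5, 30, 1000):
    print(f"{nu:>4}  {t_pdf(0, nu):.4f}   {t_pdf(3, nu):.4f}")
```

For small $\nu$ the t density is lower at the center and higher in the tails than the normal density, which is the extra variability described above; as $\nu$ grows the two columns approach the normal values.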

Example: t-Distribution

Solution: Using the t-distribution table (Table A.4), the claim is supported by the data obtained, since the computed $T$ value lies between $-t_{0.025}$ and $t_{0.025}$.

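Corollary 8.1: Let $X_1, X_2, \ldots, X_n$ be independent random variables that are all normal with mean $\mu$ and standard deviation $\sigma$. Let

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2.$$

Then the random variable

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t-distribution with $\nu = n - 1$ degrees of freedom.

F-Distribution

Theorem 8.6: Let $U$ and $V$ be two independent random variables having chi-squared distributions with $\nu_1$ and $\nu_2$ degrees of freedom, respectively. Then the distribution of the random variable

$$F = \frac{U/\nu_1}{V/\nu_2}$$

is given by the density

$$h(f) = \begin{cases} \dfrac{\Gamma[(\nu_1+\nu_2)/2]\,(\nu_1/\nu_2)^{\nu_1/2}}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)} \cdot \dfrac{f^{\nu_1/2 - 1}}{(1 + \nu_1 f/\nu_2)^{(\nu_1+\nu_2)/2}}, & f > 0, \\[6pt] 0, & f \le 0. \end{cases}$$

This is known as the F-distribution with $\nu_1$ and $\nu_2$ degrees of freedom (d.f.).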

Theorem 8.7: Writing $f_\alpha(\nu_1, \nu_2)$ for $f_\alpha$ with $\nu_1$ and $\nu_2$ degrees of freedom, we obtain

$$f_{1-\alpha}(\nu_1, \nu_2) = \frac{1}{f_\alpha(\nu_2, \nu_1)}.$$

Theorem 8.8: If $S_1^2$ and $S_2^2$ are the variances of independent random samples of size $n_1$ and $n_2$ taken from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2}$$

has an F-distribution with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ degrees of freedom.

Use of the F-Distribution

The F-distribution is used in two-sample situations to draw inferences about the population variances. For this reason it is also called the variance ratio distribution.
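Theorem 8.8 can be illustrated by simulating the variance ratio. In this sketch the population standard deviations, sample sizes, and seed are hypothetical; the ratio is formed as $(S_1^2/\sigma_1^2)/(S_2^2/\sigma_2^2)$, and the share of simulated values above the tabled critical value $f_{0.05}(7, 15) = 2.71$ should come out near 0.05.

```python
import random
import statistics

random.seed(2)

n1, n2 = 8, 16               # sample sizes, so nu1 = 7 and nu2 = 15
sigma1, sigma2 = 3.0, 5.0    # hypothetical population standard deviations
reps = 5000
f_crit = 2.71                # tabled f_0.05 for 7 and 15 degrees of freedom

f_values = []
for _ in range(reps):
    s1_sq = statistics.variance([random.gauss(0.0, sigma1) for _ in range(n1)])
    s2_sq = statistics.variance([random.gauss(0.0, sigma2) for _ in range(n2)])
    # Each sample variance is scaled by its own population variance.
    f_values.append((s1_sq / sigma1 ** 2) / (s2_sq / sigma2 ** 2))

upper_tail_share = sum(f > f_crit for f in f_values) / reps

print(f"share of F values above {f_crit}: {upper_tail_share:.3f} (theory 0.05)")
```

Because each sample variance is divided by its own population variance, the result does not depend on the particular values chosen for `sigma1` and `sigma2`.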

Solution: 2.71; 2.92; 0.345
