INEN 270 ENGINEERING STATISTICS Fall 2011 Introduction.



Agenda Purpose Statistical Concepts Descriptive Statistics and Some of Their Graphs Inferential Statistics Lecture Summary

Lecture 1: Introduction to Statistics Purpose Statistical Concepts Descriptive Statistics and Some of Their Graphs Inferential Statistics Lecture Summary

Why Study Statistics? You need to know how to evaluate published numerical facts. Your profession or employment may require you to interpret the results of sampling or to employ statistical methods of analysis to make inferences in your work.

What Is the Purpose of Statistics? One purpose of statistics is to make sense of your data. Statistics provide information about your data so you can answer questions and make informed business decisions.

Lecture 1: Introduction to Statistics Purpose Statistical Concepts Descriptive Statistics and Some of Their Graphs Inferential Statistics Lecture Summary

Objectives Explain the use of statistics. Define population and sample. Describe the processes involved in statistical analysis. Compare descriptive and inferential statistics. Discuss the sampling plan.

Defining the Problem Before you begin any analysis, you should complete certain tasks. 1. Outline the purpose of the study. 2. Document the study questions. 3. Define the population of interest. 4. Determine the need for sampling. 5. Define the data collection protocol.

Example: Speeding Data [Figure: vehicles observed at 65 mph, 50 mph, and 48 mph, shown with a posted speed limit sign]

Population and Sample

Basic Definition STATISTICS: Area of science concerned with extraction of information from numerical data and its use in making inference about a population from data that are obtained from a sample.

[Diagram: Population (set of all measurements) and Sample (set of measurements selected from the population). Information is extracted from the sample, and inference is made about the population.]

Basic Definitions. Population and Parameter. Population: the set of all measurements of interest to the investigator. Parameter: an unknown population characteristic of interest to the investigator. Sample and Statistic. Sample: a subset of measurements selected from the population of interest. Statistic: a sample characteristic of interest to the investigator. Descriptive Statistics. Center of location: mean, median, mode. Variability: variance, standard deviation. Distribution.

Examples of Population and Sample. Selecting the proper diet for shrimp or other sea animals is an important aspect of sea farming. A researcher wishes to estimate the average weight of shrimp maintained on a specific diet for a period of 6 months. One hundred shrimp are randomly selected from an artificial pond and each is weighed. 1. Identify the population. 2. Identify the sample. 3. Identify the parameter. 4. Identify the statistic.

Simple Random Sampling

Convenience Sampling

Process of Statistical Data Analysis [Diagram: a random sample is drawn from the population; sample statistics describe the sample; inferences are then made about the population.]

Sampling Plan

Lecture 1: Introduction to Statistics Purpose Statistical Concepts Descriptive Statistics and Some of Their Graphs Inferential Statistics Lecture Summary

Objectives Compute and interpret statistics describing the location of a set of values, such as the mean, median, and mode. Compute and interpret statistics describing the variability in a set of values, such as the range and standard deviation. Compute and interpret the measures of shape: skewness and kurtosis. Produce graphical displays of data.

Some Frequently Used Statistics and Parameters

Measure of Location. Descriptive statistics that locate the center of your data are called measures of central tendency. Sample Mean: the sample mean of a set of n measurements $(x_1, x_2, \ldots, x_n)$ is equal to the sum of the measurements divided by n, that is, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.

Measure of Location. Sample Median: the middle value (also known as the 50th percentile). The median of a set of n measurements $(x_1, x_2, \ldots, x_n)$ is the value that falls in the middle position when the measurements are ordered from the smallest to the largest, that is, when $x_1, \ldots, x_n$ are arranged in increasing order of magnitude.

RULE FOR CALCULATING THE MEDIAN 1. Order the measurements from the smallest to the largest. 2. A) If the sample size is odd, the median is the middle measurement. B) If the sample size is even, the median is the average of the two middle measurements.
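The rule above can be sketched in a few lines of Python. The function name `sample_median` and the two small data sets are made up for illustration; they are not from the lecture.

```python
def sample_median(values):
    """Median by the ordering rule: the middle value, or the average of the two middle values."""
    ordered = sorted(values)              # step 1: order from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:                        # step 2A: odd sample size
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # step 2B: even sample size

odd_median = sample_median([3, 1, 2])         # ordered: 1, 2, 3
even_median = sample_median([4, 1, 3, 2])     # ordered: 1, 2, 3, 4
print(odd_median, even_median)                # 2 2.5
```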

[Figure: the median marked at the middle of the ordered data, with n=3 observations on each side]

Percentiles. 75th percentile = 91 (third quartile). 50th percentile = 80. 25th percentile = 59 (first quartile). Quartiles break your data up into quarters.

Example. A random sample of six values was taken from a population. These values were: $x_1=7$, $x_2=1$, $x_3=10$, $x_4=8$, $x_5=4$, and $x_6=12$. What are the sample mean and sample median for these data?

Sample Mean

CALCULATIONS FOR THE SAMPLE MEDIAN 1. Order Sample 2. Median

1. Order Sample: $x_2=1$, $x_5=4$, $x_1=7$, $x_4=8$, $x_3=10$, $x_6=12$. 2. MEDIAN $= (7 + 8)/2 = 7.5$
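As a cross-check of this worked example, the six values can be run through Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

sample = [7, 1, 10, 8, 4, 12]                 # x1 .. x6 from the example

sample_mean = sum(sample) / len(sample)       # 42 / 6
sample_median = statistics.median(sample)     # average of the two middle ordered values, 7 and 8

print(sample_mean, sample_median)             # 7.0 7.5
```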

Example Given a set of data: 1.7, 2.2, 3.11, 3.9, and 14.7 Sample mean= Sample median =

Example Consider the following sample: Which measure of central tendency best describes the central location of the data: THE SAMPLE MEAN OR SAMPLE MEDIAN? Why?

the median

Why? Because there is an outlier (extreme value), 4, in the data set, and the mean is heavily influenced by this single outlier. Solution: the trimmed mean, which drops the highest and lowest extreme values and averages the rest. For example, a 5% trimmed mean drops the highest 5% and the lowest 5% of the values and averages the rest.

Sample Mode. What is the mode for the previous example? 44 (occurs twice) and 49 (occurs twice).

Measures of Central Tendency (mode, mean, and median). How are they related to a given data set? It depends on the skewness of the population. (a) A bell-shaped distribution

(b) A distribution skewed to the left. A: mean, B: median, C: mode. (c) A distribution skewed to the right. A: mode, B: median, C: mean.

Suppose the IRS wants to measure the central tendency of the income of the American population. Which measure would you recommend, and why? Hint: Bill Gates. The income distribution is skewed to the right.

Other Measures of Location. Trimmed means: computed by “trimming away” a certain percent of both the largest and the smallest values. A trimmed mean is less sensitive to outliers than the mean but more so than the median. What is the relationship between the trimmed mean and the median?
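A minimal sketch of a trimmed mean, applied to the data set from the earlier example (1.7, 2.2, 3.11, 3.9, 14.7). As a simplifying assumption, the hypothetical helper `trimmed_mean` trims a fixed count `k` from each end rather than a percentage.

```python
import statistics

def trimmed_mean(values, k):
    """Mean after dropping the k smallest and the k largest observations."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one observation")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [1.7, 2.2, 3.11, 3.9, 14.7]

plain_mean = sum(data) / len(data)     # pulled upward by the extreme value 14.7
median = statistics.median(data)       # middle ordered value
trimmed = trimmed_mean(data, 1)        # drops 1.7 and 14.7, averages the rest

print(round(plain_mean, 3), median, round(trimmed, 3))
```

Trimming more observations from each end moves the result toward the median: with `k = 2` only the middle value of these five remains.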

The Spread of a Distribution: Variation
Measure | Definition
range | the difference between the maximum and minimum data values
interquartile range (IR or IQR) | the difference between the 75th and 25th percentiles
variance | a measure of dispersion of the data around the mean
standard deviation | a measure of dispersion expressed in the same units of measurement as your data (the square root of the variance)
coefficient of variation | standard deviation as a percentage of the mean
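To make these definitions concrete, here is a short sketch that computes several of them for the six-value sample used earlier (7, 1, 10, 8, 4, 12). It assumes the sample variance uses the n-1 divisor, and it leaves out the interquartile range because the result depends on which quartile convention is used.

```python
import math

sample = [7, 1, 10, 8, 4, 12]
n = len(sample)
mean = sum(sample) / n

sample_range = max(sample) - min(sample)                     # maximum minus minimum
variance = sum((x - mean) ** 2 for x in sample) / (n - 1)    # assumed n-1 divisor
std_dev = math.sqrt(variance)                                # same units as the data
cv_percent = 100 * std_dev / mean                            # standard deviation as % of the mean

print(sample_range, variance, std_dev, round(cv_percent, 2))  # 11 16.0 4.0 57.14
```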

Typical Variation: Standard Deviation. The variance is a measure of variation. The square root of the variance, or standard deviation, is a measure of variation in terms of the original linear scale. $\sigma$ is the population standard deviation; $s$ is an estimate of the population standard deviation.

Typical Variation: Average Squared Deviation
Consider the data {3, 4, 8}.
Obs | Data | Deviation | (Deviation)^2
1 | 3 | -2 | 4
2 | 4 | -1 | 1
3 | 8 | 3 | 9
Sum | 15 | 0 | 14
Average | 5 | 0 | 14/3

Measures of Variability. Sample range: $x_{\max} - x_{\min}$. Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$. Sample standard deviation: $s = \sqrt{s^2}$.


Sample Variance

Unbiased Estimate of Population Variance Calculate the unbiased estimate of population variance by averaging with n-1 instead of n. This estimator is unbiased because, on average, it equals the population variance.
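The claim that the n-1 divisor is right "on average" can be checked exactly on a tiny case. The sketch below treats {3, 4, 8} as the whole population, which is an assumption made only for illustration. It lists every ordered sample of size 2 drawn with replacement and averages both versions of the estimator using exact fractions.

```python
from fractions import Fraction
from itertools import product

population = [3, 4, 8]                 # treated as the entire population for this check

def sum_sq_dev(xs):
    """Sum of squared deviations from the mean, kept as an exact fraction."""
    m = Fraction(sum(xs), len(xs))
    return sum((x - m) ** 2 for x in xs)

pop_variance = sum_sq_dev(population) / len(population)      # 14/3

n = 2
samples = list(product(population, repeat=n))                # all 9 ordered samples

avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(pop_variance, avg_div_n_minus_1, avg_div_n)            # 14/3 14/3 7/3
```

Dividing by n-1 averages to the population variance exactly, while dividing by n comes out too small by the factor (n-1)/n.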

Discrete and Continuous Data. Discrete data are counted: number of defective items, number of accidents. Continuous data are measured: all possible heights, weights, distances, etc.

Distributions. When you examine the distribution of values for speed, you can determine the range of possible data values, the frequency of data values, and whether the data values accumulate in the middle of the distribution or at one end.

Graphical Methods and Data Description Stem and Leaf Plot Relative Frequency distribution Relative Frequency Histogram

Construction of a Stem-Leaf Display 1. List the stem values, in order, in a vertical column. 2. Draw a vertical line to the right of the stem values. 3. For each observation, record the leaf portion of the observation in the row corresponding to the appropriate stem. 4. Reorder the leaves from the lowest to highest within each stem row. 5. If the number of leaves appearing in each stem is too large, divide the stems into two groups, the first corresponding to leaves 0 through 4, and the second corresponding to leaves 5 through 9. (This subdivision can be increased to five groups if necessary.)
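The construction steps can be sketched as follows. The data are made-up two-digit values, not the battery-life data, and the helper name `stem_and_leaf` is hypothetical. The tens digit serves as the stem and the ones digit as the leaf, and the sketch does not implement the double-stem subdivision.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the display as text rows: stem | ordered leaves (frequency)."""
    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)         # tens digit is the stem, ones digit is the leaf
        rows[stem].append(leaf)
    lines = []
    for stem in range(min(rows), max(rows) + 1):      # stems in order, empty stems included
        leaves = sorted(rows.get(stem, []))           # leaves ordered within each stem row
        leaf_text = "".join(str(leaf) for leaf in leaves)
        lines.append(f"{stem} | {leaf_text} ({len(leaves)})")
    return lines

data = [23, 31, 35, 18, 27, 31, 42, 39, 24, 36]       # hypothetical observations
plot = stem_and_leaf(data)
print("\n".join(plot))
```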

Car Battery Life

[Table: Stem and Leaf Plot of Battery Life, with columns Stem, Leaf, and Frequency]

[Table: Double-Stem and Leaf Plot of Battery Life, with columns Stem, Leaf, and Frequency; each stem is split into a * row and a · row]

Relative Frequency Distribution. Group the data into different classes or intervals by counting the leaves belonging to each stem; each stem defines a class interval. Dividing each class frequency by the total number of observations gives the proportion of the set of observations in each of the classes.
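A minimal sketch of building a relative frequency distribution. The observations and the class width of 10 are assumptions chosen for illustration, and each class is taken as the half-open interval from `lower` up to (but not including) `upper`.

```python
from collections import Counter

def relative_frequencies(values, width):
    """Rows of (lower, upper, midpoint, frequency, relative frequency) for each non-empty class."""
    counts = Counter((v // width) * width for v in values)   # lower limit of each value's class
    total = len(values)
    table = []
    for lower in sorted(counts):
        frequency = counts[lower]
        table.append((lower, lower + width, lower + width / 2, frequency, frequency / total))
    return table

data = [23, 31, 35, 18, 27, 31, 42, 39, 24, 36]              # hypothetical observations
table = relative_frequencies(data, 10)
for row in table:
    print(row)
```

The relative frequencies always sum to 1, which is a quick sanity check on the table.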

Relative Frequency Distribution of Battery Life
Class Interval | Class Midpoint | Frequency, f | Relative Frequency
??? | ??? | ???

Class Interval | Class Midpoint | Frequency, f | Relative Frequency

Relative Frequency Histogram of Battery Life

Picturing Distributions: Histogram. Each bar in the histogram represents a group of values (a bin). The height of the bar is the percent of values in the bin. [Figure: histogram with bins on the horizontal axis and percent on the vertical axis]

Measures of Shape: Skewness

Measures of Shape: Kurtosis

Box and Whisker Plot or Boxplot. Pth Percentile: the Pth percentile is the value $X_p$ such that p% of the measurements fall below that value and (100-p)% of the measurements fall above it. Quartiles: quartiles divide the measurements into four parts such that 25% of the measurements are contained in each part. The first quartile (lower quartile) is denoted by $Q_1$, the second by $Q_2$, and the third (upper quartile) by $Q_3$. [Figure: a distribution split at $X_p$ into P% below and (100-P)% above, with $Q_1$, $Q_2$, and $Q_3$ marked]

Interquartile Range (IQR): $\mathrm{IQR} = Q_3 - Q_1$. Outlier: an observation that is considered to be unusually far removed from the bulk of the data. We label observations as outliers when their distance from the box exceeds 1.5 times the interquartile range (in either direction). The box encloses the interquartile range of the data. The whiskers show the extreme observations in the sample.

Box and Whiskers Plot or Boxplot. Calculating fence values. Lower inner fence: $Q_1 - 1.5(\mathrm{IQR})$. Upper inner fence: $Q_3 + 1.5(\mathrm{IQR})$. Lower outer fence: $Q_1 - 3(\mathrm{IQR})$. Upper outer fence: $Q_3 + 3(\mathrm{IQR})$. [Figure: boxplot labeled, from top to bottom, Maximum, Upper Quartile, Median, Lower Quartile, Minimum]

A Quick Method 1. Order the data from smallest to largest value. 2. Divide the ordered data set into two data sets using the median as the dividing value. 3. Let the lower quartile be the median of the set of values consisting of the smaller values. 4. Let the upper quartile be the median of the set of values consisting of the larger values.
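A sketch of the quick method, applied to the six-value sample from the earlier example. One detail is an assumption: when the sample size is odd, the middle observation is left out of both halves, since the slide does not say how to handle that case. The function name `quick_quartiles` is illustrative.

```python
import statistics

def quick_quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)                      # step 1: order the data
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]                        # step 2: values below the median
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]   # odd n: skip the middle value
    q1 = statistics.median(lower)                 # step 3: median of the smaller values
    q3 = statistics.median(upper)                 # step 4: median of the larger values
    return q1, statistics.median(ordered), q3

q1, q2, q3 = quick_quartiles([7, 1, 10, 8, 4, 12])   # ordered: 1, 4, 7, 8, 10, 12
iqr = q3 - q1
print(q1, q2, q3, iqr)                                # 4 7.5 10 6
```

Other quartile conventions exist and statistical packages often interpolate, so results from software may differ slightly from this quick method.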

Example Nicotine content was measured in a random sample of 40 cigarettes. The data is displayed below.

1. Order the data from the smallest to the largest. 2. Divide the ordered data set into two data sets using the median as the dividing value.

Q2 = ? Q1 = ? Q3 = ? IQR = Q3 - Q1 = ? Answers: Q1 = ( )/2 = 1.635, Q2 = ( )/2 = 1.77, Q3 = ( )/2 = 2.000, IQR = Q3 - Q1 = 0.365.
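With these quartiles, the fence formulas give concrete cut-offs for the nicotine sample. The sketch below uses only Q1 and Q3 as reported on the slide. The two values passed to the hypothetical helper `is_outlier` are illustrative and are not claimed to be observations from the data set.

```python
q1, q3 = 1.635, 2.000                  # quartiles reported for the nicotine sample
iqr = q3 - q1                          # 0.365

lower_inner = q1 - 1.5 * iqr
upper_inner = q3 + 1.5 * iqr
lower_outer = q1 - 3.0 * iqr
upper_outer = q3 + 3.0 * iqr

def is_outlier(x):
    """True when x lies more than 1.5 IQR beyond the box, i.e. outside the inner fences."""
    return x < lower_inner or x > upper_inner

print(round(lower_inner, 4), round(upper_inner, 4))   # 1.0875 2.5475
print(round(lower_outer, 4), round(upper_outer, 4))   # 0.54 3.095
print(is_outlier(0.72), is_outlier(1.77))             # True False
```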

[Figure: box-and-whisker plot of the nicotine data, with an outlier marked]

Information Drawn from a Boxplot 1. The center of the distribution is indicated by the median line in the box. 2. A measure of the variability is given by the interquartile range, the length of the box. 3. The relative position of the median line indicates the symmetry of the middle 50% of the data. 4. The skewness can be judged from the relative lengths of the whiskers. 5. The presence of outliers can be examined.

Quantile Plot. A quantile plot simply plots the data values on the vertical axis against an empirical assessment of the fraction of observations exceeded by the data value, where i is the order of the observations when they are ranked from low to high.

Quantile Plot for paint data (table 8.2 page 238)

Normal Quantile Plots. The normal quantile-quantile plot is a plot of $y_{(i)}$ (ordered observations) against the corresponding quantiles of the standard normal distribution.
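A sketch of how the plotting coordinates might be computed. The transcript does not preserve the formula for the fraction, so the plotting position `(i - 3/8) / (n + 1/4)` used here is an assumption (one common convention), and the data values are made up. `NormalDist.inv_cdf` from Python's standard library supplies the standard normal quantiles.

```python
from statistics import NormalDist

def normal_quantile_points(values):
    """Pairs (standard normal quantile, ordered observation) for a normal quantile plot."""
    ordered = sorted(values)                     # y(1) <= y(2) <= ... <= y(n)
    n = len(ordered)
    std_normal = NormalDist()                    # mean 0, standard deviation 1
    points = []
    for i, y in enumerate(ordered, start=1):
        f = (i - 0.375) / (n + 0.25)             # assumed plotting position, not from the slides
        points.append((std_normal.inv_cdf(f), y))
    return points

points = normal_quantile_points([5.2, 4.8, 6.1, 5.0, 5.5])   # hypothetical data
for q, y in points:
    print(round(q, 3), y)
```

If the points fall close to a straight line, the data are roughly consistent with a normal distribution.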

Lecture 1: Introduction to Statistics Purpose Statistical Concepts Descriptive Statistics and Some of Their Graphs Inferential Statistics Lecture Summary

Objectives Understand the importance of making inference. Understand the steps in conducting a statistical study.

Statistical Inference: making an "INFORMED GUESS" about a parameter based on a statistic. (This is the main objective of statistics.)

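[Diagram: STATISTICAL INFERENCE. Data are gathered from the POPULATION (described by PARAMETERS) to form the SAMPLE (described by SAMPLE STATISTICS); inferences are made from the sample statistics back to the population parameters.]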

Variable. A VARIABLE is a characteristic of an individual or object that may vary for different observations. A QUANTITATIVE VARIABLE takes numerical values measured on some sort of scale. A QUALITATIVE VARIABLE places each observation into a category.

RAISIN BRAN EXAMPLE A cereal company claims that the average amount of raisins in its boxes of raisin bran is two scoops. A random sample of five boxes was taken off the production line, and an analysis revealed an average of 1.9 scoops per box.

Components of the Problem 1. Identify the population. 2. Identify the sample. 3. Identify the symbol for the parameter. 4. Identify the symbol for the statistic.

Five Steps in a Statistical Study: 1. Stating the problem. 2. Gathering the data. 3. Summarizing the data. 4. Analyzing the data. 5. Reporting the results.

Stating the Problem. Specifically identifying the population to be sampled. Identifying the parameter(s) being studied.

Gathering the Data. SURVEYS: random sampling, stratified sampling, cluster sampling, systematic sampling. EXPERIMENTS: completely randomized design, randomized block design, factorial design.

Lecture 1: Introduction to Statistics Purpose Statistical Concepts Descriptive Statistics and Some of Their Graphs Inferential Statistics Lecture Summary

Summary. Basics of statistics. Descriptive statistics and graphs. Inferential statistics. Textbook: Chapter 1 (pages 1-28), Chapter 8 (page ).