Chapter 6 - Random Sampling and Data Description More joy of dealing with large quantities of data Chapter 6B You can never have too much data.



Today in Prob & Stat

6-2 Stem-and-Leaf Diagrams Steps for Constructing a Stem-and-Leaf Diagram

6-2 Stem-and-Leaf Diagrams

Example 6-4

Figure 6-4 Stem-and-leaf diagram for the compressive strength data in Table 6-2.

Figure: Stem-and-leaf displays of the batch yield observations for Example 6-5. Stem: tens digits. Leaf: ones digits. Panels: too few stems, too many stems, just right.

Figure 6-6 Stem-and-leaf diagram from Minitab. (Callout: number of observations in the middle stem.)
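The construction behind these displays can be sketched in a few lines of Python. This is a minimal illustration, assuming non-negative numeric data and the split used in Example 6-5 (stem: tens digit, leaf: ones digit). The function names `stem_and_leaf` and `render`, the `leaf_unit` parameter, and the data values are made up for the example and are not taken from the textbook tables.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Group non-negative values into {stem: sorted list of leaves}.

    After dividing by leaf_unit, the last digit is the leaf and the
    remaining leading digits form the stem.
    """
    groups = defaultdict(list)
    for v in sorted(values):
        scaled = int(v // leaf_unit)
        stem, leaf = divmod(scaled, 10)
        groups[stem].append(leaf)
    # Keep empty stems so that gaps in the data remain visible.
    lo, hi = min(groups), max(groups)
    return {s: groups.get(s, []) for s in range(lo, hi + 1)}


def render(diagram):
    """Format the diagram as text, one stem per row."""
    return "\n".join(
        f"{stem:>3} | {''.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in diagram.items()
    )


# Hypothetical batch yields, for illustration only.
yields = [61, 64, 67, 70, 72, 73, 75, 78, 81, 83, 85, 94]
print(render(stem_and_leaf(yields)))
```

Choosing a larger `leaf_unit` merges stems (risking "too few"), while splitting each stem into several rows would move toward "too many"; the sketch only covers the one-row-per-stem case.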

6-4 Box Plots The box plot is a graphical display that simultaneously describes several important features of a data set, such as center, spread, departure from symmetry, and identification of observations that lie unusually far from the bulk of the data. (Figure labels: whisker, outlier, extreme outlier.)

Figure 6-13 Description of a box plot.
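The quantities that a box plot displays can be computed directly. The sketch below is an illustration under stated assumptions, not the slide's own procedure: it assumes the common convention that whiskers reach the most extreme observations within 1.5 interquartile ranges of the box, that points between 1.5 and 3 interquartile ranges away are outliers, and that points beyond 3 are extreme outliers. The quartile rule is whatever `statistics.quantiles` uses by default, and the data are invented.

```python
import statistics


def box_plot_summary(data):
    """Return quartiles, whisker ends, outliers and extreme outliers."""
    xs = sorted(data)
    q1, median, q3 = statistics.quantiles(xs, n=4)  # default "exclusive" method
    iqr = q3 - q1
    # Assumed fences: 1.5 * IQR (inner) and 3 * IQR (outer) from the box.
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_lo, outer_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    inside = [x for x in xs if inner_lo <= x <= inner_hi]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whiskers": (min(inside), max(inside)),
        "outliers": [
            x for x in xs
            if outer_lo <= x < inner_lo or inner_hi < x <= outer_hi
        ],
        "extreme_outliers": [x for x in xs if x < outer_lo or x > outer_hi],
    }


# Hypothetical measurements, for illustration only.
sample = [10, 12, 13, 14, 15, 16, 17, 18, 19, 30, 50]
summary = box_plot_summary(sample)
print(summary)
```

Software packages differ in how they interpolate quartiles, so the box edges from this sketch may not match a given package exactly.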

Figure 6-14 Box plot for compressive strength data in Table 6-2.

Figure 6-15 Comparative box plots of a quality index at three plants.

6-5 Time Sequence Plots A time series or time sequence is a data set in which the observations are recorded in the order in which they occur. A time series plot is a graph in which the vertical axis denotes the observed value of the variable (say x) and the horizontal axis denotes the time (which could be minutes, days, years, etc.). When measurements are plotted as a time series, we often see trends, cycles, or other broad features of the data.

Figure 6-16 Company sales by year (a) and by quarter (b).

Figure 6-17 A digidot plot of the compressive strength data in Table 6-2. Gosh! A stem-and-leaf diagram combined with a time series plot.

Figure 6-18 A digidot plot of chemical process concentration readings, observed hourly.

6-6 Probability Plots Probability plotting is a graphical method for determining whether sample data conform to a hypothesized distribution based on a subjective visual examination of the data. Probability plotting typically uses special graph paper, known as probability paper, that has been designed for the hypothesized distribution. Probability paper is widely available for the normal, lognormal, Weibull, and various chi-square and gamma distributions.

Probability (Q-Q) Plots Forget 'normal probability paper'. Plot the z score versus the ranked observations, $x_{(j)}$. This is a subjective, visual technique usually applied to test normality; it can also be adapted to other distributions. Method (for normal distribution): (1) Rank the observations $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ from smallest to largest. (2) Compute the value $(j - 1/2)/n$ for each $x_{(j)}$. (3) Plot $z_j = F^{-1}\left((j - 1/2)/n\right)$ versus $x_{(j)}$, where $F$ is the cumulative distribution function of the hypothesized distribution (the standard normal $\Phi$ here). Parentheses in a subscript usually indicate ordering of the data.

Computing $z_j$, where $z_j = \Phi^{-1}\left((j - \tfrac{1}{2})/n\right)$. The $x_{(j)}$ values are ordered from least to greatest.
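The same computation can be sketched outside a spreadsheet. The Python below is a minimal illustration: `NormalDist().inv_cdf` from the standard library plays the role of the inverse standard normal function, the helper name `normal_scores` is made up here, and the sample values are invented rather than taken from Table 6-6.

```python
from statistics import NormalDist


def normal_scores(sample):
    """Pair each ordered observation x_(j) with z_j = Phi^{-1}((j - 1/2) / n)."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (x, std_normal.inv_cdf((j - 0.5) / n))
        for j, x in enumerate(xs, start=1)
    ]


# Hypothetical observations, for illustration only.
observations = [220, 176, 183, 191, 205]
for x, z in normal_scores(observations):
    print(f"{x:>6}  {z:+.3f}")
```

Plotting the `z` values against the ordered `x` values gives the normal probability plot; points that fall roughly along a straight line are consistent with normality.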

Example in Excel: Table 6-6, p. 214. $z_j$ is computed with the function NORMSINV.

Example in Excel: Table 6-6, cont'd. $z_j$ is computed with the function NORMSINV.

Example 6-7

Example 6-7 (continued)

Figure 6-19 Normal probability plot for battery life.

Figure 6-20 Normal probability plot obtained from standardized normal scores.

Figure 6-21 Normal probability plots indicating a nonnormal distribution. (a) Light-tailed distribution. (b) Heavy-tailed distribution. (c) A distribution with positive (or right) skew.

The Beginning of a Comprehensive Example: Descriptive Statistics in Action. See real numbers, real data; watch as they are manipulated in perverse ways; be thrilled as they are sorted; and be amazed as they are compressed into a single number.

The Raw Data. As part of a life span study of a particular type of lithium polymer rechargeable battery, 120 batteries were operated and their life span in operating hours determined. The data were generated from a Weibull distribution with shape parameter β = 2.8 and scale parameter δ = 2000.

Descriptive Statistics - Minitab. Output columns for the variable Battery Life: N, Mean, Median, TrMean (trimmed mean), StDev, SE Mean, Minimum, Maximum, Q1, Q3.
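For readers without Minitab, the same columns can be approximated with the Python standard library. This is a sketch under several assumptions: the data are freshly simulated (so the numbers will not match the slide's), the Weibull parameters are read as shape 2.8 and scale 2000, the trimmed mean drops 5% of the observations from each end, and the quartiles follow the default rule of `statistics.quantiles`. The function name `describe` and the seed are arbitrary.

```python
import math
import random
import statistics


def describe(data, trim_fraction=0.05):
    """Summary statistics in the spirit of the Minitab columns above."""
    xs = sorted(data)
    n = len(xs)
    # Assumption: trim this fraction of observations from each end,
    # rounded to a whole number of observations.
    k = round(n * trim_fraction)
    trimmed = xs[k:n - k]
    # Assumption: quartiles from the default "exclusive" (n + 1)p rule.
    q1, _, q3 = statistics.quantiles(xs, n=4)
    stdev = statistics.stdev(xs)  # uses the n - 1 denominator
    return {
        "N": n,
        "Mean": statistics.fmean(xs),
        "Median": statistics.median(xs),
        "TrMean": statistics.fmean(trimmed),
        "StDev": stdev,
        "SE Mean": stdev / math.sqrt(n),
        "Minimum": xs[0],
        "Maximum": xs[-1],
        "Q1": q1,
        "Q3": q3,
    }


# Simulated stand-in for the 120 battery lifetimes (hours).
# random.weibullvariate takes (scale, shape) in that order.
rng = random.Random(6)
battery_life = [rng.weibullvariate(2000, 2.8) for _ in range(120)]

summary = describe(battery_life)
for name, value in summary.items():
    print(f"{name:>8}: {value:10.2f}")
```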

More Minitab

Stem and Leaf Plot Leaf Unit = (40)

More Minitab

Time Series Plot: based on the order in which the data were generated.

Time Series Plot: sorted by failure time.

Computer Support This is easy if you use the computer. hang on, we are going to Excel…

A Recap … Population: the totality of observations with which we are concerned. Issue: conceptual vs. actual. Sample: a subset of observations selected from a population. Statistic: any function of the observations in a sample. Sample range: if the n observations in a sample are denoted by $x_1, x_2, \ldots, x_n$, then the sample range is $r = \max(x_i) - \min(x_i)$. Sample mean and variance: note that these are functions of the observations in a sample and are, therefore, statistics.

More Recapping … Note the difference in denominators: N for the population variance, n − 1 for the sample variance. The sample variance uses an estimate of the mean ($\bar{x}$) in its calculation. If it were divided by n, the sample variance would be a biased estimate, biased low. Note the terminology: 'population parameter' vs. 'sample statistic'.
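A small worked example makes the denominator point concrete. The Python sketch below computes the sample range and then the variance twice, once with n − 1 and once with n. The data and the helper names `sample_range` and `variance` are invented for illustration.

```python
import statistics


def sample_range(xs):
    """r = max(x_i) - min(x_i)."""
    return max(xs) - min(xs)


def variance(xs, denominator):
    """Sum of squared deviations from the sample mean, over a chosen denominator."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / denominator


# Hypothetical sample, for illustration only.
xs = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(xs)

s2 = variance(xs, n - 1)      # sample variance (n - 1 denominator)
biased = variance(xs, n)      # same sum of squares divided by n

print("range:", sample_range(xs))
print("divide by n - 1:", s2)
print("divide by n:", biased)
# The standard library makes the same distinction:
print(statistics.variance(xs), statistics.pvariance(xs))
```

Because the deviations are measured from $\bar{x}$ rather than from the true mean, the sum of squares is as small as it can be for this sample, which is why dividing by n tends to understate the variance.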

Sampling Process. X is a random variable that represents one selection from a population. Each observation in the sample is obtained under identical conditions. The population does not change during sampling. The probability distribution of values does not change during sampling. $f(x_1, x_2, \ldots, x_n) = f(x_1) f(x_2) \cdots f(x_n)$ if the observations in the sample are independent. Notation: $X_1, X_2, \ldots, X_n$ are the random variables; $x_1, x_2, \ldots, x_n$ are the values of the random variables.
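The factorization of the joint density can be illustrated numerically. The sketch below assumes, purely for the example, a normal population with mean 100 and standard deviation 15; the function name `joint_density` and the observed values are hypothetical.

```python
import math
from statistics import NormalDist


def joint_density(values, dist):
    """Joint density of independent observations drawn from the same
    distribution: the product f(x_1) f(x_2) ... f(x_n)."""
    return math.prod(dist.pdf(x) for x in values)


# Assumed population model, for illustration only.
population = NormalDist(mu=100, sigma=15)
observed = [96.0, 104.5, 110.0]

print(joint_density(observed, population))
```

The product form holds only under independence and a common distribution, which is what the identical-conditions and unchanging-population requirements above are meant to secure.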

A Final Recap… A probability distribution is often a model for a population. This is often the case when the population is conceptual or infinite. The histogram should resemble the distribution of population values. The bigger the sample, the stronger the resemblance.
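The last claim can be checked by simulation. The sketch below compares histogram bin proportions from simulated samples with the bin probabilities implied by a model distribution. It assumes a Weibull model with shape 2.8 and scale 2000, echoing the battery example; the bin width, the seed, and all function names are arbitrary choices for illustration.

```python
import math
import random

SHAPE, SCALE = 2.8, 2000.0   # assumed Weibull model parameters
BIN_WIDTH = 500
EDGES = [0, 500, 1000, 1500, 2000, 2500, 3000, 3500]  # last bin is open-ended


def weibull_cdf(x):
    """Cumulative distribution function of the assumed Weibull model."""
    return 1.0 - math.exp(-((x / SCALE) ** SHAPE)) if x > 0 else 0.0


def model_proportions():
    """Probability that the model assigns to each histogram bin."""
    cdf = [weibull_cdf(edge) for edge in EDGES] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


def sample_proportions(n, rng):
    """Relative frequency of each bin in a simulated sample of size n."""
    counts = [0] * len(EDGES)
    for _ in range(n):
        x = rng.weibullvariate(SCALE, SHAPE)  # arguments are (scale, shape)
        counts[min(int(x // BIN_WIDTH), len(EDGES) - 1)] += 1
    return [count / n for count in counts]


def max_gap(n, seed=1):
    """Largest absolute difference between histogram and model over all bins."""
    rng = random.Random(seed)
    return max(
        abs(s - m)
        for s, m in zip(sample_proportions(n, rng), model_proportions())
    )


for n in (120, 1200, 50000):
    print(n, round(max_gap(n), 4))
```

For any single seed the gap need not shrink at every step, but on average it falls roughly in proportion to one over the square root of the sample size.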

Our Work Here Today is Done. Next Week: The Glorious Midterm. (Image caption: Prob/Stat students discussing stem and leaf plots.)