
Presentation transcript:

2-1 Data Summary and Display

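Population Mean

For a finite population with N measurements $x_1, x_2, \dots, x_N$, the mean is

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is a reasonable estimate of the population mean $\mu$.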

2-1 Data Summary and Display Sample Variance and Sample Standard Deviation

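2-1 Data Summary and Display

For a sample of n observations $x_1, x_2, \dots, x_n$ with sample mean $\bar{x}$, the sample variance is

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The sample standard deviation is

$$s = \sqrt{s^2}$$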

2-1 Data Summary and Display

Computational formula for $s^2$:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$
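The computational formula gives the same value as the definition of the sample variance but needs only the sum and the sum of squares of the observations. The following is a minimal Python sketch comparing the two; the data values and function names are hypothetical and chosen only for illustration.

```python
import math


def sample_variance_definition(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Computational form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total ** 2 / n) / (n - 1)


data = [2.0, 4.0, 4.0, 6.0, 9.0]  # hypothetical sample, not from the slides
s2_definition = sample_variance_definition(data)
s2_shortcut = sample_variance_shortcut(data)
s = math.sqrt(s2_definition)  # sample standard deviation

print(s2_definition, s2_shortcut, s)
```

With real measurements the shortcut form can lose precision when the mean is large relative to the spread, so the definitional form is often the safer choice in floating-point code.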

2-1 Data Summary and Display

Population Variance

When the population is finite and consists of N values, we may define the population variance as

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

where $\mu$ is the population mean. The sample variance is a reasonable estimate of the population variance.
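The only difference between the two variances is the divisor: N for a complete finite population, n − 1 for a sample. As a small sketch, Python's standard `statistics` module exposes both; the five values below are hypothetical.

```python
import statistics

values = [2.0, 4.0, 4.0, 6.0, 9.0]  # hypothetical measurements

# Treating the values as the entire population: divide by N.
sigma2 = statistics.pvariance(values)

# Treating the same values as a sample: divide by n - 1.
s2 = statistics.variance(values)

print(sigma2, s2)
```

Both functions use the same sum of squared deviations, so `sigma2 * N` and `s2 * (N - 1)` agree.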

2-2 Stem-and-Leaf Diagram Steps for Constructing a Stem-and-Leaf Diagram
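One common way to build the display, assumed here for illustration, is to split each integer observation into a stem (all leading digits) and a leaf (the last digit), list the stems in a column, and write each leaf beside its stem. A minimal Python sketch under that assumption, with hypothetical data and function names:

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers by stem (leading digits); leaf = last digit."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    return dict(groups)


def render(groups):
    """One text row per stem, including stems that have no leaves."""
    rows = []
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(d) for d in groups.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}")
    return rows


data = [76, 87, 97, 101, 105, 110, 115, 118]  # hypothetical observations
display = render(stem_and_leaf(data))
for row in display:
    print(row)
```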

2-2 Stem-and-Leaf Diagram

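Percentiles, quartiles, the median, and the range

For the n = 80 ordered observations:

- Median = Q2 = (40th + 41st ordered values)/2 = 161.5
- Q1: position (n + 1)/4 = 20.25, which lies between the 20th and 21st ordered values, so Q1 = (20th + 21st)/2 = 144
- Q3: position 3(n + 1)/4 = 60.75, which lies between the 60th and 61st ordered values, so Q3 = (60th + 61st)/2 = 181
- IQR = interquartile range = Q3 − Q1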
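The position rule used above can be sketched in Python as follows. The sketch mirrors the slide's convention of averaging the two neighboring ordered values when the position is fractional; other quartile definitions interpolate instead and can give slightly different answers. The function names and the test data (the integers 1 to 80) are hypothetical.

```python
def value_at_position(sorted_xs, position):
    """Value at a 1-based position; a fractional position is resolved by
    averaging the two neighboring ordered values."""
    lower = int(position)  # position is positive, so this is the floor
    if position == lower:
        return sorted_xs[lower - 1]
    return (sorted_xs[lower - 1] + sorted_xs[lower]) / 2


def quartiles(xs):
    """Q1, Q2, Q3 at positions (n+1)/4, (n+1)/2, 3(n+1)/4."""
    s = sorted(xs)
    n = len(s)
    if n < 3:
        raise ValueError("need at least 3 observations")
    q1 = value_at_position(s, (n + 1) / 4)
    q2 = value_at_position(s, (n + 1) / 2)
    q3 = value_at_position(s, 3 * (n + 1) / 4)
    return q1, q2, q3


data = list(range(1, 81))  # hypothetical: 80 values, so positions are 20.25, 40.5, 60.75
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```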

2-2 Stem-and-Leaf Diagram

2-3 Histograms A histogram is a more compact summary of data than a stem-and-leaf diagram. To construct a histogram for continuous data, we must divide the range of the data into intervals, which are usually called class intervals, cells, or bins. If possible, the bins should be of equal width to enhance the visual information in the histogram.
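As an illustration of equal-width bins, the sketch below divides the range of the data into a chosen number of intervals and counts the observations in each. The number of bins, the data, and the function name are assumptions made for the example, not values from the slides.

```python
def histogram_counts(xs, num_bins):
    """Split [min, max] into num_bins equal-width bins and count observations."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("all observations are equal; bins would have zero width")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in xs:
        # The maximum would fall just past the last bin, so clamp it into it.
        index = min(int((x - lo) / width), num_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


data = [70, 85, 90, 110, 120, 150, 155, 190, 230, 250]  # hypothetical
edges, counts = histogram_counts(data, num_bins=3)
print(edges)
print(counts)
```

Each bin here is closed on the left and open on the right, except the last bin, which also includes the maximum.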

2-3 Histograms

An important variation of the histogram is the Pareto chart. This chart is widely used in quality and process improvement studies where the data usually represent different types of defects, failure modes, or other categories of interest to the analyst. The categories are ordered so that the category with the highest frequency is on the left, followed by the category with the second highest frequency, and so forth.
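The ordering behind a Pareto chart can be sketched with a frequency count sorted from most to least common. The cumulative percentage column is a common companion to the bars and is included here as an assumption; the defect categories are hypothetical.

```python
from collections import Counter


def pareto_table(observations):
    """Rows of (category, count, cumulative percent), most frequent first."""
    counts = Counter(observations)
    total = sum(counts.values())
    table = []
    cumulative = 0
    for category, count in counts.most_common():
        cumulative += count
        table.append((category, count, round(100 * cumulative / total, 1)))
    return table


defects = ["scratch"] * 5 + ["dent"] * 3 + ["crack"] * 2  # hypothetical defect log
table = pareto_table(defects)
for row in table:
    print(row)
```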

2-3 Histograms

2-4 Box Plots

The box plot is a graphical display that simultaneously describes several important features of a data set, such as center, spread, departure from symmetry, and identification of observations that lie unusually far from the bulk of the data.

[Figure: box plot with a whisker, an outlier, and an extreme outlier labeled]

2-4 Box Plots

1st quartile = Q1 = 143.5
3rd quartile = Q3 = 181
2nd quartile = median = 161.5

IQR = Q3 − Q1 = 181 − 143.5 = 37.5
1.5 IQR = 56.25
Q3 + 1.5 IQR = 181 + 56.25 = 237.25
Q1 − 1.5 IQR = 143.5 − 56.25 = 87.25
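The whisker limits can be checked with a short Python sketch. It takes Q3 = 181 from the slide and assumes Q1 = 143.5, the value consistent with the lower limit of 87.25. The 3 × IQR limit for extreme outliers is a common textbook convention assumed here, and the function names are hypothetical.

```python
def box_plot_fences(q1, q3):
    """Inner fences at 1.5 * IQR and outer fences at 3 * IQR beyond the box."""
    iqr = q3 - q1
    return {
        "iqr": iqr,
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3 * iqr, q3 + 3 * iqr),  # assumed convention
    }


def classify(x, fences):
    """Label an observation relative to the fences."""
    inner_lo, inner_hi = fences["inner"]
    outer_lo, outer_hi = fences["outer"]
    if x < outer_lo or x > outer_hi:
        return "extreme outlier"
    if x < inner_lo or x > inner_hi:
        return "outlier"
    return "within whiskers"


fences = box_plot_fences(q1=143.5, q3=181)  # Q1 = 143.5 is an assumption
print(fences)
print(classify(76.0, fences))  # 76.0 is the minimum reported in the SAS output
```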

2-4 Box Plots

SAS code and output

/* The data lines that follow CARDS; are not reproduced in the transcript. */
OPTIONS NODATE NOOVP NONUMBER;
DATA STRENGTH;
  INPUT STRENGTH @@;  /* @@ lets one data line hold several values */
  CARDS;
;
PROC UNIVARIATE DATA=STRENGTH PLOT NORMAL FREQ;
  VAR STRENGTH;
  HISTOGRAM STRENGTH / VSCALE=COUNT;
  TITLE 'DESCRIPTIVE STATISTICS AND GRAPHS';
/*
PROC CHART DATA=STRENGTH;
  VBAR STRENGTH;
  VBAR STRENGTH / TYPE=PCT;
  HBAR STRENGTH / TYPE=CPCT DISCRETE;
  TITLE 'HISTOGRAM';
*/
RUN;
QUIT;

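DESCRIPTIVE STATISTICS AND GRAPHS
The UNIVARIATE Procedure
Variable: STRENGTH

(Only the numeric values shown below survive in the transcript.)

Moments
- N = 80, Sum Weights = 80
- Also reported: Mean, Sum Observations, Std Deviation, Variance, Skewness, Kurtosis, Uncorrected SS, Corrected SS, Coeff Variation, Std Error Mean

Basic Statistical Measures
- Location: Mean, Median, Mode
- Variability: Std Deviation, Variance = 1141, Range, Interquartile Range

Tests for Location: Mu0=0
- Student's t: statistic t, Pr > |t| < .0001
- Sign: M = 40, Pr >= |M| < .0001
- Signed Rank: S = 1620, Pr >= |S| < .0001

Tests for Normality
- Shapiro-Wilk: statistic W, p value Pr < W
- Kolmogorov-Smirnov: statistic D, p value Pr > D
- Cramer-von Mises: statistic W-Sq, p value Pr > W-Sq
- Anderson-Darling: statistic A-Sq, p value Pr > A-Sq

Quantiles (Definition 5)
- Levels reported: 100% Max, 99%, 95%, 90%, 75% Q3, 50% Median, 25% Q1, 10%, 5%, 1%, 0% Min
- 0% Min = 76.0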

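SAS code and output

DESCRIPTIVE STATISTICS AND GRAPHS
The UNIVARIATE Procedure
Variable: STRENGTH

Extreme Observations: lowest and highest values, each listed with its observation number (Value, Obs).

Frequency Counts: for each value, the count, the cell percent, and the cumulative percent.

(The numeric entries of both tables are not preserved in the transcript.)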

SAS code and output

DESCRIPTIVE STATISTICS AND GRAPHS
The UNIVARIATE Procedure
Variable: STRENGTH

[Figure: text-mode stem-and-leaf display (columns Stem, Leaf, #) with a box plot alongside; multiply Stem.Leaf by 10**+1]

[Figure: text-mode normal probability plot]

SAS code and output

2-5 Time Series Plots A time series or time sequence is a data set in which the observations are recorded in the order in which they occur. A time series plot is a graph in which the vertical axis denotes the observed value of the variable (say x) and the horizontal axis denotes the time (which could be minutes, days, years, etc.). When measurements are plotted as a time series, we often see trends, cycles, or other broad features of the data.
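To make the axes concrete, the sketch below draws a rough text-mode time series plot: one column per time step in recording order, with higher rows for larger observed values. It is only an illustration using the standard library; the series and the function name are hypothetical, and a real analysis would use a plotting package.

```python
def ascii_time_series(values, height=5):
    """Return text rows (top row = largest values), one column per time step."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for a constant series
    # Map each observation to a row index: 0 = bottom row, height - 1 = top row.
    levels = [round((v - lo) / span * (height - 1)) for v in values]
    return [
        "".join("*" if level == row else " " for level in levels)
        for row in range(height - 1, -1, -1)
    ]


series = [1, 2, 3, 2, 1]  # hypothetical observations in recording order
rows = ascii_time_series(series, height=3)
for line in rows:
    print(line)
```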

2-5 Time Series Plots

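SAS code and output

/* The data lines that follow CARDS; are not reproduced in the transcript. */
OPTIONS NODATE NOOVP NONUMBER LS=80;
DATA STRENGTH;
  INPUT STRENGTH @@;  /* @@ lets one data line hold several values */
  N = _N_;            /* observation number, used as the time index */
  CARDS;
;
SYMBOL INTERPOL=JOIN VALUE=DOT HEIGHT=1 LINE=1;
PROC GPLOT DATA=STRENGTH;
  PLOT STRENGTH*N;
  TITLE 'TIME SERIES GRAPH FOR STRENGTH';
RUN;
QUIT;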

2-6 Multivariate Data The dot diagram, stem-and-leaf diagram, histogram, and box plot are descriptive displays for univariate data; that is, they convey descriptive information about a single variable. Many engineering problems involve collecting and analyzing multivariate data, or data on several different variables. In engineering studies involving multivariate data, often the objective is to determine the relationships among the variables or to build an empirical model.

2-6 Multivariate Data

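Sample Correlation Coefficient

The sample correlation coefficient measures the strength of a linear relationship between two variables. For n pairs $(x_i, y_i)$ with sample means $\bar{x}$ and $\bar{y}$,

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$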

2-6 Multivariate Data

As a rule of thumb, the linear relationship is strong when 0.8 ≤ |r| ≤ 1, weak when 0 ≤ |r| ≤ 0.5, and moderate otherwise.
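A minimal Python sketch of the sample correlation coefficient and the strength guideline follows. It assumes the guideline applies to the magnitude of r, so that negative relationships are judged the same way as positive ones; the paired data and function names are hypothetical.

```python
import math


def sample_correlation(xs, ys):
    """Pearson sample correlation: S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def strength(r):
    """Guideline on |r| (assumption): strong >= 0.8, weak <= 0.5, else moderate."""
    magnitude = abs(r)
    if magnitude >= 0.8:
        return "strong"
    if magnitude <= 0.5:
        return "weak"
    return "moderate"


xs = [1, 2, 3, 4, 5]  # hypothetical paired observations
ys = [2, 4, 5, 4, 5]
r = sample_correlation(xs, ys)
print(round(r, 4), strength(r))
```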

2-6 Multivariate Data

SAS code and output

/* The data lines that follow CARDS; are not reproduced in the transcript. */
OPTIONS NODATE NOOVP NONUMBER LS=80;
DATA SHAMPOO;
  INPUT FOAM SCENT COLOR RESIDUE REGION QUALITY;
  CARDS;
;
PROC CORR DATA=SHAMPOO;
  VAR FOAM SCENT COLOR RESIDUE REGION QUALITY;
  TITLE 'CORRELATIONS OF VARIABLES';
PROC SGSCATTER DATA=SHAMPOO;
  MATRIX FOAM SCENT COLOR RESIDUE REGION QUALITY;
  TITLE 'MATRIX OF SCATTER PLOTS FOR THE SHAMPOO DATA';
SYMBOL INTERPOL=NONE;
PROC GPLOT DATA=SHAMPOO;
  PLOT QUALITY*FOAM=REGION;
  TITLE 'SCATTER PLOT OF SHAMPOO QUALITY VS. FOAM';
RUN;
QUIT;

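SAS code and output

CORRELATIONS OF VARIABLES
The CORR Procedure
6 Variables: FOAM SCENT COLOR RESIDUE REGION QUALITY

Simple Statistics: one row per variable (FOAM, SCENT, COLOR, RESIDUE, REGION, QUALITY) with columns N, Mean, Std Dev, Sum, Minimum, Maximum.

Pearson Correlation Coefficients, N = 24
Prob > |r| under H0: Rho=0
A 6 × 6 matrix with rows and columns FOAM, SCENT, COLOR, RESIDUE, REGION, QUALITY.

(The numeric entries of both tables are not preserved in the transcript.)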