Continuous Statistical Distributions: A Practical Guide for Detection, Description and Sense Making Unit 3.

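Continuous Statistical Distribution
Describes the behavior of a continuous random variable.
Probabilities for a continuous random variable are described by a probability density function (pdf). The pdf gives probability density, not the probability of a particular value: the probability that the variable falls within an interval is the area under the pdf over that interval, and the probability of any single exact value is zero.
Continuous pdfs can be symmetric or asymmetric (skewed).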

Goals
Definition of continuous distributions
Probability density function, cumulative distribution function, descriptive statistics, histograms, probability plots, and mixture distributions
Visualization of data structure with probability plots

Continuous pdf shapes

Descriptive Statistics
Central Tendency
  Mean (arithmetic mean or average)
  Median: observation separating the upper from the lower half (50%) of the data set
  Mode: observation that occurs most frequently in the data set
Dispersion
  Standard deviation
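As a small illustration of these four statistics, the sketch below uses Python's standard `statistics` module. The data values are hypothetical and chosen only so the results are easy to check by hand. The standard deviation here is the sample version with an N-1 denominator, which the slide does not specify.

```python
import statistics

# Hypothetical sample, for illustration only (not from the slides)
data = [2.0, 3.0, 3.0, 5.0, 7.0]

mean = statistics.mean(data)      # arithmetic mean
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value
sd = statistics.stdev(data)       # sample standard deviation (N-1 denominator)

print(mean, median, mode, sd)
```

For a skewed sample like this one, the mean (4.0) lies above the median (3.0).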

Examples of asymmetric (skewed) distributions include: Lognormal, Gamma, Chi-square, Weibull, Exponential, F and Extreme Value

Gaussian probability density function (pdf) and cumulative distribution function (cdf): µ=10; σ=1 (blue), 2 (green), and 3 (red)

Gaussian probability density function (pdf) and cumulative distribution function (cdf): σ=2; µ=10 (blue), 12 (green), and 14 (red)
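The curves on these two slides can be reproduced numerically. The following is a minimal sketch of the Gaussian pdf and cdf using only the standard `math` module. The function names `normal_pdf` and `normal_cdf` are illustrative and not taken from the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian probability density at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    """Gaussian cumulative probability P(X <= x), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Same parameters as the first slide: mu = 10, sigma = 1, 2, 3
for sigma in (1, 2, 3):
    print(sigma, normal_pdf(10, 10, sigma), normal_cdf(10 + sigma, 10, sigma))
```

Increasing σ lowers and widens the pdf peak, while changing µ only shifts the curves along the x-axis.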

Gaussian data: Working with Random Samples (DATA)
Histogram (visualize ‘pdf of data sample’)
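A histogram approximates the pdf when bin counts are divided by (sample size × bin width), so that the bar areas sum to 1. The sketch below assumes a simulated Gaussian sample (µ=10, σ=2, 1000 observations) and 20 equal-width bins. These choices and the helper name `histogram_density` are illustrative, not from the slides.

```python
import random

random.seed(0)  # fixed seed so the simulated sample is reproducible
sample = [random.gauss(10, 2) for _ in range(1000)]  # hypothetical data

def histogram_density(data, n_bins):
    """Return bin edges and density-normalized bar heights."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # The maximum value falls on the last edge; keep it in the last bin
        k = min(int((x - lo) / width), n_bins - 1)
        counts[k] += 1
    edges = [lo + k * width for k in range(n_bins + 1)]
    density = [c / (len(data) * width) for c in counts]
    return edges, density

edges, density = histogram_density(sample, 20)
```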

Gaussian data: Working with Random Samples
Empirical Cumulative Distribution Functions (ECDF)
Bold line: ECDF for all samples, 1000 observations
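An empirical cumulative distribution function can be computed directly from the sorted sample. The sketch below uses the common convention that the i-th smallest of N observations has cumulative fraction i/N. The slides do not state which convention their figures use, so this is an assumption.

```python
def ecdf(data):
    """Return sorted values and their cumulative fractions i/N."""
    xs = sorted(data)
    n = len(xs)
    fractions = [(i + 1) / n for i in range(n)]
    return xs, fractions

# Hypothetical three-point sample, for illustration only
xs, fractions = ecdf([3, 1, 2])
print(list(zip(xs, fractions)))
```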

Gaussian data: Working with Random Samples
Probability Plot: Equal Percentiles re: Hypothetical Distribution

Normal Probability Plot: Equal Percentiles re: Normal (Gaussian) Distribution – IN EXCEL
For the x-axis, sort (or rank) the data sample observations in ascending order (from smallest to largest).
For the y-axis, make a corresponding array of probability values, (i-0.5)/N, where N is the sample size and i=1,2,3,…,N.
Then apply ‘NORMSINV()’ to these probability values. This gives the expected value of each observation under a unit normal (mean=0, sd=1) distribution. ‘NORMINV()’ can be used for other means and sd.
Plot the sorted data (x-axis) versus the y-axis points.
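The same steps can be sketched outside Excel. In the Python version below, `NormalDist().inv_cdf` from the standard `statistics` module plays the role of ‘NORMSINV()’. The function name `normal_plot_points` and the sample values are illustrative only.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (sorted data, unit-normal quantiles) for a normal probability plot."""
    xs = sorted(data)                                  # x-axis: ranked observations
    n = len(xs)
    probs = [(i - 0.5) / n for i in range(1, n + 1)]   # plotting positions (i-0.5)/N
    zs = [NormalDist().inv_cdf(p) for p in probs]      # y-axis: inverse unit-normal cdf
    return xs, zs

# Hypothetical three-point sample, for illustration only
xs, zs = normal_plot_points([5, 1, 3])
print(list(zip(xs, zs)))
```

Plotting `xs` against `zs` as a scatter plot gives the normal probability plot. Points close to a straight line suggest the sample is consistent with a Gaussian distribution.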

Probability Plot: Equal Percentiles re: other distributions – IN EXCEL
For the x-axis, sort (or rank) the data sample observations in ascending order (from smallest to largest).
For the y-axis, construct the probability array (i-0.5)/N, where N is the sample size and i=1,2,3,…,N, then apply the inverse function of the hypothetical distribution to these probabilities:
  Chi-square distribution: ‘CHIINV()’
  Gamma distribution: ‘GAMMAINV()’
  Beta distribution: ‘BETAINV()’
  F distribution: ‘FINV()’
Note that ‘CHIINV()’ and ‘FINV()’ take right-tail probabilities, so pass 1-p rather than p to those two functions.
Make a scatter plot of the corresponding points.

Gaussian data: Working with Random Samples
Probability Plot re: Unit Normal Distribution
Bold line: plot for all samples, 1000 observations
Slope estimates 1/SD
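The "slope estimates 1/SD" remark can be checked numerically. With the sorted data on the x-axis and unit-normal quantiles on the y-axis, a straight-line fit z = a + b·x has slope b ≈ 1/SD and intercept a ≈ -mean/SD. The sketch below assumes a simulated Gaussian sample (µ=10, σ=2) and an ordinary least-squares fit. Both are illustrative choices, not taken from the slides.

```python
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the simulated sample is reproducible
sample = sorted(random.gauss(10, 2) for _ in range(1000))  # hypothetical data
n = len(sample)

# Unit-normal quantiles at plotting positions (i-0.5)/N
z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Ordinary least-squares fit of z (y-axis) on the sorted data (x-axis)
x_bar, z_bar = mean(sample), mean(z)
sxz = sum((x - x_bar) * (zi - z_bar) for x, zi in zip(sample, z))
sxx = sum((x - x_bar) ** 2 for x in sample)
slope = sxz / sxx
intercept = z_bar - slope * x_bar

sd_est = 1 / slope             # slope estimates 1/SD
mean_est = -intercept / slope  # the fitted line crosses z = 0 at the mean
print(sd_est, mean_est)
```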

Working with Random Samples (DATA)
Histogram (visualize ‘pdf of data sample’)

Gaussian data: Working with Random Samples
Empirical Cumulative Distribution Functions

Working with Random Samples
Probability Plot re: Unit Normal Distribution

Hazard Plots – IN EXCEL
For the x-axis, sort (or rank) the data sample observations in ascending order (from smallest to largest).
For the y-axis, calculate the ‘Cumulative Hazard’:
  For each observation, enter 1/(reverse rank order).
  For the smallest of N observations, enter 1/N.
  For the second smallest, enter 1/(N-1), and so on.
  The Cumulative Hazard is the cumulative sum of these values for each observation. E.g., for the third smallest observation, the cumulative hazard is 1/N+1/(N-1)+1/(N-2).
Make a scatter plot of the corresponding points.
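The cumulative hazard calculation above can be sketched in a few lines of Python. The function name `cumulative_hazard` and the sample values are illustrative, and the sketch assumes a complete sample with no censored observations.

```python
from itertools import accumulate

def cumulative_hazard(data):
    """Return (sorted data, cumulative hazard) for a hazard plot."""
    xs = sorted(data)
    n = len(xs)
    # Reverse rank of the i-th smallest (0-based i) is n - i,
    # so the hazard increments are 1/N, 1/(N-1), ..., 1/1
    hazards = [1 / (n - i) for i in range(n)]
    return xs, list(accumulate(hazards))

# Hypothetical three-point sample, for illustration only
xs, cum_hazard = cumulative_hazard([4, 2, 9])
print(list(zip(xs, cum_hazard)))
```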

Working with Random Samples
Probability Plot re: Cumulative Hazard (unit exponential distribution)

Sample Probability-Probability (P-P) and Quantile-Quantile (Q-Q) Plots: Scatter Plot of Equal Percentiles or Quantiles of Two Samples – IN EXCEL
For the x-axis, sort (or rank) the first data sample observations in ascending order (from smallest to largest).
For the y-axis, sort (or rank) the second data sample observations in ascending order.
Make a scatter plot of the corresponding points (pairing by rank requires the two samples to have the same number of observations).
If the samples are from the same distribution, the plot is approximately linear, with points falling near the line y = x.
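For two samples of equal size, the Q-Q pairing amounts to sorting each sample and matching observations by rank. The sketch below shows this. The function name `qq_points` and the data are hypothetical, and samples of unequal size would need interpolated quantiles, which this sketch does not attempt.

```python
def qq_points(first, second):
    """Pair the sorted observations of two equal-sized samples by rank."""
    if len(first) != len(second):
        raise ValueError("this simple pairing needs samples of equal size")
    return list(zip(sorted(first), sorted(second)))

# Hypothetical samples, for illustration only
points = qq_points([3, 1, 2], [30, 10, 20])
print(points)
```

In this example the second sample is a rescaled copy of the first, so the points lie on a straight line whose slope differs from 1. That pattern indicates the same distribution shape with a different scale.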

Working with Random Samples
Probability Plots: Are they identically distributed?

Working with Random Samples
Probability Plot re: Cumulative Hazard (unit exponential distribution)

Mixture Distributions
[Series of figure slides: component distributions are added (+) to give the mixture (=)]
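The "+ =" construction on these slides can be illustrated with a weighted sum of component pdfs. The sketch below assumes a two-component Gaussian mixture with weights that sum to 1. The weights, means and standard deviations are hypothetical and are not read from the slides.

```python
from statistics import NormalDist

def mixture_pdf(x, components):
    """pdf of a mixture: weighted sum of Gaussian component pdfs.

    components is a list of (weight, mean, sd) tuples whose weights sum to 1.
    """
    return sum(w * NormalDist(mu, sd).pdf(x) for w, mu, sd in components)

# Hypothetical components, for illustration only
components = [(0.6, 10.0, 1.0), (0.4, 15.0, 2.0)]

# Check numerically that the mixture pdf still integrates to about 1
step = 0.01
area = sum(mixture_pdf(k * step, components) * step for k in range(3001))
print(area)
```

Because the weights sum to 1, the mixture is itself a valid pdf. With well-separated components it appears as a two-peaked histogram or as a bend in a probability plot.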

Call Center Data: Call Frequency
Mean ± S.D.: 10:09 hr ± 9 min; 14:58 hr ± 34 min
10:09 hr ± 9 min; 10:04 hr ± 11 min; 14:58 hr ± 34 min; 14:58 hr ± 15 min

Call Center Data: Interval Between Calls

Call Center Data: Call Service Times

Goals
Definition of continuous distributions
Probability density function, cumulative distribution function, descriptive statistics, histograms, probability plots, and mixture distributions
Visualization of data structure with probability plots