

Descriptive Statistics: Tabular and Graphical Displays
- Frequency Distribution - List of intervals of values for a variable, and the number of occurrences per interval
- Relative Frequency - Proportion (often reported as a percentage) of observations falling in the interval
- Histogram/Bar Chart - Graphical representation of a relative frequency distribution
- Stem and Leaf Plot - Horizontal tabular display of data, based on 2 digits (stem/leaf)
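The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `frequency_table`, the interval width, and the exam scores are all invented for the example. It counts observations per interval, converts counts to relative frequencies, and prints a crude text histogram.

```python
from collections import Counter

def frequency_table(values, width, start):
    """Return (lower, upper, count, relative_frequency) rows for equal-width intervals.

    Hypothetical helper: intervals are [start + k*width, start + (k+1)*width).
    """
    # Map each value to the index k of the interval it falls in.
    counts = Counter(int((v - start) // width) for v in values)
    n = len(values)
    rows = []
    for k in range(min(counts), max(counts) + 1):
        lower = start + k * width
        count = counts.get(k, 0)
        rows.append((lower, lower + width, count, count / n))
    return rows

# Hypothetical exam scores, invented for illustration only.
scores = [52, 57, 61, 64, 68, 70, 71, 73, 75, 78, 79, 82, 85, 88, 94]

rows = frequency_table(scores, width=10, start=50)
for lower, upper, count, rel in rows:
    # One asterisk per observation gives a sideways histogram.
    print(f"[{lower}, {upper}): {count:2d}  {rel:6.1%}  {'*' * count}")
```

The relative frequencies always sum to 1, which is a quick sanity check on any such table.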

Comparing Groups
- Side-by-side bar charts
- 3-dimensional histograms
- Back-to-back stem and leaf plots
Goal: Compare 2 (or more) groups with respect to the variable(s) being measured. Do measurements tend to differ among groups?

Sample & Population Distributions
Distributions of Samples and Populations - As samples get larger, the sample distribution gets smoother and looks more like the population distribution
- U-shaped - Measurements tend to be large or small, fewer in middle range of values
- Bell-shaped - Measurements tend to cluster around the middle with few extremes (symmetric)
- Skewed Right - Few extreme large values
- Skewed Left - Few extreme small values

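Measures of Central Tendency
Mean - Sum of all measurements divided by the number of observations (the value each case would have if the total were distributed evenly among cases). Can be highly influenced by extreme values.
Notation: Sample measurements are labeled $Y_1, \ldots, Y_n$, and the sample mean is
$$\bar{Y} = \frac{Y_1 + \cdots + Y_n}{n} = \frac{\sum_{i=1}^{n} Y_i}{n}$$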
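A small sketch of the mean and its sensitivity to extreme values. The income figures are invented for illustration and do not come from the slides.

```python
def mean(values):
    """Sample mean: sum of the measurements divided by the number of observations."""
    return sum(values) / len(values)

# Hypothetical incomes in $1000s, invented for illustration only.
incomes = [30, 32, 35, 38, 40]
with_extreme = incomes + [400]  # add one extreme value

print(mean(incomes))       # 35.0
print(mean(with_extreme))  # roughly 95.8, far above five of the six values
```

A single extreme observation moves the mean well outside the range where most of the data lie.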

Median, Percentiles, Mode
- Median - Middle measurement after data have been ordered from smallest to largest. Appropriate for interval and ordinal scales
- Pth percentile - Value where P% of measurements fall below and (100-P)% lie above. Lower quartile (25th), Median (50th), Upper quartile (75th) often reported
- Mode - Most frequently occurring outcome. Typically reported for ordinal and nominal data.
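These three summaries are available in Python's standard `statistics` module. The sketch below uses invented data. Note that textbooks and software differ in how they interpolate percentiles, so the quartiles from `statistics.quantiles` (default "exclusive" method) may not match a hand calculation that uses another convention.

```python
from statistics import median, mode, quantiles

# Hypothetical measurements, invented for illustration only.
data = [2, 3, 3, 5, 7, 8, 9, 11, 12]

med = median(data)                        # middle value of the ordered data
q1, q2, q3 = quantiles(data, n=4)         # 25th, 50th, 75th percentiles
most_common = mode(data)                  # most frequently occurring value

print(f"median={med}, quartiles=({q1}, {q2}, {q3}), mode={most_common}")

# The mode also applies to nominal data, where a mean or median is undefined.
blood_types = ["A", "O", "B", "O", "AB", "O", "A"]  # hypothetical
print(mode(blood_types))
```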

Measures of Variation
Measures of how similar or different individuals' measurements are
- Range - Largest minus smallest observation
- Deviation - Difference between the ith individual's outcome and the sample mean: $Y_i - \bar{Y}$
- Variance of n observations $Y_1, \ldots, Y_n$ is the "average" squared deviation:
$$s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1}$$

Measures of Variation
Standard Deviation - Positive square root of the variance (measured in original units):
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1}}$$
Properties of the standard deviation:
- $s \ge 0$, and only equals 0 if all observations are equal
- s increases with the amount of variation around the mean
- Division by n-1 (not n) is due to technical reasons (later)
- s depends on the units of the data (e.g. $1000s vs $)
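A minimal sketch of the sample variance and standard deviation with the n-1 divisor, checking the properties listed above. The function names and data are invented for illustration.

```python
import math

def sample_variance(values):
    """'Average' squared deviation from the sample mean, dividing by n - 1."""
    n = len(values)
    ybar = sum(values) / n
    return sum((y - ybar) ** 2 for y in values) / (n - 1)

def sample_sd(values):
    """Positive square root of the sample variance, in the data's original units."""
    return math.sqrt(sample_variance(values))

# Hypothetical measurements in $1000s, invented for illustration only.
y = [4, 8, 6, 5, 7]
print(sample_variance(y))  # 2.5  (squared deviations 4+4+0+1+1 = 10, divided by 4)
print(sample_sd(y))        # about 1.58

# s equals 0 only when all observations are equal.
print(sample_sd([3, 3, 3]))  # 0.0

# s depends on units: the same data in $ rather than $1000s gives s 1000 times larger.
in_dollars = [1000 * v for v in y]
print(sample_sd(in_dollars))
```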

Empirical Rule
If the histogram of the data is approximately bell-shaped, then:
- Approximately 68% of measurements lie within 1 standard deviation of the mean.
- Approximately 95% of measurements lie within 2 standard deviations of the mean.
- Virtually all of the measurements lie within 3 standard deviations of the mean.
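The percentages in the rule can be compared with an exactly normal (bell-shaped) distribution. This sketch assumes a perfect normal curve, which real data only approximate; `normal_coverage` is a hypothetical helper name.

```python
from statistics import NormalDist

def normal_coverage(k):
    """Proportion of a normal distribution lying within k standard deviations of its mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {normal_coverage(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

For data that are only roughly bell-shaped, the observed proportions will differ somewhat from these values, which is why the rule is stated with "approximately".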

Other Measures and Plots
- Interquartile Range (IQR) - 75th percentile minus 25th percentile (measures the spread in the middle 50% of data)
- Box Plots - Display a box containing middle 50% of measurements with line at median and lines extending from box. Breaks data into four quartiles
- Outliers - Observations falling more than 1.5 × IQR above (below) the upper (lower) quartile
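The IQR and the 1.5 × IQR outlier rule can be sketched as follows. The data and the function name `iqr_outliers` are invented for illustration, and the quartiles use the default interpolation of `statistics.quantiles`, which may differ slightly from other textbook conventions.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return (q1, q3, iqr, flagged) using the 1.5 * IQR rule."""
    q1, _, q3 = quantiles(values, n=4)   # lower quartile, median, upper quartile
    iqr = q3 - q1                        # spread of the middle 50% of the data
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    flagged = [v for v in values if v < lower_fence or v > upper_fence]
    return q1, q3, iqr, flagged

# Hypothetical measurements with one extreme value, invented for illustration only.
data = [2, 3, 3, 5, 7, 8, 9, 11, 12, 40]

q1, q3, iqr, flagged = iqr_outliers(data)
print(f"Q1={q1}, Q3={q3}, IQR={iqr}, outliers={flagged}")
```

In a box plot, the fences determine how far the lines extend from the box; flagged points are typically drawn individually.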

Dependent and Independent Variables
- Dependent variables are outcomes of interest to investigators. Also referred to as Responses or Endpoints
- Independent variables are Factors that are often hypothesized to affect the outcomes (levels of dependent variables). Also referred to as Predictor or Explanatory Variables
Research questions: Does I.V. → D.V.?

Example - Clinical Trials of Cialis
Clinical trials conducted worldwide to study efficacy and safety of Cialis (Tadalafil) for ED
Patients randomized to Placebo, 10mg, and 20mg
Co-Primary outcomes:
- Change from baseline in the erectile function domain of the International Index of Erectile Function (Numeric)
- Response to: “Were you able to insert your P… into your partner’s V…?” (Nominal: Yes/No)
- Response to: “Did your erection last long enough for you to have successful intercourse?” (Nominal: Yes/No)
Source: Carson, et al. (2004).

Example - Clinical Trials of Cialis
- Population: All adult males suffering from erectile dysfunction
- Sample: 2102 men with mild-to-severe ED in 11 randomized clinical trials
- Dependent Variable(s): Co-primary outcomes listed on previous slide
- Independent Variable: Cialis Dose (0, 10, 20 mg)
- Research Question: Does use of Cialis improve erectile function?

Sample Statistics/Population Parameters
- Sample Mean and Standard Deviation are the most commonly reported summaries of sample data. They are random variables since they will change from one sample to another.
- Population Mean ($\mu$) and Standard Deviation ($\sigma$) computed from a population of measurements are fixed (unknown in practice) values called parameters.
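The distinction can be illustrated by simulation. Everything here is an assumption made for the sketch: the population is artificially generated (10,000 values from a normal distribution with mean 100 and standard deviation 15), the sample size is 25, and the seed is fixed only so the run is repeatable. The population parameters are computed once and stay fixed, while the sample statistics change from sample to sample.

```python
import random
from statistics import mean, pstdev, stdev

rng = random.Random(0)  # fixed seed, chosen arbitrarily, for repeatability

# Hypothetical population, generated artificially for illustration only.
population = [rng.gauss(100, 15) for _ in range(10_000)]

# Parameters: computed from the whole population, so they are fixed values.
mu = mean(population)
sigma = pstdev(population)  # population SD divides by N, not N - 1
print(f"population: mu={mu:.2f}, sigma={sigma:.2f}")

# Statistics: computed from samples, so they vary from one sample to another.
sample_means = []
for i in range(5):
    sample = rng.sample(population, 25)
    sample_means.append(mean(sample))
    print(f"sample {i + 1}: mean={mean(sample):.2f}, sd={stdev(sample):.2f}")
```

In practice only one sample is observed and the parameters are unknown, so the sample mean and standard deviation serve as estimates of $\mu$ and $\sigma$.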