Lecture 1: Basic Descriptive Statistics 1.Types of Biological Data 2.Summary Descriptive Statistics Measures of Central Tendency Measures of Dispersion.

Slides:



Advertisements
Similar presentations
Bios 101 Lecture 4: Descriptive Statistics Shankar Viswanathan, DrPH. Division of Biostatistics Department of Epidemiology and Population Health Albert.
Advertisements

Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Review of Basics. REVIEW OF BASICS PART I Measurement Descriptive Statistics Frequency Distributions.
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Review of Basics. REVIEW OF BASICS PART I Measurement Descriptive Statistics Frequency Distributions.
BHS Methods in Behavioral Sciences I April 18, 2003 Chapter 4 (Ray) – Descriptive Statistics.
Some Basic Concepts Schaum's Outline of Elements of Statistics I: Descriptive Statistics & Probability Chuck Tappert and Allen Stix School of Computer.
Descriptive Statistics Chapter 3 Numerical Scales Nominal scale-Uses numbers for identification (student ID numbers) Ordinal scale- Uses numbers for.
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-1 Statistics for Business and Economics 7 th Edition Chapter 2 Describing Data:
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Intro to Descriptive Statistics
B a c kn e x t h o m e Classification of Variables Discrete Numerical Variable A variable that produces a response that comes from a counting process.
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 3-1 Introduction to Statistics Chapter 3 Using Statistics to summarize.
Very Basic Statistics.
Chap 3-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 3 Describing Data: Numerical Statistics for Business and Economics.
Thomas Songer, PhD with acknowledgment to several slides provided by M Rahbar and Moataza Mahmoud Abdel Wahab Introduction to Research Methods In the Internet.
Levels of Measurement Nominal measurement Involves assigning numbers to classify characteristics into categories Ordinal measurement Involves sorting objects.
Descriptive Statistics Healey Chapters 3 and 4 (1e) or Ch. 3 (2/3e)
Lecture 4 Dustin Lueker.  The population distribution for a continuous variable is usually represented by a smooth curve ◦ Like a histogram that gets.
Describing Data: Numerical
Chapter 13 Section 5 - Slide 1 Copyright © 2009 Pearson Education, Inc. AND.
6 - 1 Basic Univariate Statistics Chapter Basic Statistics A statistic is a number, computed from sample data, such as a mean or variance. The.
@ 2012 Wadsworth, Cengage Learning Chapter 5 Description of Behavior Through Numerical 2012 Wadsworth, Cengage Learning.
Descriptive Statistics Used to describe the basic features of the data in any quantitative study. Both graphical displays and descriptive summary statistics.
With Statistics Workshop with Statistics Workshop FunFunFunFun.
Chapter 3: Central Tendency. Central Tendency In general terms, central tendency is a statistical measure that determines a single value that accurately.
© 2006 McGraw-Hill Higher Education. All rights reserved. Numbers Numbers mean different things in different situations. Consider three answers that appear.
QBM117 Business Statistics Descriptive Statistics Numerical Descriptive Measures.
Statistics and Quantitative Analysis U4320 Segment 2: Descriptive Statistics Prof. Sharyn O’Halloran.
Introduction to Descriptive Statistics Objectives: 1.Explain the general role of statistics in assessment & evaluation 2.Explain three methods for describing.
STAT 280: Elementary Applied Statistics Describing Data Using Numerical Measures.
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter.
Psyc 235: Introduction to Statistics Lecture Format New Content/Conceptual Info Questions & Work through problems.
Chapter 2 Describing Data.
© 2006 McGraw-Hill Higher Education. All rights reserved. Numbers Numbers mean different things in different situations. Consider three answers that appear.
Descriptive Statistics becoming familiar with the data.
Biostatistics Class 1 1/25/2000 Introduction Descriptive Statistics.
Lecture 3 Describing Data Using Numerical Measures.
Chap 3-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 3 Describing Data Using Numerical.
Measures of Central Tendency: The Mean, Median, and Mode
Topics for our first Seminar The readings are Chapters 1 and 2 of your textbook. Chapter 1 contains a lot of terminology with which you should be familiar.
Chapter Eight: Using Statistics to Answer Questions.
Unit 2 (F): Statistics in Psychological Research: Measures of Central Tendency Mr. Debes A.P. Psychology.
Lecture 4 Dustin Lueker.  The population distribution for a continuous variable is usually represented by a smooth curve ◦ Like a histogram that gets.
BASIC STATISTICAL CONCEPTS Chapter Three. CHAPTER OBJECTIVES Scales of Measurement Measures of central tendency (mean, median, mode) Frequency distribution.
IE(DS)1 Descriptive Statistics Data - Quantitative observation of Behavior What do numbers mean? If we call one thing 1 and another thing 2 what do we.
Measure of Central Tendency and Spread of Data Notes.
Descriptive Statistics for one Variable. Variables and measurements A variable is a characteristic of an individual or object in which the researcher.
CHAPTER 2: Basic Summary Statistics
Measurements and Their Analysis. Introduction Note that in this chapter, we are talking about multiple measurements of the same quantity Numerical analysis.
Measurements Statistics WEEK 6. Lesson Objectives Review Descriptive / Survey Level of measurements Descriptive Statistics.
Welcome to… The Exciting World of Descriptive Statistics in Educational Assessment!
Chapter 6: Descriptive Statistics. Learning Objectives Describe statistical measures used in descriptive statistics Compute measures of central tendency.
Descriptive Statistics Printing information at: Class website:
Describing Data: Summary Measures. Identifying the Scale of Measurement Before you analyze the data, identify the measurement scale for each variable.
Statistical Methods Michael J. Watts
Measurements Statistics
Populations.
Business and Economics 6th Edition
Statistical Methods Michael J. Watts
Chapter 3 Describing Data Using Numerical Measures
Central Tendency and Variability
Descriptive Statistics
Chapter 3 Describing Data Using Numerical Measures
Descriptive Statistics
STA 291 Spring 2008 Lecture 5 Dustin Lueker.
STA 291 Spring 2008 Lecture 5 Dustin Lueker.
Research Day 3.
Descriptive Statistics
CHAPTER 2: Basic Summary Statistics
Presentation transcript:

Lecture 1: Basic Descriptive Statistics 1.Types of Biological Data 2.Summary Descriptive Statistics Measures of Central Tendency Measures of Dispersion 3.Assignments 1.Types of Biological Data 2.Summary Descriptive Statistics Measures of Central Tendency Measures of Dispersion 3.Assignments

Scales of Measurement: General Comments Any observation or experiment in biology involves the collection of information (observe plants) Empirical observations become statistical data once they are cast as some type of measurement (plant height) Measurement is the assignment of numbers to objects or events according to rules (measure plant 1, plant 2, … ) Different rules lead to different kinds of scales of measurement A dataset can thus be classified according to the type of scale by which it is measured – Different scales admit different permissible statistics (see table summary) Any observation or experiment in biology involves the collection of information (observe plants) Empirical observations become statistical data once they are cast as some type of measurement (plant height) Measurement is the assignment of numbers to objects or events according to rules (measure plant 1, plant 2, … ) Different rules lead to different kinds of scales of measurement A dataset can thus be classified according to the type of scale by which it is measured – Different scales admit different permissible statistics (see table summary) 1. Types of Biological Data

Scales of Measurement: General Comments Observation Measurement Rule 1 Measurement Scale 1 Data Type 1 Measurement Rule 2 Measurement Scale 2 Data Type 2 Measurement Rule 3 Measurement Scale 3 Data Type 3 Measurement Rule 4 Measurement Scale 4 Data Type 4 1. Types of Biological Data

Scales of Measurement: NOIR Data on a Nominal Scale – A nominal scale assigns numbers as mere labels or types- words or letters would work just as well – Example: numbers on jerseys that serve to identify athletic participants – Example: rocks can be classified as igneous, sedimentary, and metamorphic Data on an Ordinal Scale – An ordinal scale assigns numbers according to some rank ordering – Example: order in which participant’s finish a race (1 st, 2 nd, … ) Data on a Nominal Scale – A nominal scale assigns numbers as mere labels or types- words or letters would work just as well – Example: numbers on jerseys that serve to identify athletic participants – Example: rocks can be classified as igneous, sedimentary, and metamorphic Data on an Ordinal Scale – An ordinal scale assigns numbers according to some rank ordering – Example: order in which participant’s finish a race (1 st, 2 nd, … ) 1. Types of Biological Data

Scales of Measurement: NOIR Data on an Interval Scale – An interval scale assigns numbers according to some rank ordering and assigns the size of intervals in between data (but has no true zero point) – Example: The temperature scales of degrees Celsius and degrees Fahrenheit are interval scales The amount of temperature change from 27° C to 32° C is the same as the temperature change from 104° C to 109° C The choices for 0° C and 0° F are arbitrary; that is, it makes no sense to say that 98° F is twice as hot as 49° F Data on a Ratio Scale – A ratio scale is an interval scale with a true zero point. – Example: A participant’s finish time for a race A finish time of 25 seconds is better than 50 seconds (order) and is, indeed, twice as fast (true zero) – Example: The Kelvin temperature scale (has an absolute zero) Data on an Interval Scale – An interval scale assigns numbers according to some rank ordering and assigns the size of intervals in between data (but has no true zero point) – Example: The temperature scales of degrees Celsius and degrees Fahrenheit are interval scales The amount of temperature change from 27° C to 32° C is the same as the temperature change from 104° C to 109° C The choices for 0° C and 0° F are arbitrary; that is, it makes no sense to say that 98° F is twice as hot as 49° F Data on a Ratio Scale – A ratio scale is an interval scale with a true zero point. – Example: A participant’s finish time for a race A finish time of 25 seconds is better than 50 seconds (order) and is, indeed, twice as fast (true zero) – Example: The Kelvin temperature scale (has an absolute zero) 1. Types of Biological Data

Discrete vs. Continuous Measurements may take on discrete or continuous values: A set of values is discrete if it is countable – Set of possible number of arms on a starfish – Set of possible number of leaves on a plant – Set of possible number of granules of sand on a beach A set of values is continuous if it is uncountable – Set of possible weights of starfish – Set of possible surface areas for leaves – Set of possible amounts of time spent counting sand granules Measurements may take on discrete or continuous values: A set of values is discrete if it is countable – Set of possible number of arms on a starfish – Set of possible number of leaves on a plant – Set of possible number of granules of sand on a beach A set of values is continuous if it is uncountable – Set of possible weights of starfish – Set of possible surface areas for leaves – Set of possible amounts of time spent counting sand granules 1. Types of Biological Data

Summary: Organizational Chart Data Types Non-Metric NominalOrdinal Metric IntervalRatio DiscreteContinuous 1. Types of Biological Data

Overview When a dataset is summarized by its statistical information, there is a loss of information. That is, given the summary statistics, there is no way to recover the original data. Basic summary statistics may be grouped as: 1.measures of central tendency (giving in some sense the central value of a data set) 2.measures of dispersion (giving a measure of how spread out that data set is) When a dataset is summarized by its statistical information, there is a loss of information. That is, given the summary statistics, there is no way to recover the original data. Basic summary statistics may be grouped as: 1.measures of central tendency (giving in some sense the central value of a data set) 2.measures of dispersion (giving a measure of how spread out that data set is) 2. Summary Descriptive Statistics of Datasets

Measures of Central Tendency Arithmetic Mean Dataset: Average: Example: This statistic doesn’t make sense for data on nominal or ordinal scales: jersey numbers, top ten list Arithmetic Mean Dataset: Average: Example: This statistic doesn’t make sense for data on nominal or ordinal scales: jersey numbers, top ten list 2. Summary Descriptive Statistics of Datasets

Measures of Central Tendency Median: half the dataset fall below this value; half above Dataset: Median: This statistic doesn’t make sense for data on nominal scales: jersey numbers The median is less effected by outliers than the mean; in this case the mean is approximately 167,000 Median: half the dataset fall below this value; half above Dataset: Median: This statistic doesn’t make sense for data on nominal scales: jersey numbers The median is less effected by outliers than the mean; in this case the mean is approximately 167, Summary Descriptive Statistics of Datasets

Measures of Central Tendency Mode: The mode is the most frequently occurring value (or values - there may be more than one) in a data set Dataset: Mode: This statistic is meaningful for all scales Mode: The mode is the most frequently occurring value (or values - there may be more than one) in a data set Dataset: Mode: This statistic is meaningful for all scales 2. Summary Descriptive Statistics of Datasets

Measures of Central Tendency Midrange: The midrange is the value halfway between the largest and smallest values in the data set Dataset: Midrange: This statistic doesn’t make sense for data on nominal or ordinal scales: jersey numbers, top ten list Midrange: The midrange is the value halfway between the largest and smallest values in the data set Dataset: Midrange: This statistic doesn’t make sense for data on nominal or ordinal scales: jersey numbers, top ten list 2. Summary Descriptive Statistics of Datasets

Measures of Central Tendency Geometric Mean: The geometric mean of a set of n data is the n th root of the product of the n data values, Dataset: Geometric Mean: The geometric mean arises as an appropriate estimate of growth rates of a population when the growth rates vary through time or space It is always less than or equal to the arithmetic mean Geometric Mean: The geometric mean of a set of n data is the n th root of the product of the n data values, Dataset: Geometric Mean: The geometric mean arises as an appropriate estimate of growth rates of a population when the growth rates vary through time or space It is always less than or equal to the arithmetic mean 2. Summary Descriptive Statistics of Datasets

Measures of Dispersion Range Dataset: Range: Variance: the mean sum of the squares of the deviations of the data from the arithmetic mean – The “best” estimate of this (take a good statistics class to find out how “best” is defined) is the sample variance: Standard Deviation: Range Dataset: Range: Variance: the mean sum of the squares of the deviations of the data from the arithmetic mean – The “best” estimate of this (take a good statistics class to find out how “best” is defined) is the sample variance: Standard Deviation: 2. Summary Descriptive Statistics of Datasets

Table Summary Permissible Statistic/Opera tion NominalOrdinalIntervalRatio Mode ✓✓✓✓ Median ✗✓✓✓ Addition, Mean, Variance ✗✗✓✓ Multiplication, Ratio ✗✗✗✓ 2. Summary Descriptive Statistics of Datasets

Homework, MATLAB 1.Homework: Chapter 1 Exercises Download MATLAB as soon as possible. We will begin working with MATLAB in class next Thursday. 1.Homework: Chapter 1 Exercises Download MATLAB as soon as possible. We will begin working with MATLAB in class next Thursday. 3. Assignments

1.1 Exercise capacity (in seconds) was determined for each of 11 patients who were being treated for chronic heart failure: 906, 1320, 711, 1170, 684, 1200, 837, 1056, 897, 882, 1008 (a)Determine the mean and the median of the data. Solution: To find the median, we first order the data: 684, 711, 837, 882, 897, 906, 1008, 1056, 1170, 1200, 1320 Since there are eleven (an odd number) data points, the median will be the 6 th data point. That is, the median is 906. Exercise capacity (in seconds) was determined for each of 11 patients who were being treated for chronic heart failure: 906, 1320, 711, 1170, 684, 1200, 837, 1056, 897, 882, 1008 (a)Determine the mean and the median of the data. Solution: To find the median, we first order the data: 684, 711, 837, 882, 897, 906, 1008, 1056, 1170, 1200, 1320 Since there are eleven (an odd number) data points, the median will be the 6 th data point. That is, the median is 906. Homework

1.2 Daily crude oil output (in millions of barrels) is shown below for the years 1971 to Compute the mean, median, and mode for the data. Solution: Let’s use MATLAB to solve this problem. Daily crude oil output (in millions of barrels) is shown below for the years 1971 to Compute the mean, median, and mode for the data. Solution: Let’s use MATLAB to solve this problem. Homework

1.2 Homework

1.4 Ten hospital employees on a standard American diet agreed to adopt a vegetarian diet for one month. Below is the change in the serum cholesterol level (before - after). 49, −10, 27, 13, 36, 19, 48, 21, 8, 16 (a)Compute the median and mean change in cholesterol. (b)Compute the range, variance and standard deviation of the data. Is the data fairly spread out or close together? Solution: Again we use MATLAB. Ten hospital employees on a standard American diet agreed to adopt a vegetarian diet for one month. Below is the change in the serum cholesterol level (before - after). 49, −10, 27, 13, 36, 19, 48, 21, 8, 16 (a)Compute the median and mean change in cholesterol. (b)Compute the range, variance and standard deviation of the data. Is the data fairly spread out or close together? Solution: Again we use MATLAB. Homework

1.4 Homework

1.4 In order to study for the quiz, we now do these by hand. First we rewrite the dataset in numerical order: -10, 8, 13, 16, 19, 21, 27, 36, 48, 49 Since there are ten (an even number) data points, the median will be halfway between 19 and 21. That is, the median is 20. Finding the variance is more work: In order to study for the quiz, we now do these by hand. First we rewrite the dataset in numerical order: -10, 8, 13, 16, 19, 21, 27, 36, 48, 49 Since there are ten (an even number) data points, the median will be halfway between 19 and 21. That is, the median is 20. Finding the variance is more work: Homework

1.5 Twelve sheep were fed pingue as a part of an experiment and died as a result. The time of death in hours after the administering of pingue for each sheep follows: Compute the range, variance and standard deviation of the sample. Answer: range: 96 variance: std: Twelve sheep were fed pingue as a part of an experiment and died as a result. The time of death in hours after the administering of pingue for each sheep follows: Compute the range, variance and standard deviation of the sample. Answer: range: 96 variance: std: Homework