Agricultural and Biological Statistics

Summarizing Data Chapter 2

Summarizing Data
Data are observations on one or more variables selected from a population (or a sample).

Summarizing Data
Qualitative variables: variables that express attributes about a sample or population.
Quantitative variables: variables that are the result of measurement or counting.

Summarizing Data
Qualitative variables give rise to nominal or ordinal data.
Quantitative variables give rise to interval or ratio data.

Summarizing Data
It is also important to categorize observations by their source.
Primary data: collected by means of an experiment or survey.
Secondary data: acquired from a source that did not collect the data, even though that source may have published them.

Summarizing Data
Data collection allows us to make informed decisions about the problem at hand. Manipulation and statistical analysis of the data allow for improved decision making.
Generally speaking, statistical data are collected in random order. Since there is no real order to the data, it is difficult to obtain any valuable information by inspection.
Data: 3, 1, 7, 22, 9, 10, 4, 17, 19

Definition of Array
Array: the data reordered from the smallest to the largest value.
Data: 3, 1, 7, 22, 9, 10, 4, 17, 19
Arrayed data: 1, 3, 4, 7, 9, 10, 17, 19, 22

Measures of Dispersion
Range: computed by subtracting the smallest observation from the largest.
Range: 22 - 1 = 21
An array also indicates something about the distribution of the units between the two extremes and their tendency to cluster toward some central value.
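
As a minimal sketch, the array and range for the slide's data can be reproduced in Python. The variable names are illustrative and not part of the original slides.

```python
# Data from the slide, in the order collected.
data = [3, 1, 7, 22, 9, 10, 4, 17, 19]

# An array is the data reordered from smallest to largest.
arrayed = sorted(data)

# Range = largest observation minus smallest observation.
data_range = arrayed[-1] - arrayed[0]

print(arrayed)     # [1, 3, 4, 7, 9, 10, 17, 19, 22]
print(data_range)  # 21
```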

Measures of Dispersion
Data can be further summarized in the form of a frequency distribution.
A number of classes is chosen (normally 5 to 15).
The distribution has the classes on the vertical axis and the frequencies on the horizontal axis.

Measures of Dispersion
Cotton yield classes (as shown): 215 to 235, 235 to 255, 255 to 275
Number of farms: 4, 6, 13, 21, 15, 7, 5
Classes do not have to be of equal width (e.g. income).
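
The following sketch builds a frequency distribution from the nine observations used earlier in the slides. The class width of 5 and the lower limit of 0 are assumptions chosen for illustration; the slides do not specify classes for this data set.

```python
# Data from the earlier slide.
data = [3, 1, 7, 22, 9, 10, 4, 17, 19]

class_width = 5   # assumed for illustration
lower_limit = 0   # assumed for illustration

# Count how many observations fall in each class [start, start + width).
counts = {}
for x in data:
    start = lower_limit + ((x - lower_limit) // class_width) * class_width
    counts[start] = counts.get(start, 0) + 1

# Print the classes in order with their frequencies.
for start in sorted(counts):
    print(f"{start} to under {start + class_width}: {counts[start]}")
```

With these assumed limits the data fall into five classes, which is at the low end of the 5 to 15 classes the slide recommends.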

Measures of Dispersion
Histogram: a frequency distribution presented as a bar chart. Advantage: its shape can be seen.
Frequency polygon: a line graph used to display the data, with frequencies on the y-axis and class midpoints on the x-axis.

Averages
Average: a number used to represent the central value of a data set or distribution.
1. Arithmetic mean: the most widely used.
$\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$

Example:
Population: 7, 3, 2, 8
Sum = 20, so the arithmetic mean is 20/4 = 5.

Example cont.: now take some samples.
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
S1 = {7, 3}: 10/2 = 5
S2 = {7, 8}: 15/2 = 7.5
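
A short sketch of the population and sample means from the example, using Python's standard `statistics` module. The variable names are illustrative.

```python
from statistics import mean

# The full population from the example.
population = [7, 3, 2, 8]
mu = mean(population)

# Two samples of size 2 drawn from that population.
sample_1 = [7, 3]
sample_2 = [7, 8]
x_bar_1 = mean(sample_1)
x_bar_2 = mean(sample_2)

print(mu)       # 5
print(x_bar_1)  # 5
print(x_bar_2)  # 7.5
```

The first sample happens to reproduce the population mean exactly, while the second does not: a sample mean varies from sample to sample.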

1a. Weighted Mean
$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$

Weighted Mean Example
Crop        Hourly wage, x   Workers, w   wx
Cucumbers   4.50             950          4,275
Melons      4.75             600          2,850
Onions      5.25             1,020        5,355
Total                        2,570        12,480
$\bar{x} = 12,480 / 2,570 = 4.86$. The other way (the unweighted mean of the three wages) it is 4.83.
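
The weighted mean example can be checked with a few lines of Python. This is a sketch using the wages and worker counts from the table; the dictionary layout is just one convenient way to hold them.

```python
# Hourly wage (x) and number of workers (w) for each crop, from the table.
wages = {"Cucumbers": 4.50, "Melons": 4.75, "Onions": 5.25}
workers = {"Cucumbers": 950, "Melons": 600, "Onions": 1020}

# Weighted mean: sum of w*x divided by sum of w.
total_wx = sum(wages[crop] * workers[crop] for crop in wages)
total_w = sum(workers.values())
weighted_mean = total_wx / total_w

# Unweighted mean of the three wages, for comparison.
simple_mean = sum(wages.values()) / len(wages)

print(total_wx, total_w)        # 12480.0 2570
print(round(weighted_mean, 2))  # 4.86
print(round(simple_mean, 2))    # 4.83
```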

Two Properties of the Arithmetic Mean
a. The sum of the deviations from the mean is zero.
Data: 3, 7, 4, 6, 0, with $\bar{x} = 4$
3 - 4 = -1
7 - 4 = 3
4 - 4 = 0
6 - 4 = 2
0 - 4 = -4
Sum of deviations = 0

b. The sum of squares of the deviations from the mean is a minimum; that is, it is smaller than the sum of squared deviations taken from any other value.
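
Both properties can be checked numerically on the five observations from property (a). This is only a sketch: the helper `sum_of_squares_about` is a hypothetical name, and trying a handful of alternative centers illustrates property (b) rather than proving it.

```python
data = [3, 7, 4, 6, 0]
x_bar = sum(data) / len(data)   # 4.0

# Property (a): deviations from the mean sum to zero.
deviations = [x - x_bar for x in data]
sum_of_deviations = sum(deviations)

# Property (b): squared deviations are smallest when taken about the mean.
def sum_of_squares_about(center):
    """Sum of squared deviations of the data from an arbitrary center."""
    return sum((x - center) ** 2 for x in data)

print(sum_of_deviations)            # 0.0
print(sum_of_squares_about(x_bar))  # 30.0
print(sum_of_squares_about(3))      # 35
print(sum_of_squares_about(5))      # 35
```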

2. Midrange (or center): the arithmetic mean of the smallest and the largest items in the data set.
Unreliable as an estimate of the population mean, because it is based on two values that change significantly from sample to sample.

Example
$MR = \frac{X_1 + X_n}{2}$, where $X_1$ is the smallest and $X_n$ is the largest observation.
$MR = \frac{0 + 7}{2} = 3.5$

Median
3. Median: a place average. For ungrouped data, it is the value of the middle observation after the data are arrayed.
When there is an even number of observations, the middle two observations are averaged.
A better measure when extreme values are encountered.
Should not be used for small sample sizes.
Half of the observations are below the median and half are above.
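
A brief sketch of the median rules using `statistics.median`, which arrays the data internally. The odd-sized list is the data set from the earlier slides; the even-sized list and the list with 22 replaced by 220 are hypothetical variations added for illustration.

```python
from statistics import median

odd_data = [3, 1, 7, 22, 9, 10, 4, 17, 19]       # 9 observations
even_data = [3, 1, 7, 22, 9, 10, 4, 17]          # 8 observations (hypothetical)
with_extreme = [3, 1, 7, 220, 9, 10, 4, 17, 19]  # 22 replaced by 220 (hypothetical)

# Odd count: the middle value of the arrayed data.
median_odd = median(odd_data)

# Even count: the average of the two middle values (7 and 9).
median_even = median(even_data)

# An extreme value leaves the median unchanged.
median_extreme = median(with_extreme)

print(median_odd)      # 9
print(median_even)     # 8.0
print(median_extreme)  # 9
```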

Mode
4. Mode: the most common observation in the data set.
For ungrouped data we determine the mode by inspection.
Ungrouped data may not have a mode (when all values appear once). Several modes could occur as well.
Use the mode when we want to know what is in vogue.
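
The cases described above (no mode, one mode, several modes) can be sketched with a small helper. The function `modes` and the second and third data sets are hypothetical, written only to illustrate the slide's convention that data in which every value appears once have no mode.

```python
from collections import Counter

def modes(values):
    """Return the most common value(s), or [] if every value appears once."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # all values appear once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([3, 1, 7, 22, 9, 10, 4, 17, 19]))  # []
print(modes([2, 3, 3, 5]))                     # [3]
print(modes([2, 3, 3, 5, 7, 7, 9]))            # [3, 7]
```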

An arithmetic mean might be meaningless.
ABC show = 1, CBS show = 2, NBC show = 3
$\bar{X} = 2.3$ is meaningless.

Characteristics of Mean, Median, Mode
Use the three averages together to determine the relative symmetry of a distribution.
Perfect symmetry: all three averages are identical.
If the distribution has a tail on the right, it is skewed positively: the arithmetic mean is the largest, the mode is the smallest, and the median lies about 2/3 of the way from the mode toward the mean.

Characteristics of Mean, Median, Mode
The mean is the largest because it is affected by large values.
The median is sensitive to the position of the values.
The arithmetic mean is the only one that can be used in algebraic calculations, which makes it the most useful.
Downside: it is impossible to calculate with open-ended classes. This does not affect the other two averages.

Measures of Dispersion
Range: $R = X_n - X_1$
Can be used with the mean, median, and midrange.
The range indicates both how high and low the numbers go and the spread of the data itself.
Based on the two extreme values of the data set.
Not the first choice for a measure of dispersion.

Quartile Deviation
$QD = \frac{Q_3 - Q_1}{2}$
Used only with the median.
One half the distance between the first and the third quartiles.

Quartile Deviation
Data: 8, 12, 6, 14, 10
Arrayed data: 6, 8, 10, 12, 14
1st quartile = 7, middle quartile = 10, 3rd quartile = 13
$QD = \frac{13 - 7}{2} = 3$
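
The quartile deviation for this example can be sketched with `statistics.quantiles` (Python 3.8 or later). Quartile conventions differ between textbooks and software; the `method="exclusive"` setting is an assumption, used here because it reproduces the quartiles 7 and 13 from the example.

```python
from statistics import quantiles

data = [8, 12, 6, 14, 10]

# quantiles() sorts the data itself. With n=4 it returns the three
# quartile cut points; the "exclusive" method places Q1 and Q3 halfway
# between neighbouring observations for this data set.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")

# Quartile deviation: half the distance between Q1 and Q3.
quartile_deviation = (q3 - q1) / 2

print(q1, q2, q3)          # 7.0 10.0 13.0
print(quartile_deviation)  # 3.0
```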

Quartile Deviation
QD is similar to the range but uses values in the middle half of the distribution rather than the endpoints.
A poor measure when there is wide dispersion in the tails of the distribution.

Standard Deviation
A measure of dispersion used with the arithmetic mean.
Its value is based on all the observations of the data set.

SD
For ungrouped data, the SD is the most widely used measure of dispersion, just as the arithmetic mean is the most widely used average.
$s^2$, the sample variance, is an estimate of the population variance $\sigma^2$ computed from sample data.

Standard Deviation
In repeated sampling, the sample variance computed with divisor n is biased: on average it underestimates the population variance by the fixed factor (n - 1)/n.
Thus a revision of the sample SD formula is needed: divide by n - 1 for a sample.

Example for Calculating SD
Days absent, x   $x - \bar{x}$   $(x - \bar{x})^2$
7                -3              9
14               4               16
8                -2              4
5                -5              25
15               5               25
11               1               1
Total 60                         80
$\bar{x} = 60/6 = 10$
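
A sketch of the calculation for the days-absent example, dividing by n - 1 as the preceding slide prescribes for a sample. The variable names are illustrative.

```python
from math import sqrt

days_absent = [7, 14, 8, 5, 15, 11]
n = len(days_absent)

# Mean of the sample.
x_bar = sum(days_absent) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in days_absent)

# Sample variance and SD use the divisor n - 1.
sample_variance = sum_sq_dev / (n - 1)
sample_sd = sqrt(sample_variance)

print(x_bar)            # 10.0
print(sum_sq_dev)       # 80.0
print(sample_variance)  # 16.0
print(sample_sd)        # 4.0
```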

Standard Deviation (Another Formula)
This does not contain deviations from the mean.
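
The formula on this slide did not survive in the transcript. One commonly used form that avoids deviations from the mean is $s = \sqrt{\frac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1}}$; whether this is the exact form the slide showed is an assumption. The sketch below applies it to the days-absent data.

```python
from math import sqrt

days_absent = [7, 14, 8, 5, 15, 11]
n = len(days_absent)

# Only two running totals are needed: the sum and the sum of squares.
sum_x = sum(days_absent)
sum_x2 = sum(x * x for x in days_absent)

# Computational form (assumed; the slide's own formula was not preserved).
sample_sd = sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(sum_x, sum_x2)  # 60 680
print(sample_sd)      # 4.0
```

The result agrees with the SD obtained from the deviations, since 680 - 3600/6 = 80 is the same sum of squared deviations.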

Two Properties of s and $\bar{x}$
1. If we add a constant to every element in the data set, the mean changes by that same value and the SD remains unchanged.
2. Multiplying each value of x by a constant multiplies the mean by the constant, the SD by the absolute value of the constant, and the variance by the square of the constant.
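
Both properties can be illustrated on the days-absent data with the standard `statistics` module. The constants 3 and -2 are arbitrary choices for this sketch; a negative multiplier is used to show why the SD scales by the absolute value.

```python
from statistics import mean, stdev, variance

data = [7, 14, 8, 5, 15, 11]          # mean 10, SD 4, variance 16

# Property 1: add a constant (3) to every element.
shifted = [x + 3 for x in data]

# Property 2: multiply every element by a constant (-2).
scaled = [-2 * x for x in data]

print(mean(shifted), stdev(shifted))  # mean rises by 3, SD unchanged
print(mean(scaled), stdev(scaled))    # mean times -2, SD times 2
print(variance(scaled))               # variance times 4
```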

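Standardizing the Data
Subtract the mean from every element, then multiply every element in the data set by 1/s.
The standardized data have mean 0 and SD 1.
For a population this becomes Z: $Z = \frac{X - \mu}{\sigma}$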
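
A minimal sketch of standardizing, again using the days-absent data as an assumed example: subtract the sample mean from each value and divide by the sample SD.

```python
from statistics import mean, stdev

data = [7, 14, 8, 5, 15, 11]
x_bar = mean(data)   # 10
s = stdev(data)      # 4.0

# Standardize: subtract the mean, then multiply by 1/s.
z_scores = [(x - x_bar) / s for x in data]

print(z_scores)         # [-0.75, 1.0, -0.5, -1.25, 1.25, 0.25]
print(mean(z_scores))   # 0.0
print(stdev(z_scores))  # 1.0
```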

Coefficient of Variation
Uses the SD and the mean to measure the variability of the data set.
Gives a relative measure of variability in the data set.
States how large the standard deviation is in comparison to the mean, in percentage terms: $CV = 100 \cdot s / \bar{x}$
CV = 100 would mean that s and $\bar{x}$ are equal.
When the CV is over 50, use caution in stating that the mean represents the population.
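
As an illustration, the coefficient of variation for the days-absent data can be computed as follows. Applying the CV to this particular data set is an assumption made for the sketch; the slides do not work a CV example.

```python
from statistics import mean, stdev

data = [7, 14, 8, 5, 15, 11]

# CV expresses the SD as a percentage of the mean.
cv = 100 * stdev(data) / mean(data)

print(cv)  # 40.0
```

Here the SD is 40 percent of the mean, which is below the cautionary level of 50 mentioned above.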