
1 Univariate Descriptive Statistics Chapter 2

2 Lecture Overview Tabular and Graphical Techniques Distributions Measures of Central Tendency Measures of Dispersion

3 Tabular and Graphical Techniques Frequency Tables –Ungrouped –Grouped Histograms Cumulative Frequency Histogram

4 Frequency Tables
Bin   Frequency
170   3
180   7
190   8
200   9
210   12
220   6
230   6
240   4
250   2
260   3

5 Histograms Note: sometimes percent is on the Y axis rather than frequency

6 Cumulative Frequency Histograms
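A cumulative frequency histogram plots the running total of the bin frequencies. The sketch below assumes the slide 4 table reads as bins 170 through 260 with frequencies 3, 7, 8, 9, 12, 6, 6, 4, 2, 3; the variable names are illustrative only.

```python
from itertools import accumulate

# Bin labels and frequencies as read from the frequency table on slide 4.
bins = [170, 180, 190, 200, 210, 220, 230, 240, 250, 260]
freq = [3, 7, 8, 9, 12, 6, 6, 4, 2, 3]

# The running total of the frequencies is the cumulative frequency.
cum_freq = list(accumulate(freq))
total = cum_freq[-1]

# Cumulative percent, for a Y axis in percent rather than counts.
cum_pct = [100 * c / total for c in cum_freq]

for b, f, c, p in zip(bins, freq, cum_freq, cum_pct):
    print(f"{b:>4} {f:>4} {c:>4} {p:6.1f}%")
```

The last cumulative value always equals the total number of observations, which is a quick check that no bin was dropped.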

7 Key Concepts Choosing Intervals (i.e., choosing your “bins”) Rules from the textbook (pages 38 – 39) Commonly Used Examples from GIS –Equal Interval –Quantiles (e.g., quartiles and quintiles) –Natural Breaks –Standard Deviation

8 Rules For Bin Sizes Note: This is very relevant for GIS Rule 1: Use intervals with simple bounds Rule 2: Respect natural breakpoints Rule 3: Intervals should not overlap Rule 4: Intervals should be the same width Rule 5: Select an appropriate number of classes

9 The Effect of Classification Equal Interval –Splits data into user-specified number of classes of equal width –Each class can have a different number of observations

10 The Effect of Classification Quantiles –Data divided so that there is an equal number of observations in each class –Some classes can have quite narrow intervals
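The contrast between equal-interval and quantile classes can be sketched in a few lines. This example borrows the ten tree diameters from slide 26 as sample data and splits them into five classes (quintiles). The helper names are hypothetical, and the "inclusive" interpolation used for the quantile breaks is one assumption among several conventions; GIS packages may place their breaks slightly differently.

```python
import statistics

# Tree diameters (inches) from the Battle Park example on slide 26.
data = sorted([9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5])
k = 5  # number of classes (quintiles)

def equal_interval_breaks(values, k):
    """Interior class boundaries for k classes of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def quantile_breaks(values, k):
    """Interior class boundaries that split the data into k equal-count classes."""
    return statistics.quantiles(values, n=k, method="inclusive")

def count_per_class(values, breaks):
    """Count observations per class; a value equal to a break goes to the lower class."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[sum(v > b for b in breaks)] += 1
    return counts

ei_breaks = equal_interval_breaks(data, k)
q_breaks = quantile_breaks(data, k)
ei_counts = count_per_class(data, ei_breaks)
q_counts = count_per_class(data, q_breaks)

print("equal interval:", ei_breaks, ei_counts)
print("quantiles:     ", q_breaks, q_counts)
```

For these ten values the quintile classes hold two trees each, while the equal-width classes hold between one and four, which is the trade-off the two slides describe.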

11 The Effect of Classification Natural Breaks –Splits data into classes based on natural breaks represented in the data histogram

12 The Effect of Classification Standard Deviation –Class breaks are placed at the mean plus or minus multiples of the standard deviation
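A standard deviation classification can be sketched as class breaks at the mean plus or minus whole multiples of the standard deviation. The function name, the use of the sample standard deviation, and the whole-SD step are all assumptions made for illustration; GIS software may use different step sizes. The data are the tree diameters from slide 26.

```python
import statistics

# Tree diameters (inches) from the Battle Park example on slide 26.
data = [9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5]

def std_dev_breaks(values, n_sd=2):
    """Class breaks at mean + k * s for k = -n_sd, ..., 0, ..., +n_sd.

    Uses the sample standard deviation (an assumption for this sketch).
    """
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [m + k * s for k in range(-n_sd, n_sd + 1)]

breaks = std_dev_breaks(data, n_sd=2)
print([round(b, 2) for b in breaks])
```

The middle break is the mean itself, so the classes describe how far each observation sits above or below the average.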

13 Key Concepts Making sense of your histograms using distributions –Rectangular –Unimodal –Bimodal –Multimodal –Skew (positive and negative)

14 Bimodal Distribution

15 Multimodal Distribution

16 Skew An asymmetrical distribution

17 Measures of Central Tendency Measures of central tendency –Measures of the location of the middle or the center of a distribution –Mean, median, mode, midrange

18 Definitions Midrange Mode Median –Quantiles Mean
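The four measures can be compared on one dataset. This sketch uses the ten tree heights from slide 29 (as read from that table) and Python's standard `statistics` module; `midrange` is a small helper written for this example, taking the midrange as the average of the minimum and maximum.

```python
import statistics

# Tree heights (m) as read from the table on slide 29, tree #1 to #10.
heights = [5.0, 6.0, 7.5, 8.0, 4.8, 5.3, 7.1, 25.4, 7.5, 4.5]

def midrange(values):
    """Halfway point between the smallest and largest observation."""
    return (min(values) + max(values)) / 2

summary = {
    "midrange": midrange(heights),           # depends only on the two extremes
    "mode": statistics.mode(heights),        # most frequent value
    "median": statistics.median(heights),    # middle value of the sorted data
    "mean": statistics.mean(heights),        # sum divided by the count
}

for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")
```

The midrange is pulled furthest toward the one very tall tree, the mean less so, and the median and mode hardly at all.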

19 Definitions Sample Mean Population Mean

20 Description of Mean Mean – Most commonly used measure of central tendency Average of all observations The sum of all the scores divided by the number of scores Note: Assuming that each observation is equally significant

21 Symbols n: the number of observations; N: the number of elements in the whole population; Σ: capital sigma, the symbol for a sum; i: the index that counts through a series of numbers, beginning at the lower limit of the sum; X: one element in our dataset, usually with a subscript (e.g., i, min, max); $\bar{X}$: the sample mean; $\mu$: the population mean

22 Summation Notation: Components In $\sum_{i=1}^{n} X_i$: $\Sigma$ indicates we are taking a sum; $i = 1$ refers to where the sum of terms begins; $n$ refers to where the sum of terms ends; $X_i$ indicates what we are summing up

23 Mathematical Notation of Mean The mathematical notation used most often in this course is the summation notation. The Greek letter capital sigma is used as a shorthand way of indicating that a sum is to be taken. The expression $\sum_{i=1}^{n} X_i$ is equivalent to: $X_1 + X_2 + \dots + X_n$

24 Summation Notation: Simplification A summation will often be written leaving out the upper and/or lower limits of the summation, assuming that all of the terms available are to be summed: $\sum X_i$

25 Equation for Mean Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$ Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$

26 Example Mean Calculations Example I –Data: 8, 4, 2, 6, 10 Example II –Sample: 10 trees randomly selected from Battle Park –Diameter (inches): 9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5
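Examples I and II can be worked through directly from the definition of the mean (sum of the observations divided by their number). The sketch below writes the summation as an explicit loop to mirror the notation; the function name `sample_mean` is chosen for this example.

```python
def sample_mean(values):
    """Sum the observations one by one, then divide by n."""
    total = 0.0
    for x in values:      # i runs from the first to the n-th observation
        total += x
    return total / len(values)

# Example I
example_1 = [8, 4, 2, 6, 10]

# Example II: diameters (inches) of 10 trees from Battle Park
example_2 = [9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5]

mean_1 = sample_mean(example_1)   # 30 / 5
mean_2 = sample_mean(example_2)   # 143.8 / 10

print(f"Example I mean:  {mean_1:.2f}")
print(f"Example II mean: {mean_2:.2f} inches")
```

Example I sums to 30 over 5 observations, giving a mean of 6; Example II sums to 143.8 over 10 trees, giving 14.38 inches.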

27 Example Mean Calculations Example III Monthly mean temperature (°F) at Chapel Hill, NC (2001). Annual mean temperature (°F)

28 Examples IV & V Chapel Hill, NC (1972-2001) Mean annual temperature: 58.51 °F Mean annual precipitation: 1198.10 mm

29 Explanation of Mean Advantage –Sensitive to any change in the value of any observation Disadvantage –Very sensitive to outliers
#Tree   Height (m)   #Tree   Height (m)
1       5.0          6       5.3
2       6.0          7       7.1
3       7.5          8       25.4
4       8.0          9       7.5
5       4.8          10      4.5
Mean = 6.19 m without #8
Mean = 8.11 m with #8
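The outlier effect on this slide can be reproduced by computing the mean twice, once with tree #8 and once without it. The heights are as read from the slide's table; the dictionary layout is just a convenience for this sketch.

```python
# Tree number -> height (m), as read from the table on this slide.
heights = {
    1: 5.0, 2: 6.0, 3: 7.5, 4: 8.0, 5: 4.8,
    6: 5.3, 7: 7.1, 8: 25.4, 9: 7.5, 10: 4.5,
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

mean_with = mean(heights.values())
mean_without = mean(h for tree, h in heights.items() if tree != 8)

print(f"Mean with #8:    {mean_with:.2f} m")
print(f"Mean without #8: {mean_without:.2f} m")
print(f"Shift caused by one tree: {mean_with - mean_without:.2f} m")
```

A single 25.4 m tree raises the mean by almost 2 m, from roughly 6.2 m to roughly 8.1 m, even though nine of the ten trees are 8 m or shorter.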

30 Measures of Dispersion Used to describe the data dispersion/spread/variation/deviation numerically Usually used in conjunction with measures of central tendency

31 Measures of variation [Figure: two histograms, "Low variation" and "High variation"; x axis: score, y axis: # of obs] Groups have equal means and equal n, but one varies more than the other

32 Definitions Range Mean Deviation Variance Standard Deviation Coefficient of Variation Pearson’s
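Three of these measures can be sketched on the Example I data (8, 4, 2, 6, 10). The slide does not give formulas, so the definitions below are commonly used ones and are assumptions here: range as maximum minus minimum, mean deviation as the average absolute distance from the mean, and coefficient of variation as the sample standard deviation divided by the mean. The textbook may define them slightly differently.

```python
import statistics

data = [8, 4, 2, 6, 10]  # Example I data from slide 26

def value_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)

def mean_deviation(values):
    """Average absolute distance of the observations from their mean."""
    m = statistics.mean(values)
    return sum(abs(x - m) for x in values) / len(values)

def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)

r = value_range(data)
md = mean_deviation(data)
cv = coefficient_of_variation(data)

print(f"range = {r}, mean deviation = {md:.2f}, CV = {cv:.3f}")
```

Because the coefficient of variation is unitless, it is handy for comparing spread between variables measured on different scales, such as temperature and precipitation.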

33 Symbols $s^2$: the sample variance; $\sigma^2$: the population variance; $s$: the sample standard deviation; $\sigma$: the population standard deviation

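34 Sample Variance and Standard Deviation Note: as with the mean, there are both sample and population standard deviations and variances Variance: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$ Standard Deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$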
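The difference between the sample and population versions comes down to the divisor. The sketch below assumes the usual convention, n - 1 for a sample and N for a population, and checks the hand-written functions against Python's `statistics` module on the Example I data; the function names are chosen for this example.

```python
import math
import statistics

data = [8, 4, 2, 6, 10]  # Example I data from slide 26

def sum_sq_dev(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def sample_variance(values):
    """Divide by n - 1 (sample convention)."""
    return sum_sq_dev(values) / (len(values) - 1)

def population_variance(values):
    """Divide by N (population convention)."""
    return sum_sq_dev(values) / len(values)

s2 = sample_variance(data)
sigma2 = population_variance(data)
s = math.sqrt(s2)
sigma = math.sqrt(sigma2)

print(f"s^2 = {s2}, s = {s:.3f}")
print(f"sigma^2 = {sigma2}, sigma = {sigma:.3f}")

# Cross-check against the standard library.
assert math.isclose(s2, statistics.variance(data))
assert math.isclose(sigma2, statistics.pvariance(data))
```

Here the squared deviations sum to 40, so the sample variance is 40 / 4 = 10 and the population variance is 40 / 5 = 8; the sample version is always the larger of the two.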

35 Next Class Read chapter 3 Work on the homework Come with questions Bring your laptop

