USING STATISTICS TO DESCRIBE GEOGRAPHICAL DATA

STATISTICS IN GEOGRAPHY

During a GEOGRAPHICAL INVESTIGATION there is a set sequence of events:
1. Decide on the area of investigation and do some background research so that you are aware of the main ideas, concepts and factors involved.
2. Formulate a hypothesis based on the information you have researched.
3. Decide what data you will need to collect to test your idea / ideas, and produce a data collection plan involving data collection / sampling methods.
4. Collect the data.
5. Classify the data, begin the statistical analysis and present the data with appropriate maps, graphs and images.
6. Analyse and interpret the data, reach substantiated conclusions (related to your hypothesis / hypotheses) and discuss the effectiveness and limitations of the study.
This PowerPoint is about the classification and statistical analysis of data.

Let's look at some data collected from a site on Chesil Beach, a shingle tombolo in Dorset. A random sample of 30 pebbles was taken and the long axis of each was measured in mm. The beach is renowned for how well sorted the shingle is at any one site. This is the data for one site towards the western end:
8 13 12 10 18 11 14 9 16
There are 2 main ways in which the data can be described statistically:
A. Measures of central tendency (the middle of the data)
B. Measures of spread / dispersion (how widely the data ranges around the middle value)
These 2 measures allow you to describe the data you have collected and also let you begin to compare one set of data with another.

There are 3 main measures of central tendency: MEAN, MODE and MEDIAN.

The mean or average is easily calculated. All the figures are added and then divided by the number of values.

$$\bar{x} = \frac{\sum x}{n}$$

where x = data, ∑ = sum of, n = sample size, x̄ = sample mean.

For this sample of long axes from Chesil Beach the mean = 11.5 mm. The mean or average is a good measure to use to show the middle of a set of data, but it can be affected by extreme values.
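
The calculation of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `sample_mean` and the five pebble lengths are made up for the example and are not the Chesil Beach measurements.

```python
def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("need at least one value")
    return sum(values) / len(values)

# Hypothetical long-axis measurements in mm (not the Chesil Beach sample).
pebbles = [8, 10, 11, 12, 14]
print(sample_mean(pebbles))         # 11.0

# Adding one extreme value pulls the mean well away from the other values.
print(sample_mean(pebbles + [65]))  # 20.0
```

The second call shows why the mean should be used with care when a sample contains extreme values.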

Below is another sample of 30 pieces of beach material, this time from the eastern end of Chesil Beach:
70 84 60 67 87 58 66 72 68 56 80 82 62 55 65 64 54 69 76
The mean for this site is 67.3 mm, so there is a large difference between the means at the two sites:
Mean at first site: 11.5 mm
Mean at second site: 67.3 mm

The MODE is the most frequently occurring value. Sometimes the data has to be grouped into classes to find the MODAL CLASS.
Sample 1: 18 16 14 14 13 12 12 12 12 12 12 12 11 11 11 10 10 10 10 10 9 9 8 8 8 8 (mode = 12)
Sample 2: 87 84 82 80 76 72 70 69 68 67 66 66 66 65 64 62 60 58 56 55 54 (mode = 66)
Therefore:
Sample 1 has a mean of 11.5 mm and a mode of 12 mm
Sample 2 has a mean of 67.3 mm and a mode of 66 mm
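
Finding the mode is a counting exercise, which might be sketched in Python as below. The helper name `modes` is an assumption made for this example. The two lists hold the values exactly as they appear in this transcript, which shows 26 and 21 values respectively, fewer than the 30 measured in each sample.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency, in ascending order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

# Values as listed in the transcript (an incomplete listing of each 30-pebble sample).
sample_1 = [18, 16, 14, 14, 13, 12, 12, 12, 12, 12, 12, 12, 11, 11, 11,
            10, 10, 10, 10, 10, 9, 9, 8, 8, 8, 8]
sample_2 = [87, 84, 82, 80, 76, 72, 70, 69, 68, 67, 66, 66, 66, 65, 64,
            62, 60, 58, 56, 55, 54]

print(modes(sample_1))  # [12]
print(modes(sample_2))  # [66]
```

Returning a list allows for data with more than one mode, where two or more values tie for the highest count.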

The MEDIAN is the mid-point of a set of data. The data is ranked in descending order (highest at the top, lowest at the bottom), and the middle value is the MEDIAN. Look at these 2 examples, the first with an even number of values and the second with an odd number:
10 values: the MEDIAN has 5 points above and 5 below, so it lies half way between the 5th and 6th.
13 values: the MEDIAN has 6 points above and 6 points below, so it falls exactly on the 7th.
The MEDIAN value is not affected by extremes: if the lowest value changes, the MEDIAN stays the same.

Now let's work out the MEDIAN value for our two samples of shingle from Chesil Beach.
Sample 1: 18 16 14 14 13 12 12 12 12 12 12 12 11 11 11 10 10 10 10 10 9 9 8 8 8 8
Sample 2: 87 84 82 80 76 72 70 69 68 67 66 66 66 65 64 62 60 58 56 55 54
There are 30 values in each sample, so we are looking for the point half way between the 15th and 16th, which gives 15 values above and 15 below. Count 15 from the top to find it. The median for sample 1 is 11 mm, and the median for sample 2 is 66 mm.
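
The ranking and counting procedure for the median can be sketched in Python as follows. The function name `median` is chosen for this example. The lists contain only the values that appear in this transcript (26 and 21 values, not the full 30 per sample), so the code locates the middle of those shorter lists; on these values it happens to give the same medians as quoted for the full samples.

```python
def median(values):
    """Return the middle value of the ranked data (mean of the middle pair if n is even)."""
    ordered = sorted(values, reverse=True)  # highest first, as in the ranking described
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: half way between the middle pair

# Values as listed in the transcript (an incomplete listing of each 30-pebble sample).
sample_1 = [18, 16, 14, 14, 13, 12, 12, 12, 12, 12, 12, 12, 11, 11, 11,
            10, 10, 10, 10, 10, 9, 9, 8, 8, 8, 8]
sample_2 = [87, 84, 82, 80, 76, 72, 70, 69, 68, 67, 66, 66, 66, 65, 64,
            62, 60, 58, 56, 55, 54]

print(median(sample_1))  # 11.0
print(median(sample_2))  # 66

# Replacing the lowest value with an extreme one leaves the median unchanged.
print(median(sample_2[:-1] + [5]))  # 66
```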

B. Measures of spread / dispersion

The Inter Quartile Range is calculated by using a dispersion diagram. The values are set out on a vertical scale and the MEDIAN, UPPER QUARTILE and LOWER QUARTILE are calculated. The upper and lower quartiles divide the upper and lower halves of the data in half, and so divide the whole data set into quarters. The Inter Quartile Range (IQR) is the difference on the scale between the upper and lower quartiles.

In the two example diagrams:
First diagram: there are 6 values above the median, so the upper quartile (UQ) is half way between the 3rd and 4th; there are 6 values below the median, so the lower quartile (LQ) is half way between the 3rd and 4th.
Second diagram: there are 5 values above the median, so the upper quartile (UQ) is the 3rd; there are 5 values below the median, so the lower quartile (LQ) is the 3rd.
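
The quartile rule described here (split the ranked data at the median, then take the middle of each half) might be sketched in Python as below. The function names and the two test data sets are assumptions made for illustration. Other quartile conventions exist, and calculators or software may give slightly different values.

```python
def middle(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (LQ, median, UQ), leaving the median itself out of both halves when n is odd."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return middle(lower), middle(ordered), middle(upper)

def interquartile_range(values):
    lq, _, uq = quartiles(values)
    return uq - lq

# Hypothetical data: 13 values (6 either side of the median) and 10 values (5 either side).
thirteen = list(range(1, 14))
ten = list(range(1, 11))

print(quartiles(thirteen))            # (3.5, 7, 10.5)
print(quartiles(ten))                 # (3, 5.5, 8)
print(interquartile_range(thirteen))  # 7.0
```

With 6 values in each half the quartile falls half way between the 3rd and 4th, and with 5 values in each half it falls on the 3rd, matching the two cases described.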

A very good visual representation of the Inter Quartile Range (IQR) is a BOX and WHISKER diagram. From top to bottom it marks the highest value, the upper quartile, the median, the lower quartile and the lowest value, with the IQR lying between the two quartiles. The second set of data / values shown has a similar median to the one opposite, but a much larger range and Inter Quartile Range: the data has a greater spread around the middle value / median.

Now let's draw dispersion diagrams and work out the Inter Quartile Ranges for samples 1 and 2 from Chesil Beach (vertical scale: shingle size in mm).
Sample 1: UQ = 12 mm, LQ = 10 mm, Inter Quartile Range 12 - 10 = 2 mm
Sample 2: UQ = 72 mm, LQ = 60 mm, Inter Quartile Range 72 - 60 = 12 mm
You will see that sample 2 has a much wider spread of values and is not so well sorted.

Another very good measure of spread is the STANDARD DEVIATION. This is a measure of spread about the mean value. Basically it looks at the difference between each value and the mean. The formula is

$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$

where σ = standard deviation, ∑ = sum of, x = data, x̄ = mean of x, n = number of items in data list, √ = square root.

Nowadays most calculators will give you the standard deviation, and you will certainly find an online calculator to find the value easily, so I won't show you how to calculate it using a table and the formula.

The STANDARD DEVIATION for sample 1 is 2.58 mm, and for sample 2 it is 8.68 mm, which confirms that sample 2 spreads more widely about the mean and is less well sorted than sample 1.
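
For readers who want to see the formula at work, here is a minimal Python sketch. The function name `population_std` and both data sets are hypothetical and are not the Chesil Beach samples. The code divides by n, as in the formula above; some calculators also offer a version that divides by n - 1, which gives a slightly larger value.

```python
import math

def population_std(values):
    """Standard deviation about the mean, dividing the summed squared differences by n."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    squared_differences = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_differences) / n)

# Two hypothetical samples with the same mean (12) but different spreads.
well_sorted = [10, 11, 12, 13, 14]
poorly_sorted = [2, 7, 12, 17, 22]

print(round(population_std(well_sorted), 2))    # 1.41
print(round(population_std(poorly_sorted), 2))  # 7.07
```

Both samples have the same mean, so only the standard deviation reveals that the second is far more widely spread.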