Welcome to the Quantitative Analysis (Statistics/EXCEL) Module John Gates Oxford Centre for Water Research School of Geography and the Environment
What is statistics?
…the collection and analysis of numerical data in large quantities. – Oxford English Dictionary
The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling. – American Heritage Dictionary
Statistics: the mathematical theory of ignorance. – Morris Kline
It has long been recognized by public men of all kinds... that statistics come under the head of lying, and that no lie is so false or inconclusive as that which is based on statistics. – H. Belloc
There are three kinds of lies: lies, damned lies and statistics. – Benjamin Disraeli
Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write. – H.G. Wells
Why statistics?
–make quantified statements about a phenomenon we are interested in
–frequently this phenomenon is too large to go out and measure exhaustively…
–…so we collect samples as proxies of the greater population of individuals or items that make up the phenomenon we are interested in
Aims of the course
–Introduction to basic statistics
–Demonstrate geographical context
–Learn to use analysis tools in EXCEL
–Make you an intelligent user of data
–Make you an intelligent user of statistics
We will bypass much of the underlying maths and instead emphasize understanding of the underlying principles
How the course works
1. Cover the statistical principles in lecture
–course lecture notes
2. Go through lecture notes in own time before practical
–use textbooks to supplement lecture notes
3. Attend practical
–work through practical handouts
–ask demonstrators for help
4. Take online assessments
–theory: any time after lecture
–practical: any time after finishing prac
Course Structure
Lectures on Mondays
–in OUCE Lecture Theatre
Practicals on Tuesday afternoons (except next week)
–in Medical Sciences Teaching Centre's computing laboratory
Practicals (pm sessions, by college): Harris Manchester, Brasenose, Christ Church, Hertford, Jesus, St. Edmund Hall, St. Hilda's, Worcester, Keble, Mansfield, Merton, Regent's Park, St. Anne's, St. Catz, St. Peter's, Wadham, St. John's
Course Information ALL INFORMATION IS ON THE WEB –Lecture notes and glossary –Practical notes –Excel files –Internet resources –Recommended textbooks –Tests
Week 1 - Central Tendency 1. Types of statistics 2. Types of data 3. Samples 4. Frequency distribution 5. Measures of central tendency a) mode b) median c) arithmetic mean 6. Precision and accuracy
1a. Descriptive Statistics
Definition: Quantitative methods of organizing, summarizing, and presenting numerical data in an informative way.
Describe the overall characteristics of a sample (and hence the population?)
Transform raw data into more easily understood forms
Central tendency – average character of the data.
1b. Inferential (analytical) Statistics Definition: The branch of statistics used to make inferences or judgments about a larger population based on the data collected from a smaller sample drawn from the population
2. Types of Data Interval Ordinal Nominal
2. Types of Data: Interval
-- Can tell exactly how far any measurement is from any other
-- Examples: height, age, size
2. Types of Data: Ordinal
-- A set of observations ordered according to some criterion, i.e. ranking
-- Cannot tell how far one measurement is from the next
-- Examples: horses' positions in a race, the ten highest mountains in the world
-- Note that interval data can be converted into ordinal form
2. Types of Data: Nominal
-- Also referred to as categorical data
-- Data are grouped into categories
-- Examples: land use type, ethnicity, rock type
-- Note that interval data can be converted into nominal form
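The two notes on converting interval data can be illustrated with a minimal Python sketch. The heights, the 170 cm cut-off and the category labels are hypothetical values chosen for illustration, not taken from the lecture.

```python
heights_cm = [172, 158, 190, 165]  # hypothetical interval data

# Ordinal form: rank from tallest (1) to shortest.
# The ranking keeps the order but loses the distances between values.
order = sorted(heights_cm, reverse=True)
ranks = [order.index(h) + 1 for h in heights_cm]

# Nominal form: group each value into a category.
# The 170 cm cut-off is an arbitrary assumption.
categories = ["tall" if h >= 170 else "short" for h in heights_cm]

print(ranks)
print(categories)
```

The conversion only works in this direction: the original heights cannot be recovered from the ranks or the categories.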
3. Samples
Definition: A subset of the target population
Random:
–the individuals in the sample are randomly selected
–each member of the population has a known, but possibly non-equal, chance of being included in the sample
Independent:
–a sample should have no effect on, and not be affected by, other samples selected from the same population or from different populations
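A simple random sample is the special case in which every member has the same known chance of inclusion. The sketch below assumes a hypothetical population of 100 numbered sites and a sample size of 10; the function name and the seed are illustrative only.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; each has the same known chance n / len(population)."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(population, n)

population = list(range(1, 101))  # hypothetical population of 100 numbered sites
sample = simple_random_sample(population, 10, seed=42)
inclusion_chance = 10 / len(population)

print(sample)
print(inclusion_chance)
```

Sampling designs with unequal but known inclusion chances (for example, stratified sampling) also count as random under the definition above.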
4. Frequency Distribution The spread of data along its range –either mathematical description –or (and) visual description… …a frequency histogram –define categories or intervals or classes –count the number of measurements that fall into each class –plot classes along x-axis –plot counts (frequencies) on y-axis
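The histogram steps listed above can be sketched in Python. The practicals use Excel, so this is only an illustration of the counting logic; the grades, the class width of 20 and the function name `frequency_table` are assumptions.

```python
def frequency_table(values, low, width, n_classes):
    """Count how many values fall into each equal-width class."""
    counts = [0] * n_classes
    for v in values:
        if v < low or v > low + width * n_classes:
            continue  # outside the range covered by the classes
        # A value on the top edge of the range joins the last class.
        i = min(int((v - low) // width), n_classes - 1)
        counts[i] += 1
    labels = [f"{low + k * width}-{low + (k + 1) * width}" for k in range(n_classes)]
    return list(zip(labels, counts))

grades = [35, 48, 52, 55, 61, 64, 67, 68, 72, 75, 81, 100]  # hypothetical grades (percent)
table = frequency_table(grades, low=0, width=20, n_classes=5)

# Text histogram: classes listed down the page, one '#' per measurement.
for label, count in table:
    print(f"{label:>7} | {'#' * count}")
```

Changing `width` changes the shape of the histogram, so the choice of classes is part of the analysis and not just presentation.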
4. Frequency Distribution
[Figure: frequency histogram, "Grades for 1st Stats Practical"; x-axis: Grade (in percent)]
5a. Mode
Definition: The most commonly occurring value
–for nominal data we refer to the modal class
–not appropriate for ordinal or (usually) interval data
[Figure label: Modal Class]
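Finding the modal class of nominal data amounts to counting each category and picking the largest count. A minimal sketch, assuming a hypothetical list of land use observations:

```python
from collections import Counter

# Hypothetical nominal data: land use recorded at seven sites.
land_use = ["forest", "arable", "urban", "arable", "pasture", "arable", "forest"]

counts = Counter(land_use)  # frequency of each category
modal_class, frequency = counts.most_common(1)[0]

print(modal_class, frequency)
```

If two categories tie for the highest count, the data are bimodal and reporting a single modal class would be misleading.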
5b. Median
Definition: The central value in an ordered set of data
[Figure: raw data, sorted data, median]
5b. Median: even number of values
[Figure: raw data, sorted data]
Median = (4 + 5) / 2 = 4.5
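The odd and even cases can be sketched in a few lines of Python. The seven-value list is the data set from the arithmetic mean slide; the six-value list is an assumption (the same data without the 10), chosen because it reproduces the (4 + 5) / 2 = 4.5 result above.

```python
def median(values):
    """Return the central value of the sorted data (mean of the two central values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd n: a single central value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the central pair

odd_case = median([4, 2, 5, 1, 7, 10, 6])  # sorted: 1, 2, 4, 5, 6, 7, 10
even_case = median([4, 2, 5, 1, 7, 6])     # sorted: 1, 2, 4, 5, 6, 7

print(odd_case, even_case)
```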
5c. Arithmetic Mean
data: 4, 2, 5, 1, 7, 10, 6
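As a worked example for these data, using the standard definition of the arithmetic mean (the symbols $\bar{x}$, $x_i$ and $n$ are conventional notation and are assumed here, since the slide's own formula is not shown):

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
        = \frac{4 + 2 + 5 + 1 + 7 + 10 + 6}{7}
        = \frac{35}{7}
        = 5
```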
The average
average = central tendency
the mean, mode and median are all measures of average
average ≠ mean
6. Precision and accuracy
Precision:
–The degree of refinement with which an operation is performed or a measurement stated
Accuracy:
–Freedom from mistake or error
Week 1 - Central Tendency 1. Types of statistics 2. Types of data 3. Samples 4. Frequency distribution 5. Measures of central tendency a) mode b) median c) arithmetic mean 6. Precision and accuracy
Excel skills in Practical 1
–Entering and sorting data
–Calculating mean, median and mode
–Creating frequency histograms
–Introduction to formulas and functions