4.2.1 Descriptive Statistics and Classification of Data
UPA Package 4, Module 2
DESCRIPTIVE STATISTICS AND CLASSIFICATION OF DATA
Descriptive Statistics and Classification of Data
  Introduction statistics
  Level of measurement
  Descriptive statistics
  Classification of data
  SPSS and EXCEL
  Introduction exercises 4.2.1a/b
Introduction Statistics
Introduction Statistics
Concepts
  Collecting (sampling) data, analyzing it and drawing conclusions from it
  Population: great variability of data
  Probability and uncertainty
  Assumptions and models
Purpose
  On the basis of a sample, draw sound conclusions about the whole population
Functions
  Descriptive: summarizing data to make data more useful
  Inductive: prediction, testing relationships, generalization on the basis of sampling
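The idea of drawing conclusions about a population from a sample can be sketched in a few lines of Python. This is only an illustration: the population is simulated, and its size, mean (2000) and spread (500) are assumptions made up for the example, not course data.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Simulated "population" of monthly incomes (hypothetical values).
population = [random.gauss(2000, 500) for _ in range(10_000)]

# Draw a simple random sample of 100 cases without replacement.
sample = random.sample(population, 100)

pop_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# Descriptive use: summarize the sample.
# Inductive use: treat the sample mean as an estimate of the population mean.
print(f"population mean: {pop_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f}")
```

The two means will usually be close but not identical; that gap is the uncertainty that inductive statistics tries to quantify.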
Level of Measurement
Measurement
  Ability to assign numbers to things according to a clear and well-defined rule
  Are we dealing with the “right” things?
  Operational definition, e.g. how to measure poverty?
Data types
  Nominal: classification of data into categories or classes (land use)
  Ordinal: order of categories (low/middle/high income), no magnitude
  Interval/Ratio: unit of measurement (income in € / month)
Know your variables!
Level of Measurement
Data Transformations
  Change a high level of measurement (interval) to a lower level of measurement (ordinal or nominal)
  Advantages and disadvantages
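A minimal sketch of such a transformation, turning interval-level incomes into ordinal classes. The class limits (1500 and 3500 € / month) and the function name income_class are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical class limits in € / month (assumed for this example).
LIMITS = [1500, 3500]
LABELS = ["low", "middle", "high"]

def income_class(income):
    """Map an interval-level income to an ordinal class label.

    A value equal to a limit falls into the higher class.
    """
    return LABELS[bisect_right(LIMITS, income)]

incomes = [900, 1500, 2200, 3499, 5000]
classes = [income_class(x) for x in incomes]
print(classes)
```

The classes are easier to read and tabulate, but the magnitudes are lost: after the transformation, 1500 and 3499 can no longer be told apart.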
Descriptive Statistics
Summarizing Data
  Readability (limited classes) vs. accuracy (many classes)
Summarizing Nominal / Ordinal Data
  Proportions
  Percentages
Summarizing Interval Data
  Location: mean, mode, median, freq. distribution
  Variation: standard deviation, coefficient of variation
Descriptive Statistics
Proportions: counting (f/N)
Percentages: relative size, proportion * 100
Rate: fraction, proportion or percent

Social Class      f     %
Low-Income
Middle-Income
High Income       22    5.7
Total (N)
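As a rough illustration, proportions (f/N) and percentages can be computed from raw category labels as follows. The observations are hypothetical and are not the figures from the slide's table.

```python
from collections import Counter

# Hypothetical survey answers (assumed data, 10 cases).
observations = ["low"] * 5 + ["middle"] * 3 + ["high"] * 2

counts = Counter(observations)   # absolute frequency f per class
n = sum(counts.values())         # total number of cases N

proportions = {k: f / n for k, f in counts.items()}
percentages = {k: 100 * f / n for k, f in counts.items()}

for k in counts:
    print(f"{k:<8} f={counts[k]:>2}  p={proportions[k]:.2f}  %={percentages[k]:.1f}")
```

Proportions always sum to 1 and percentages to 100, which is a quick check on a hand-built table.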
Descriptive Statistics
Location
  Central tendency, typicality, degree of homogeneity: numbers that represent the center or typical value of a frequency distribution
  Mode: most frequent score, highest point of the curve
  Median: middle case when arranged in order of size; appropriate for skewed distributions
  Mean: central tendency of the group; extreme values have a disturbing effect
  Mean = mode = median in case of symmetric, unimodal distributions
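The three measures of location, and the disturbing effect of an extreme value on the mean, can be sketched with Python's standard statistics module. The income values are hypothetical.

```python
import statistics

# Hypothetical monthly incomes (assumed data).
incomes = [1200, 1500, 1500, 1800, 2000]

mean = statistics.mean(incomes)      # sum of scores divided by N
median = statistics.median(incomes)  # middle case after sorting
mode = statistics.mode(incomes)      # most frequent score

# Add one extreme value and recompute mean and median.
with_outlier = incomes + [20000]
mean_out = statistics.mean(with_outlier)
median_out = statistics.median(with_outlier)

print(mean, median, mode)
print(round(mean_out, 1), median_out)
```

The single extreme value pulls the mean far upward while the median barely moves, which is why the median is preferred for skewed distributions.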
Descriptive Statistics
Frequency distribution: table that shows the frequency of observations in each category of a variable
  Categories should have no overlap (exclusive) and complete coverage (exhaustive)
  Give examples of frequency distributions for nominal, ordinal and interval types of data
  Discrete and continuous data
  Class limits
Descriptive Statistics
Variation
  Dispersion, degree of heterogeneity: numbers that depict the amount of spread or variability in a data set
  Range: difference between highest and lowest score
  Standard deviation (s): square root of the mean of the squared deviations from the mean
  Coefficient of variation: standard deviation (s) relative to the mean (x̄), cv = s / x̄
  Variance: s²
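A small sketch of these measures of variation, using hypothetical scores. It assumes the definition given on the slide (mean of the squared deviations, i.e. dividing by N), so it uses pvariance and pstdev; statistics.variance and statistics.stdev would divide by N - 1 instead.

```python
import statistics

# Hypothetical scores (assumed data).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)   # highest minus lowest score
mean = statistics.mean(scores)
variance = statistics.pvariance(scores)   # mean of squared deviations (s²)
std_dev = statistics.pstdev(scores)       # square root of the variance (s)
cv = std_dev / mean                       # coefficient of variation

print(value_range, mean, variance, std_dev, cv)
```

Because cv is unit-free, it can be used to compare the spread of variables measured on different scales, provided the mean is not zero.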
Descriptive Statistics
Decision tree for the correct use of descriptive statistics
Organization and Classification of Data
Why classification
  Simplicity, clarity, particularity, data source, data processing
Classification principles
  Purpose, exclusivity, exhaustiveness, detail vs. readability
Number of classes and class limits (interval data)
  First look at the distribution; avoid e.g. empty classes
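One possible way to check a classification for empty classes is sketched below. It assumes equal-width classes, which is only one of several schemes; the helper names equal_width_limits and class_counts and the data are hypothetical.

```python
from bisect import bisect_right

def equal_width_limits(values, k):
    """Return k + 1 class limits that split the data range into k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]

def class_counts(values, limits):
    """Count cases per class; classes are [lower, upper), the last one also includes its upper limit."""
    k = len(limits) - 1
    counts = [0] * k
    for v in values:
        i = min(bisect_right(limits, v) - 1, k - 1)
        counts[i] += 1
    return counts

# Hypothetical data with one extreme value (assumed).
values = [1, 2, 2, 3, 10]
limits = equal_width_limits(values, 3)
counts = class_counts(values, limits)
print(limits)
print(counts)
```

Here the middle class stays empty because of the single extreme value, a sign that the class limits should be reconsidered after a first look at the distribution.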
Classification of Data
Frequency Distribution
Frequencies (graphs, tables): first look at data distribution
Patterns and deviations from that pattern, outliers
Absolute, relative and cumulative freq. distributions
Categorical and quantitative variables
  Histograms, bar charts, ogives, pie charts
  Smooth (computerized) curves versus histograms
Frequency Distribution
Frequency table
  Patterns and deviations from that pattern, outliers
  Absolute, relative and cumulative freq. distributions
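A frequency table with absolute, relative and cumulative frequencies can be built roughly as follows. The scores are hypothetical.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical scores on a 1-5 scale (assumed data).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

counts = Counter(scores)
n = len(scores)

categories = sorted(counts)                   # ordered category values
absolute = [counts[c] for c in categories]    # f
relative = [f / n for f in absolute]          # f / N
cumulative = list(accumulate(absolute))       # running total of f

print("value  f  rel   cum")
for c, f, r, cf in zip(categories, absolute, relative, cumulative):
    print(f"{c:>5} {f:>2}  {r:.2f}  {cf:>3}")
```

The last cumulative frequency must equal N; plotting the cumulative column against the category values gives the ogive.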
Frequency Distribution
Histograms, bar charts, ogives, pie charts
Smooth (computerized) curves versus histograms
SPSS and Excel
Statistical Package for the Social Sciences (SPSS)
  Comprehensive system for analyzing data
  Investment and upgrades, modular structure, simple spreadsheet structure
Spreadsheet (e.g. Microsoft EXCEL)
  Limited statistical functionality (but how much do you really need?)
  Relatively cheap
SPSS and Excel
Explore Central Bureau of Statistics (CBS) data
  Spatial units: Municipality, Districts, Neighborhoods
  Municipality of Enschede
Selection and basic descriptive analysis of variables
  Exercise 4.2.1a: use SPSS
  Exercise 4.2.1b: use EXCEL