1
Lecture 3: Chi-Square, correlation and your dissertation proposal Non-parametric data: the Chi-Square test Statistical correlation and regression: parametric and non-parametric tests Break Regression in SPSS Writing a dissertation proposal when you plan to use statistics Exercises, assessment and assistance
2
Non-parametric statistics Non-parametric statistics in human geography Different types of non-parametric test: – 1 sample – 2 independent samples – 2 tied samples – 3 or more samples
3
The Chi-Square test Most versatile test in social science Can be used to examine nominal data, ordinal data and interval/ratio data in groups There are no assumptions about independent or paired observations
4
Theory of Chi-Square The test examines the difference between observed counts and expected values Suppose we wanted to compare the age groups in our sample with the same age groups in the UK population, or to compare age groups across two or three samples. Chi-Square can examine these differences
5
The Chi-Square Equation
$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
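As a minimal sketch, the equation can be written as a short Python function. The function name and the counts are hypothetical and used only for illustration; they are not from the lecture.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for three age groups (illustration only).
observed = [30, 50, 20]
expected = [25, 50, 25]
print(chi_square(observed, expected))  # 1.0 + 0.0 + 1.0 = 2.0
```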
6
One way Chi-Square test Examines whether there is a difference between one sample and a population We can assume either that the expected counts will be equal between categories or that we know the proportions But, before we do the test, we have to cross-tabulate the data
7
The Cross-tabulation
8
The expected counts Expected counts relate to either equal proportions or previously known proportions (e.g. from a population) These are then compared to observed counts and the difference is calculated A significance level is selected and the null hypothesis is accepted or rejected
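The steps on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the SPSS procedure: the sample counts and population proportions are hypothetical, the function name is invented for the example, and the decision uses standard critical values of Chi-Square at the 0.05 level rather than an exact p value.

```python
# Critical Chi-Square values at the 0.05 significance level, by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def one_way_chi_square(observed, proportions=None):
    """Return (statistic, degrees of freedom, expected counts).

    If proportions is None, equal proportions across categories are assumed.
    """
    total = sum(observed)
    k = len(observed)
    if proportions is None:
        proportions = [1 / k] * k
    expected = [total * p for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, k - 1, expected

# Hypothetical sample counts in three age groups, and hypothetical
# population proportions for the same groups (illustration only).
observed = [20, 30, 50]
population = [0.25, 0.25, 0.50]

statistic, df, expected = one_way_chi_square(observed, population)
reject_null = statistic > CRITICAL_05[df]
print(statistic, df, reject_null)
```

Calling `one_way_chi_square(observed)` without proportions gives the equal-proportions version of the test.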
9
The Contingency Table
10
The test result Chi-Square is calculated as the sum, over every cell, of the squared difference between observed and expected counts divided by the expected count Assessed as for other statistical tests χ² = 7.1 (p < 0.05)
11
Two way Chi-Square tests Very often, we want to go beyond comparing one sample with a population, for example comparing one sample with another, or comparing three or more samples Two way Chi-Square allows us to do this easily Again, we cross-tabulate the data
12
The Contingency table
13
Two-way analysis Chi-Square calculates the expected value for each cell by multiplying its row total by its column total and dividing by the grand total Expected values represent the number in each category which, given the sample sizes and distribution, we would expect to see in each cell
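A minimal Python sketch of this calculation follows. The contingency table is hypothetical (rows could be two regional samples, columns two age groups), and the function names are invented for the example.

```python
def expected_table(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_two_way(table):
    """Return (statistic, degrees of freedom) for a two-way table."""
    expected = expected_table(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical observed counts (illustration only).
table = [[10, 20],
         [30, 40]]
print(expected_table(table))      # [[12.0, 18.0], [28.0, 42.0]]
print(chi_square_two_way(table))
```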
14
The Chi-Square result Chi-Square gives the result and we evaluate it against a significance level χ² = 21.7 (p < 0.05) But, we can only state that there is a difference - not what the difference is. For example, does our sample from the north have more older people in it? We must examine the relative proportions of the contingency table to find this out
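One way to examine the relative proportions is to convert each row of the contingency table to percentages. The sketch below assumes rows are samples and columns are age groups; the counts and labels are hypothetical.

```python
def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]

# Hypothetical counts: rows are samples, columns are (younger, older).
table = [[10, 30],   # hypothetical northern sample
         [20, 20]]   # hypothetical southern sample
for label, row in zip(["north", "south"], row_percentages(table)):
    print(label, row)
```

In this made-up table, 75% of the northern sample falls in the older group against 50% of the southern sample, which is the kind of comparison the test result alone does not show.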
15
The expected counts problem Chi-Square has the stipulation that no more than 20% of the expected counts in an analysis may be under 5. If there are more than this, the test is invalid So, how can we get over this problem?
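The rule can be checked mechanically once the expected counts are known. A minimal sketch, with hypothetical expected values and invented function names:

```python
def share_of_small_expected(expected, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected for e in row]
    small = sum(1 for e in cells if e < threshold)
    return small / len(cells)

def expected_counts_ok(expected, max_share=0.20):
    """True if no more than max_share of cells have expected counts under 5."""
    return share_of_small_expected(expected) <= max_share

# Hypothetical expected counts for a 2 x 3 table (illustration only).
expected = [[12.0, 4.0, 9.0],
            [8.0, 6.0, 3.0]]
print(share_of_small_expected(expected))  # 2 of 6 cells are under 5
print(expected_counts_ok(expected))       # False: the test would be invalid
```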
16
Recoding variables We can aggregate suitable variables to make the number of groups smaller Aggregating only works with ordinal data Having fewer groups makes expected counts below 5 less likely We can also use this to make interval/ratio data into groups
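Aggregation can be sketched as merging adjacent columns of the contingency table. The table and the grouping below are hypothetical; in SPSS the same effect would come from recoding the variable before cross-tabulating.

```python
def aggregate_columns(table, groups):
    """Merge columns of a table.

    groups is a list of lists of column indices; each inner list becomes
    one new column holding the sum of the original columns it names.
    """
    return [[sum(row[i] for i in group) for group in groups] for row in table]

# Hypothetical counts across four ordered age bands (illustration only).
table = [[2, 8, 15, 3],
         [4, 10, 12, 1]]

# Merge the two youngest bands and the two oldest bands (adjacent groups,
# so the ordering of the ordinal variable is preserved).
merged = aggregate_columns(table, [[0, 1], [2, 3]])
print(merged)  # [[10, 18], [14, 13]]
```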
17
Chi-Square: Qualifications You should have no fewer than 20 cases As stated above, not more than 20% of cells should have expected values under 5 You should not necessarily ignore a contingency table, even if the Chi-Square test is invalid Remember, above all, that Chi-Square is a test of difference, not correlation
18
Statistical correlation: relationships among variables Relationships are concerned with the extent to which variable A is related to B This is termed correlation Correlation does not necessarily imply causation, but merely a possible relationship There are parametric and non-parametric tests of correlation
19
Types of correlation Perfect positive correlation: +1 Perfect negative correlation: -1 Linear relationship No correlation: 0 Non-linear relationship
20
Parametric correlation: Pearson’s r Assumes your data are on interval/ratio scales AND are normally distributed Measured from -1 to +1 This result shows the strength of the relationship The test must be judged by its significance (as for other parametric tests: p below or above 0.05)
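For reference, Pearson's r can be computed by hand as the covariation of the two variables divided by the product of their spreads. The sketch below uses hypothetical data and does not compute the significance value that SPSS reports.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical interval/ratio data (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 3))  # 0.775
```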
21
Non-parametric correlation: Spearman’s r_s Assumes ordinal data, or interval/ratio data that are not normally distributed Data are ranked for the test Measured as for Pearson’s Significance as for Pearson’s
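The ranking step can be sketched as follows: each variable is replaced by its ranks (tied values share the average rank), and the Pearson formula is then applied to the ranks. The data and function names are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 upward; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
print(spearman_rs(x, y))
```

Because only the ordering matters, a consistently rising but curved relationship like this one still gives a coefficient of 1 (up to floating-point rounding).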
22
From correlation to explanation: regression analysis Regression seeks to examine the nature of the relationship between one or more independent variables and a dependent variable It is concerned with prediction, not just correlation To predict, there is an equation which describes the ‘line of best fit’ between variables
23
The Line of best fit Line of best fit ‘fits’ a straight line through the data points you observe Can be expressed by: Y = mx + c Where: Y = Dependent variable c = constant (intercept) m = slope gradient x = independent variable
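A minimal sketch of how m and c can be estimated, assuming the line is fitted by ordinary least squares (the slide does not name the fitting method; least squares is the usual choice for linear regression). The data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope (m) and intercept (c) for Y = mx + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

# Hypothetical independent (x) and dependent (y) values (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
m, c = fit_line(x, y)
print(m, c)  # slope about 0.6, intercept about 2.2
```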
24
Predicting using the regression equation You can use the equation to predict levels of Y for given levels of X This is often of use when looking at different outcome situations
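Prediction is then a matter of substituting values of X into the equation. In the sketch below the slope and intercept are hypothetical values, standing in for coefficients taken from regression output.

```python
def predict(x, slope, intercept):
    """Predicted Y for a given X using Y = mx + c."""
    return slope * x + intercept

# Hypothetical coefficients, as if read from regression output.
slope, intercept = 0.6, 2.2

# Compare predicted outcomes for several hypothetical levels of X.
for x in (6, 8, 10):
    print(x, round(predict(x, slope, intercept), 2))
```

Predictions far outside the range of the observed X values should be treated with caution, since the straight-line relationship has only been checked within that range.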
25
Interpreting regression results R²: the ‘goodness of fit’ that the model offers, expressed in per cent F: the significance of the model The regression coefficients and associated p values
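To show where these figures come from, here is a minimal sketch that computes R² and the F statistic for a one-predictor regression, assuming an ordinary least-squares fit. The data are hypothetical, and the p values that SPSS reports alongside F and the coefficients are not computed here.

```python
def fit_line(x, y):
    """Least-squares slope and intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    m = sxy / sxx
    return m, mean_y - m * mean_x

def r_squared(x, y):
    """Share of the variation in y explained by the fitted line (0 to 1)."""
    m, c = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (m * a + c)) ** 2 for a, b in zip(x, y))  # unexplained
    ss_tot = sum((b - mean_y) ** 2 for b in y)                  # total
    return 1 - ss_res / ss_tot

def f_statistic(r2, n, predictors=1):
    """F for the overall model: explained vs unexplained variance per df."""
    return (r2 / predictors) / ((1 - r2) / (n - predictors - 1))

# Hypothetical data (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2 = r_squared(x, y)
print(round(100 * r2, 1))                  # R squared as a percentage: 60.0
print(round(f_statistic(r2, len(x)), 2))   # 4.5
```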
26
Regression: assumptions Your data: –Are measured on interval/ratio scales; –Are normally distributed; –And are therefore Parametric; and... –Have a linear relationship You can use other techniques for non-linear regression and regression with nominal/ordinal variables
27
Is any of this relevant to me? YES - you have to write a dissertation proposal Saying you will ‘analyse’ the data using appropriate methods is not enough You will get a far higher mark if you follow these simple steps in the next two months when preparing your proposal:
28
Writing your Dissertation Proposal: key points Do you need to use a questionnaire/other quantitative instrument? If yes, what key questions are you posing? ALWAYS relate these questions to your plans for analysis How will you analyse these collected data to meet your aims and objectives?
29
Writing your proposal
Methodology: Quantitative/qualitative?
Questionnaire: Type: closed/open/both?
Questions: Yes/no; frequency; categorical; multiple response?
Data this will yield: Parametric/non-parametric?
Analysis types: Description, differences, relationships?
Analysis tools: Parametric/non-parametric?
30
Example of this process
31
A final word Think carefully about your questionnaire - can you meet the objectives you have set yourself? Do you need to use every statistical test? Assessments (all 3) due in on 6 May Where can you get help? –Friday 14th March, 9-11am; –Monday 28th April, 11am-1pm E-mail: S.W.Barr@exeter.ac.uk