Computing in Archaeology Session 11. Correlation and regression analysis © Richard Haddlesey www.medievalarchitecture.net.

1 Computing in Archaeology Session 11. Correlation and regression analysis © Richard Haddlesey www.medievalarchitecture.net

2 Lecture aims: To introduce correlation and regression techniques

3 The scattergram: In correlation, we are always dealing with paired scores, and so values of the two variables taken together will be used to make a scattergram

4 Example: Quantities of New Forest pottery recovered from sites at varying distances from the kilns

Site   Distance (km)   Quantity
1      4               98
2      20              60
3      32              41
4      34              47
5      24              62

5 Negative correlation Here we can see that the quantity of pottery decreases as distance from the source increases

6 Positive correlation Here we see that the taller a pot, the wider the rim

7 Curvilinear monotonic relation: Again, the further from the source, the smaller the quantity of artefacts

8 Arched relationship (non-monotonic) Here we see the first molar increases with age and is then worn down as the animal gets older

9

10 Scattergram: This shows us that scattergrams are the most important means of studying relationships between two variables

11 REGRESSION: Regression differs from other techniques we have looked at so far in that it is concerned not just with whether or not a relationship exists, or the strength of that relationship, but with its nature. In regression analysis we use an independent variable to estimate (or predict) the values of a dependent variable.

12 Regression equation: y = f(x), where y = y axis (in this case the dependent variable), f = function (of x), x = x axis

13 y = f(x): y = x, y = 2x, y = x²
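As a small illustration of the three example functions on this slide, the sketch below evaluates each one for a few values of x. The function names are made up for the example and do not come from the lecture.

```python
# Three example forms of y = f(x) from the slide.
def identity(x):
    return x          # y = x

def double(x):
    return 2 * x      # y = 2x

def square(x):
    return x ** 2     # y = x squared (a curve, not a straight line)

xs = [0, 1, 2, 3]
for f in (identity, double, square):
    # Print the y values each function gives for the same x values.
    print(f.__name__, [f(x) for x in xs])
```

The first two functions give straight lines when plotted, while the third gives a curve.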

14

15 General linear equations: y = a + bx, where y is the dependent variable, x is the independent variable, and the coefficients a and b are constants, i.e. they are fixed for a given data set

16 Therefore: If x = 0 then the equation reduces to y = a, so a represents the point where the regression line crosses the y axis (the intercept). The b constant defines the slope or gradient of the regression line. Thus for the pottery quantity in relation to distance from source, b represents the amount by which pottery quantity decreases for each unit of distance from the source.

17 y = a + bx

18

19

20 least-squares

21

22

23

24 y = a + bx

25

26 y = 102.64 – 1.8x
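The fitted line on this slide can be approximately reproduced with a least-squares calculation. The sketch below assumes the pottery table on slide 4 reads as five sites with distances 4, 20, 32, 34, 24 km and quantities 98, 60, 41, 47, 62; the function names are illustrative only.

```python
# Pottery data as read from the table on slide 4 (an assumption about how
# the flattened table should be split): distance (km) and quantity.
distance = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

def least_squares(x, y):
    """Return (a, b) for the line y = a + b*x fitted by least squares."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope: change in y per unit change in x.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: the line passes through the point (mean x, mean y).
    a = sum_y / n - b * sum_x / n
    return a, b

a, b = least_squares(distance, quantity)
print(f"y = {a:.2f} + ({b:.2f})x")

def predict(x):
    """Estimate pottery quantity at a given distance from the kilns."""
    return a + b * x
```

Under these assumptions the slope comes out at about -1.80 and the intercept at about 102.69. The slide's intercept of 102.64 is what results if b is rounded to 1.8 before a is calculated.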

27

28

29 CORRELATION

30 1. correlation coefficient

31 CORRELATION: 1. correlation coefficient; 2. significance

32 CORRELATION: 1. correlation coefficient (r); 2. significance

33 CORRELATION: 1. correlation coefficient (r), ranging from -1 to +1; 2. significance

34

35 Levels of measurement: nominal – in name only; ordinal – forming a sequence; interval – a sequence with fixed distances; ratio – fixed distances with a datum point

36 Levels of measurement: nominal, ordinal, interval, ratio

37 Levels of measurement: nominal, ordinal, interval, ratio. Product-Moment Correlation Coefficient: interval and ratio data

38 Levels of measurement: nominal, ordinal, interval, ratio. Spearman's Rank Correlation Coefficient: data that are at least ordinal
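One way to read the last two slides is as a simple decision rule: the lower of the two variables' levels of measurement decides which coefficient can be used. The sketch below encodes that reading; it is an interpretation of the slides, and the function and constant names are hypothetical.

```python
# Levels of measurement, ordered from weakest to strongest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

def choose_coefficient(level_x, level_y):
    """Suggest a correlation coefficient for two variables.

    Assumes the product-moment coefficient needs interval or ratio data
    and Spearman's rank coefficient needs at least ordinal data.
    """
    lowest = min(LEVELS.index(level_x), LEVELS.index(level_y))
    if lowest >= LEVELS.index("interval"):
        return "product-moment (r)"
    if lowest >= LEVELS.index("ordinal"):
        return "Spearman's rank (r_s)"
    return None  # neither coefficient covered here suits nominal data

print(choose_coefficient("ratio", "ratio"))
print(choose_coefficient("ordinal", "ratio"))
print(choose_coefficient("nominal", "ratio"))
```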

39

40 The Product-Moment Correlation Coefficient

41 Sample: 20 bronze spearheads (n = 20). [Scattergram of length (cm) and width (cm)]

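42 [Scattergram of length (cm) and width (cm), n = 20] $r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2]\,[n\sum y^2 - (\sum y)^2]}}$

43 $r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2]\,[n\sum y^2 - (\sum y)^2]}}$, n = 20

44 $r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2]\,[n\sum y^2 - (\sum y)^2]}}$, n = 20

45 $r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2]\,[n\sum y^2 - (\sum y)^2]}} = +0.67$, n = 20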
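The computational formula for r can be sketched directly in code. The spearhead measurements are not reproduced in this transcript, so the example below uses the pottery data from slide 4 instead (read as distances 4, 20, 32, 34, 24 km and quantities 98, 60, 41, 47, 62, which is an assumption about the flattened table). The function name is illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation coefficient from paired scores."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Pottery data from slide 4 (assumed reading of the table).
distance = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

r = pearson_r(distance, quantity)
print(round(r, 3))  # about -0.973, a strong negative correlation
```

The negative sign matches the earlier negative-correlation slide: quantity falls as distance from the kilns rises.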

46 Test of product moment correlation coefficient

47 H0: true correlation coefficient = 0

48 Test of product moment correlation coefficient. H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0.

49 Test of product moment correlation coefficient. H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0. Assumptions: both variables approximately random.

50 Test of product moment correlation coefficient. H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0. Assumptions: both variables approximately random. Sample statistics needed: n and r.

51 Test of product moment correlation coefficient. H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0. Assumptions: both variables approximately random. Sample statistics needed: n and r. Test statistic: TS = r.

52 Test of product moment correlation coefficient. H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0. Assumptions: both variables approximately random. Sample statistics needed: n and r. Test statistic: TS = r. Table: product moment correlation coefficient table.

53

54 n = 20

55 n = 20 r = 0.67 p<0.01
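The lecture looks r up directly in a table of critical values. An equivalent check, not taken from the slides, converts r to a t statistic with n - 2 degrees of freedom and compares it with the tabulated critical t. The sketch below assumes a two-tailed test at the 1% level, for which the critical t with 18 degrees of freedom is 2.878.

```python
from math import sqrt

def t_from_r(r, n):
    """Convert a correlation coefficient to a t statistic (n - 2 df)."""
    return r * sqrt((n - 2) / (1 - r ** 2))

n, r = 20, 0.67          # values from the spearhead example
t = t_from_r(r, n)

# Two-tailed 1% critical value of t for 18 degrees of freedom,
# taken from a standard t table.
T_CRIT_01_DF18 = 2.878

print(round(t, 2), abs(t) > T_CRIT_01_DF18)
```

The statistic comes out at about 3.83, which exceeds 2.878, in agreement with the slide's p<0.01.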

56 [Scattergram of length (cm) and width (cm)]

57 Spearman's Rank Correlation Coefficient (r_s)

58 H0: true correlation coefficient = 0

59 Spearman's Rank Correlation Coefficient (r_s). H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0.

60 Spearman's Rank Correlation Coefficient (r_s). H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0. Assumptions: both variables at least ordinal.

61 Spearman's Rank Correlation Coefficient (r_s). H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0. Assumptions: both variables at least ordinal. Sample statistics needed: n and r_s.

62 Spearman's Rank Correlation Coefficient (r_s). H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0. Assumptions: both variables at least ordinal. Sample statistics needed: n and r_s. Test statistic: TS = r_s.

63 Spearman's Rank Correlation Coefficient (r_s). H0: true correlation coefficient = 0. H1: true correlation coefficient ≠ 0. Assumptions: both variables at least ordinal. Sample statistics needed: n and r_s. Test statistic: TS = r_s. Table: Spearman's rank correlation coefficient table.
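The slides do not show how r_s is calculated. A common approach, assumed here, ranks each variable and applies r_s = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the paired ranks; this shortcut is only valid when there are no tied values. The sketch below applies it to the pottery data from slide 4 (read as distances 4, 20, 32, 34, 24 km and quantities 98, 60, 41, 47, 62). The function names are illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rs(x, y):
    """Spearman's rank correlation coefficient for untied paired data."""
    n = len(x)
    # Sum of squared differences between the paired ranks.
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Pottery data from slide 4 (assumed reading of the table).
distance = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

rs = spearman_rs(distance, quantity)
print(round(rs, 2))  # -0.8
```

With only five sites this is a toy example; the resulting r_s would still need to be compared with the tabulated critical value for n = 5.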

