Some statistical methods on syntactic variables in L1 writing: Report from an ongoing study. Bård Uri Jensen, PhD student, UiB / Hedmark University College.

1 Some statistical methods on syntactic variables in L1 writing
Report from an ongoing study
Bård Uri Jensen, PhD student, UiB / Hedmark University College (Hamar)
Solstrand 2010-03-26

2 Contents
– Introducing the project
– The ELEV corpus vs the ASK corpus
– Extracting data
– Analysing data

3 My doctoral project
Research question
– Do people tend to make different grammatical choices when they type on a keyboard rather than write by hand?
Hypotheses
– Higher production speed shifts the choices in a "spontaneous" direction
– Skilled writers may utilise the enhanced functionality and shift features in the opposite direction
– Other psychological factors may affect the choices: motivational factors, social media norms

4 The ELEV corpus
A "parallel" corpus of hand-written and keyboarded texts
– Two texts by each pupil
The ASK corpus system
Manual syntactic segmentation
– t-units
– clauses
– fragments
No error tags

5 Alle mennesker er forskjellige, Kvinnfolk driver på data og gutter leser bøker Jeg liker å få på ski. Fordi det gir meg bedre kondisjon.
English: All humans are different, Women use computers and boys read books I like cross-country skiing. Because it gives me better stamina.

6 drikk deg full. Er dette en sunn utvikling?
English: get (yourself) drunk. Is this a healthy development?

7 Politiet vet det er folk under 18 som drikker der,
English: The police know there are people under 18 who drink there,

8 Men hva med andre bøker? men veit da om flere jenter som ikke gjør det også!
English: But what about other books? but [I] know about several girls who don't do it also!

9 Er dette en sunn utvikling?
English: Is this a healthy development?

10 Corpus searches [features='.* subst.*']; []* ; []{5,10} ; ([lemma='\$.']*[!lemma='\$.']){5,10} [lemma='\$.']* ;

11 Corpus searches : frontal subclauses [features='.* konj.*']? ( | | ) [];

12 Corpus searches : embedding [!clause]+ []* [!clause]+ ;

13 Corpus searches : lexical distribution [lemma!='\$.']; [features=".* verb.*"];
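
The two lexical-distribution queries can be read as a pair of counts: all tokens that are not punctuation, and all tokens tagged as verbs. The Python sketch below mimics that reading outside the corpus system. It assumes that the query patterns must match the whole attribute value and that punctuation lemmas have the form `$` plus one character; the token list and its tag strings are invented for illustration and do not reproduce the corpus annotation.

```python
import re

# Hypothetical tokens as (lemma, features) pairs. The tag strings are
# invented for illustration; they are not the corpus's real annotation.
tokens = [
    ("jeg", "jeg pron pers"),
    ("like", "like verb pres"),
    ("å", "å inf-merke"),
    ("gå", "gå verb inf"),
    ("på", "på prep"),
    ("ski", "ski subst appell"),
    ("$.", "$. clb"),
]

PUNCT = re.compile(r"\$.")       # pattern from [lemma!='\$.'], negated below
VERB = re.compile(r".* verb.*")  # pattern from [features=".* verb.*"]


def verb_share(tokens):
    """Return (verb count, word count, verbs / words), ignoring punctuation."""
    # fullmatch mirrors the assumption that a pattern must cover the whole value
    words = [(lemma, feats) for lemma, feats in tokens
             if not PUNCT.fullmatch(lemma)]
    verbs = [(lemma, feats) for lemma, feats in words
             if VERB.fullmatch(feats)]
    share = len(verbs) / len(words) if words else 0.0
    return len(verbs), len(words), share


print(verb_share(tokens))
```

Dividing by the number of non-punctuation tokens makes the verb count comparable across texts of different length.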

14 Statistics : Three examples Some simple analyses – differences of mean – correlations Classification analysis Clustering

15 Mean & correlation
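
Because each pupil contributes one hand-written and one keyboarded text, a difference of means can be computed on the per-pupil differences, and a correlation on the paired values. The following is a minimal Python sketch of both computations, not the analysis used in the study. The function names and the numbers are invented for illustration.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return the t statistic and degrees of freedom for paired samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def pearson_r(x, y):
    """Return Pearson's product-moment correlation between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented values: one syntactic measure per pupil in each writing mode.
hand = [1.2, 1.5, 1.1, 1.8, 1.4]
keyboard = [1.4, 1.6, 1.0, 2.1, 1.7]

t, df = paired_t(keyboard, hand)
r = pearson_r(hand, keyboard)
print(f"t = {t:.3f} on {df} df, r = {r:.3f}")
```

The t statistic still has to be compared with a t distribution on `df` degrees of freedom to obtain a p-value; that step is left out here.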

43 Classification analysis
Independent variables (parameters)
– writing mode: hand ~ keyboard
– writing skills: medium ~ high
– gender
– essay question
Dependent variables
– frequency of attributive adjectives
– subclause frequency
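
The slide does not say which method lies behind the classification analysis. One possible reading is a tree-based analysis, in which the texts are split on whichever independent variable separates the dependent variable best. The Python sketch below shows a single such split step for a numeric response, choosing the factor with the smallest within-group sum of squares. The function names, the field names and the data are assumptions made for illustration.

```python
from statistics import mean


def within_ss(values):
    """Sum of squared deviations from the group mean."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, factors, response):
    """Return (factor, within-group SS) for the factor whose levels
    partition the response with the smallest total within-group SS."""
    best = None
    for factor in factors:
        groups = {}
        for row in rows:
            groups.setdefault(row[factor], []).append(row[response])
        ss = sum(within_ss(g) for g in groups.values())
        if best is None or ss < best[1]:
            best = (factor, ss)
    return best


# Invented data: one row per text, with a hypothetical adjective frequency.
rows = [
    {"mode": "hand", "skill": "medium", "adj_freq": 3.0},
    {"mode": "hand", "skill": "high", "adj_freq": 3.2},
    {"mode": "keyboard", "skill": "medium", "adj_freq": 5.0},
    {"mode": "keyboard", "skill": "high", "adj_freq": 5.2},
]

print(best_split(rows, ["mode", "skill"], "adj_freq"))
```

A full tree would repeat this step within each resulting group and would need a stopping rule. Note also that this criterion tends to favour factors with many levels, such as essay question.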

44 YES

53 Cluster analysis About 50 dependent variables
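
With about 50 dependent variables, each text becomes a point in a high-dimensional space, and clustering groups the texts that lie close together. The slide does not name the algorithm or the distance measure. The Python sketch below assumes standardised variables, Euclidean distance and average linkage, and uses invented function names throughout.

```python
import math
from statistics import mean, pstdev


def standardise(rows):
    """z-score each column so variables on different scales are comparable."""
    cols = list(zip(*rows))
    # a constant column has standard deviation 0; divide by 1 instead
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(rows, k):
    """Merge the two closest clusters until k clusters remain (k >= 1).
    Returns a list of clusters, each a sorted list of row indices."""
    data = standardise(rows)
    clusters = [[i] for i in range(len(data))]

    def dist(c1, c2):
        # average distance over all cross-cluster pairs of texts
        return mean(euclid(data[i], data[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        pairs = [(dist(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


# Invented data: four texts described by two variables each.
texts = [[1.0, 10.0], [1.1, 11.0], [5.0, 50.0], [5.2, 49.0]]
print(average_linkage(texts, 2))
```

Standardising first keeps variables with large raw values from dominating the distances. Whether that is appropriate depends on the variables involved.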

57 References
Baayen 2008: Analyzing linguistic data: A practical introduction to statistics using R
Dodge 2010: The concise encyclopedia of statistics
Gries 2009: Statistics for linguistics with R: A practical introduction
Zuur et al. 2009: A beginner's guide to R

58 Bård Uri Jensen Hedmark University College (Hamar) http://www.hihm.no http://privat.hihm.no/buj bard.jensen@hihm.no http://privat.hihm.no/buj/solstrand2010.pdf

