1
Some statistical methods on syntactic variables in L1 writing
Report from an ongoing study
Bård Uri Jensen
PhD student, UiB / Hedmark University College (Hamar)
Solstrand 2010-03-26
2
Contents
– Introducing the project
– The ELEV corpus vs the ASK corpus
– Extracting data
– Analysing data
3
My doctoral project
Research question
– Do people tend to make different grammatical choices when they type on a keyboard rather than write by hand?
Hypotheses
– Higher production speed shifts the choices in a “spontaneous” direction
– Skilled writers may utilise the enhanced functionality and shift features in the opposite direction
– Other psychological factors may affect the choices: motivational factors, social media norms
4
The ELEV corpus
A “parallel” corpus of hand-written and keyboarded texts
– Two texts by each pupil
The ASK corpus system
Manual syntactic segmentation
– t-units
– clauses
– fragments
No error tags
5
Alle mennesker er forskjellige,
Kvinnfolk driver på data og gutter leser bøker
Jeg liker å få på ski.
Fordi det gir meg bedre kondisjon.

All humans are different,
Women use computers and boys read books
I like cross-country skiing.
Because it gives me better stamina.
6
drikk deg full.
Er dette en sunn utvikling?

get (yourself) drunk.
Is this a healthy development?
7
Politiet vet det er folk under 18 som drikker der,

The police know there are people under 18 who drink there,
8
Men hva med andre bøker?
men veit da om flere jenter som ikke gjør det også!

But what about other books?
but [I] know about several girls who don’t do it also!
9
Er dette en sunn utvikling?

Is this a healthy development?
10
Corpus searches
[features='.* subst.*'];
[]* ;
[]{5,10} ;
([lemma='\$.']*[!lemma='\$.']){5,10} [lemma='\$.']* ;
11
Corpus searches: frontal subclauses
[features='.* konj.*']? ( | | ) [];
12
Corpus searches: embedding
[!clause]+ []* [!clause]+ ;
13
Corpus searches: lexical distribution
[lemma!='\$.'];
[features=".* verb.*"];
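The two lexical-distribution queries can be read as counting non-punctuation tokens and verb tokens. Below is a minimal Python sketch of that reading, not the corpus system's own implementation. It assumes that each pattern must match the whole attribute value, that punctuation lemmas have the form `$` plus one character, and that verb tokens carry ` verb` somewhere in their feature string. The token list, its feature strings and the function name `verb_share` are invented for illustration.

```python
import re

# Patterns copied from the queries above; matched against the whole value.
PUNCT_LEMMA = re.compile(r"\$.")          # e.g. "$." or "$?"
VERB_FEATURES = re.compile(r".* verb.*")  # feature string containing " verb"

def verb_share(tokens):
    """Share of verb tokens among the non-punctuation tokens."""
    words = [t for t in tokens if not PUNCT_LEMMA.fullmatch(t["lemma"])]
    verbs = [t for t in words if VERB_FEATURES.fullmatch(t["features"])]
    return len(verbs) / len(words) if words else 0.0

# Invented annotation for "Er dette en sunn utvikling?" (hypothetical tags).
tokens = [
    {"lemma": "være", "features": "være verb pres"},
    {"lemma": "dette", "features": "dette pron"},
    {"lemma": "en", "features": "en det"},
    {"lemma": "sunn", "features": "sunn adj"},
    {"lemma": "utvikling", "features": "utvikling subst"},
    {"lemma": "$?", "features": "$? clb"},
]

print(verb_share(tokens))
```

Dividing by the word count rather than the raw token count keeps punctuation habits from influencing the measure.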
14
Statistics: Three examples
Some simple analyses
– differences of mean
– correlations
Classification analysis
Clustering
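As a rough illustration of the two simple analyses, the sketch below computes a paired t statistic and a Pearson correlation in plain Python. It assumes a paired design (two texts by each pupil, one per writing mode); the slides do not say which test was actually used. The per-pupil frequencies are invented, not taken from the ELEV corpus.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def paired_t(a, b):
    """t statistic for the mean of the pairwise differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    d_bar = mean(diffs)
    var = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)  # sample variance
    return d_bar / math.sqrt(var / n)

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented subclause frequencies per pupil, one value per writing mode.
hand = [4.0, 5.0, 3.0, 6.0, 5.0]
keyboard = [5.0, 7.0, 4.0, 6.0, 8.0]

t = paired_t(keyboard, hand)   # compare against a t distribution, df = n - 1
r = pearson_r(hand, keyboard)  # do pupils keep their rank across modes?
print(round(t, 3), round(r, 3))
```

The paired form compares each pupil with themselves, so stable differences between pupils do not inflate the variance estimate.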
15
Mean & correlation
43
Classification analysis
Independent variables (parameters)
– writing mode: hand ~ keyboard
– writing skills: medium ~ high
– gender
– essay question
Dependent variable
– frequency of attributive adjectives
– subclause frequency
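The slides do not name the classification method. One possible reading is a tree-style analysis that asks which independent variable best separates the texts on a dependent variable. The sketch below shows only that first step: it picks the predictor whose grouping most reduces the residual sum of squares. The rows, the column names and the function `best_split` are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the group mean."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, predictors, outcome):
    """Return the predictor with the largest reduction in SSE, plus all gains."""
    total = sse([r[outcome] for r in rows])
    gains = {}
    for p in predictors:
        groups = {}
        for r in rows:
            groups.setdefault(r[p], []).append(r[outcome])
        gains[p] = total - sum(sse(g) for g in groups.values())
    return max(gains, key=gains.get), gains

# Invented texts: attributive adjectives per 100 words (not corpus data).
rows = [
    {"mode": "hand", "skill": "medium", "gender": "f", "adj_freq": 10.0},
    {"mode": "hand", "skill": "high", "gender": "m", "adj_freq": 12.0},
    {"mode": "keyboard", "skill": "medium", "gender": "m", "adj_freq": 20.0},
    {"mode": "keyboard", "skill": "high", "gender": "f", "adj_freq": 22.0},
]

best, gains = best_split(rows, ["mode", "skill", "gender"], "adj_freq")
print(best, gains)
```

A full tree would repeat the same search inside each resulting group and stop when the gain is no longer statistically reliable.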
44
YES
53
Cluster analysis
About 50 dependent variables
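With many dependent variables per text, a cluster analysis groups texts by their overall profile. The sketch below assumes agglomerative clustering with Euclidean distance and average linkage; the slides do not state which algorithm or distance was used. The four three-variable profiles are invented, and real data would normally be standardised first.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclid(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Invented feature profiles for four texts (three variables each).
profiles = [
    [1.0, 2.0, 1.0],
    [1.1, 2.1, 0.9],
    [5.0, 6.0, 5.0],
    [5.2, 5.9, 5.1],
]

clusters = agglomerate(profiles, 2)
print(clusters)
```

The result lists text indices per cluster, which can then be compared with writing mode or skill level to see whether the clusters line up with those factors.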
57
References
Baayen 2008: Analyzing linguistic data: A practical introduction to statistics using R
Dodge 2010: The concise encyclopedia of statistics
Gries 2009: Statistics for linguistics with R: A practical introduction
Zuur et al. 2009: A beginner’s guide to R
58
Bård Uri Jensen
Hedmark University College (Hamar)
http://www.hihm.no
http://privat.hihm.no/buj
bard.jensen@hihm.no
http://privat.hihm.no/buj/solstrand2010.pdf