Published by Madison Walton. Modified over 9 years ago.
1
Measuring Linguistic Complexity Kristopher Kyle 3-5-2015
2
Who is this guy? Interested in: L2 Writing Quality/Development Assessment Natural Language Processing Productive Vocabulary Productive Syntax
3
Outline of Workshop Why measure linguistic complexity? How can linguistic complexity measures be conceptualized? How do we actually measure linguistic complexity? Hands-on workshop I: Measuring syntactic complexity Hands-on workshop II: From raw data to findings (if time)
4
Why measure linguistic complexity? In the 70’s, SLA researchers (e.g., Larsen-Freeman, 1978) wanted to measure language development Larsen-Freeman proposed three constructs of development: complexity accuracy fluency The general hypothesis (with regard to complexity) has been: As language learners develop, their language will become more complex. How complexity is measured has been the subject of much debate (e.g., Bulté & Housen, 2012)
5
How can linguistic complexity measures be conceptualized? Wolfe-Quintero et al. (1998) provide a compendium of CAF measures up until the late 90’s. Lexical Complexity: a variety of general and part-of-speech-specific type/token ratio counts. Syntactic Complexity: a variety of clause, sentence, and T-unit measures that focus on clausal complexity.
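As a rough illustration of the type/token ratio idea mentioned above, the sketch below counts unique word forms (types) against total words (tokens). The tokenizer and the function names are assumptions made for this example; they are not taken from Wolfe-Quintero et al. or from any particular tool.

```python
import re


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Return unique word types divided by total word tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Assumed sample sentence: 7 tokens, 6 types ("pizza" appears twice).
sample = "I eat pizza because pizza is delicious"
ttr = type_token_ratio(sample)
print(round(ttr, 3))
```

A simple ratio like this is sensitive to text length, so comparisons are most meaningful between texts of similar size.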
6
How can linguistic complexity measures be conceptualized? Most syntactic complexity indices are ratio scores: (Structure A)/(Structure B). The denominator (Structure B) is one of: clause: a main verb and its dependents (I eat pizza.) T-unit: an independent clause and any attached dependent clauses (I eat pizza because it is delicious.) sentence: a string of words that starts with a capital letter and ends with sentence-ending punctuation (I think you know what a sentence is.)
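The (Structure A)/(Structure B) idea can be sketched with hand counts for the T-unit example above, "I eat pizza because it is delicious." The counts and the helper name ratio_index are assumptions for illustration only.

```python
def ratio_index(numerator_count, denominator_count):
    """Return (Structure A) / (Structure B), or 0.0 when Structure B never occurs."""
    if denominator_count == 0:
        return 0.0
    return numerator_count / denominator_count


# Hand counts (assumed) for: "I eat pizza because it is delicious."
counts = {"clauses": 2, "dependent_clauses": 1, "t_units": 1, "sentences": 1}

clauses_per_t_unit = ratio_index(counts["clauses"], counts["t_units"])
dependent_clauses_per_clause = ratio_index(counts["dependent_clauses"], counts["clauses"])

print(clauses_per_t_unit)            # 2.0
print(dependent_clauses_per_clause)  # 0.5
```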
7
How can linguistic complexity measures be conceptualized? The numerator (Structure A) has included many structures: clauses dependent clauses adverbial clauses T-units complex T-units coordinate phrases complex nominals verb phrases passives
8
How can linguistic complexity measures be conceptualized? Length of unit measures have also been prominent (e.g., Ortega, 2003; Lu, 2011). Mean length of clause (MLC) Mean length of T-unit (MLTU) Mean length of sentence (MLS)
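Length-of-unit measures divide the number of words by the number of units. The sketch below assumes the sentences are already segmented and that clause and T-unit counts come from hand coding; the word-counting rule and all names are illustrative assumptions.

```python
import re


def count_words(text):
    """Count word tokens, treating runs of letters and apostrophes as words."""
    return len(re.findall(r"[A-Za-z']+", text))


def mean_length(word_count, unit_count):
    """Return words per unit, or 0.0 when there are no units."""
    return word_count / unit_count if unit_count else 0.0


# Two already-segmented sentences (assumed sample data).
sentences = ["I eat pizza.", "I eat pizza because it is delicious."]
# Hand-coded unit counts (assumed): 3 clauses and 2 T-units across both sentences.
clause_count = 3
t_unit_count = 2

words = sum(count_words(s) for s in sentences)  # 3 + 7 = 10 words

mlc = mean_length(words, clause_count)     # mean length of clause
mltu = mean_length(words, t_unit_count)    # mean length of T-unit
mls = mean_length(words, len(sentences))   # mean length of sentence

print(round(mlc, 2), mltu, mls)
```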
9
How can linguistic complexity measures be conceptualized? The rise of phrasal complexity: Biber, Gray, and Poonpon (2011) suggested that clausal subordination (i.e., what most syntactic complexity indices measure) is NOT a prominent feature of academic writing. Informal speech includes many dependent clauses, but academic writing includes many dependent phrases (and especially noun phrases).
10
How can linguistic complexity measures be conceptualized? Some important issues: Definition of measures: What counts as a clause? Prominence of broad indices: What does MLC really tell us about development? Often only a limited range of measures is used.
11
How do we actually measure linguistic complexity? To measure linguistic complexity, we have two options. Option #1: Count features by hand Option #2: Count features using a computer
12
How do we actually measure linguistic complexity? Advantages of Option 1: Researcher has full control over how syntactic complexity is measured. Human counts may be more accurate Disadvantages of Option 1: Expensive! Intra-rater reliability Inter-rater reliability – who is qualified?
13
How do we actually measure linguistic complexity? Advantages of Option 2: Very cheap Reliable (same results every time) Usually accurate: Biber (e.g., 2004) and Lu (2010, 2011) report accuracies above 90% Can analyze a broad range of indices at once. Disadvantages of Option 2: Researcher has less control (is at the mercy of available programs) Some data is not well-suited to automatic analysis Some linguistic features cannot be reliably captured
14
Hands-on workshop I: Measuring syntactic complexity Go to www.kristopherkyle.com/workshop/ and download the “short_samples.zip” file. Without talking with your neighbor(s), fill in the included Excel sheet for examples 1-5. What were your answers? Any issues with example 5? Now do the same for example 6…
15
Hands-on workshop I: Measuring syntactic complexity Tool for the Automatic Analysis of Syntactic Complexity (TAASC) Prototype!!! Includes indices created by Xiaofei Lu (Syntactic Complexity Analyzer; Lu, 2011) Also includes some replications of the Biber Tagger
16
Hands-on workshop I: Measuring syntactic complexity How TAASC works: Reads the file. Splits the file into sentences. Parses each sentence (uses the Stanford Parser). Uses regular expressions (a way to search for patterns) to identify particular structures in the parse tree (uses Stanford Tregex: regular expressions for parse trees).
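The four steps above can be sketched in a few lines. This is not TAASC's code: the parsing step is replaced by a lookup table of hand-written Penn Treebank style parses (HAND_PARSES, an assumption standing in for the Stanford Parser), the sentence splitter is deliberately naive, and a plain regular expression stands in for Tregex. All function names are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Hand-written parses standing in for parser output (assumption, not real parser output).
HAND_PARSES = {
    "I eat pizza.":
        "(ROOT (S (NP (PRP I)) (VP (VBP eat) (NP (NN pizza))) (. .)))",
    "I eat pizza because it is delicious.":
        "(ROOT (S (NP (PRP I)) (VP (VBP eat) (NP (NN pizza)) "
        "(SBAR (IN because) (S (NP (PRP it)) (VP (VBZ is) (ADJP (JJ delicious)))))) (. .)))",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def lookup_parse(sentence):
    """Placeholder for a real parser: return a stored parse, or an empty tree."""
    return HAND_PARSES.get(sentence, "(ROOT)")


def count_label(parse, label):
    """Count nodes with exactly this label in a bracketed parse string."""
    return len(re.findall(r"\(" + re.escape(label) + r"\s", parse))


def process_file(path, labels=("S", "VP"), parse=lookup_parse):
    """Read a file, split it into sentences, parse each one, and count node labels."""
    text = Path(path).read_text(encoding="utf-8")
    sentences = split_sentences(text)
    parses = [parse(s) for s in sentences]
    result = {"sentences": len(sentences)}
    for label in labels:
        result[label] = sum(count_label(p, label) for p in parses)
    return result


# Demo on a temporary file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as folder:
    sample_path = Path(folder) / "sample.txt"
    sample_path.write_text("I eat pizza. I eat pizza because it is delicious.", encoding="utf-8")
    result = process_file(sample_path)

print(result)
```

Counting bare labels is much cruder than a Tregex pattern, which can also test relations between nodes; the sketch only shows how the stages connect.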
17
Hands-on workshop I: Measuring syntactic complexity
Now, let’s check to see if your computer is set up correctly.
First, search for Terminal (Mac) or Command Prompt (Windows)
Then type: java -version
Then type: python
Go to www.kristopherkyle.com/workshop/ and download the appropriate version of TAASC (Windows or Mac).
Extract it to your Desktop
Copy the example files to the “to_process_2” folder
18
Hands-on workshop I: Measuring syntactic complexity
Now, in Terminal/Command Prompt type:
cd [location of TAASC folder] (then press “return”)
python [name of the appropriate TAASC program] (“return”)
Your results should now be in a file called “results.csv”
If you want to examine the accuracy of the parse trees, look in the folder “parsed_files” using Tregex
19
Hands-on workshop I: Measuring syntactic complexity
Some simple patterns:
VP
VP<S
Some important patterns:
clause: S|SINV|SQ <<# MD|VBP|VBZ|VBD
T-unit: S|SBARQ|SINV|SQ > ROOT | [$-- S|SBARQ|SINV|SQ !>> SBAR|VP]
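To make the two simple patterns concrete: in Tregex, VP matches any node labelled VP, and VP<S matches a VP that immediately dominates (has as a direct child) an S. The sketch below imitates just those two checks in plain Python on a hand-written parse; it is not Tregex, and the tree reader, function names, and example parse are assumptions for illustration.

```python
import re


def parse_tree(bracketed):
    """Read a Penn-style bracketed string into nested (label, children) tuples.

    Child nodes are tuples; words (leaves) are plain strings.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", bracketed)
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return read_node()


def iter_nodes(node):
    """Yield the node and every labelled node below it."""
    yield node
    for child in node[1]:
        if isinstance(child, tuple):
            yield from iter_nodes(child)


def with_label(tree, label):
    """Imitate the pattern `VP`: every node carrying the given label."""
    return [n for n in iter_nodes(tree) if n[0] == label]


def immediately_dominates(tree, parent_label, child_label):
    """Imitate `A < B`: nodes labelled A with a direct child labelled B."""
    return [
        n for n in iter_nodes(tree)
        if n[0] == parent_label
        and any(isinstance(c, tuple) and c[0] == child_label for c in n[1])
    ]


# Hand-written parse (assumed) of "I want to eat pizza."
example = (
    "(ROOT (S (NP (PRP I)) (VP (VBP want) "
    "(S (VP (TO to) (VP (VB eat) (NP (NN pizza)))))) (. .)))"
)
tree = parse_tree(example)

print(len(with_label(tree, "VP")))                  # 3
print(len(immediately_dominates(tree, "VP", "S")))  # 1
```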
20
Hands-on workshop II: From raw data to findings Go to www.kristopherkyle.com/workshop/ and download the “Workshop_Data.zip” file. 58 participants, three timed essays over 1 year. IEP Levels 3-4 (Intermediate/Advanced) Now let’s analyze some data! NOTE: We didn’t get to this in class…