A Probabilistic Representation of Systemic Functional Grammar Robert Munro Department of Linguistics, SOAS, University of London.


2 Outline
- Introduction
- Functions in the nominal group
- Machine learning
- Testing framework
- Classification vs unmarked function
- Gradational realization
- Delicacy
- Conclusions

3 Introduction
An exploration of the ability of machine learning to learn and represent functional categories as fundamentally probabilistic.
Gauged in terms of the ability to:
- computationally learn functions from labeled examples and apply them to new texts
- represent functions probabilistically: a gradation of potential realization between categories
- explore finer layers of delicacy

4 Functions in the nominal group
Functions: Deictic, Ordinative, Quantitative, Epithet, Classifier, Thing (Halliday 1994)

Deictic  Ordin.  Quant.  Epith.  Class.  Thing
The      first   three   tasty   red     wines

Gradations:
- Here, "red" also functions as an Epithet.
- The uptake of such marked classifiers will not be uniform.
- Overlap does not necessarily limit significance.

5 Machine Learning
Machine learning: computational inference from specific examples.
A learner named Seneschal was developed for the task here. It:
- is probabilistic
- seeks sub-categories (improves both classification and analysis)
- allows categories to overlap
- is not too dependent on the size of the data set

6 Machine Learning
[Figure: plot on axes x and y, with unlabeled examples marked "?"]
The task here: given two categories with known values for x and y, infer a probabilistic model (potentially with sub-categories) that can classify new examples.
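The classification step of this task can be illustrated with a small Python sketch. It is not Seneschal, whose internals are not given in these slides: it assumes each category is already described by one or more Gaussian sub-categories with diagonal variance, assumes equal priors over categories, and uses made-up parameters. The names `gaussian`, `category_posteriors` and `model` are hypothetical.

```python
import math

def gaussian(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def category_posteriors(point, model):
    """Return P(category | point) for a 2-D point.

    model maps each category to a list of sub-categories, each given as
    (weight, (mean_x, mean_y), (var_x, var_y)). Equal category priors
    are assumed for this sketch.
    """
    scores = {}
    for category, clusters in model.items():
        scores[category] = sum(
            w * gaussian(point[0], m[0], v[0]) * gaussian(point[1], m[1], v[1])
            for w, m, v in clusters
        )
    total = sum(scores.values())
    if total == 0:
        # Point is too far from every sub-category to score; fall back to uniform.
        return {c: 1 / len(scores) for c in scores}
    return {c: s / total for c, s in scores.items()}

# Illustrative parameters only: category "A" has two sub-categories, "B" has one.
model = {
    "A": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (4.0, 4.0), (1.0, 1.0))],
    "B": [(1.0, (2.0, 2.0), (1.0, 1.0))],
}
near_a = category_posteriors((0.0, 0.0), model)
near_b = category_posteriors((2.0, 2.0), model)
```

Because the result is a probability for every category rather than a single label, a point lying between sub-categories keeps a non-trivial probability for both, which is what allows categories to overlap.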

7 Machine Learning
It is important that attributes (x, y, z, ...):
- represent features that distinguish functions
- can be discovered automatically (for large scales)
- are meaningful for analysis…?
Compared to manually constructed parsers, machine learning allows:
- greater scales than are practical manually
- more features/dimensions than are possible manually (100s are common)

8 Testing Framework
The model was learned from 10,000 labeled words from Reuters sports newswires, using the following features:
- part-of-speech and its context
- punctuation
- group / phrase contexts
- collocational tendencies
- probability of repetition
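As a rough illustration of the first two feature types (part-of-speech with its context, and punctuation), the following Python sketch builds a feature dictionary for one token. The tag names, the one-token context window and the `token_features` helper are assumptions made for this example; the slides do not specify how the features were actually encoded, and the remaining feature types are not covered here.

```python
def token_features(tagged, i):
    """Build simple features for the token at index i.

    tagged is a list of (word, pos) pairs using hypothetical coarse tags.
    """
    word, pos = tagged[i]
    prev_pos = tagged[i - 1][1] if i > 0 else "<START>"
    next_pos = tagged[i + 1][1] if i + 1 < len(tagged) else "<END>"
    return {
        "pos": pos,                        # part-of-speech of the word itself
        "prev_pos": prev_pos,              # POS context to the left
        "next_pos": next_pos,              # POS context to the right
        "next_is_punct": next_pos == "PUNCT",
        "is_capitalized": word[:1].isupper(),
    }

tagged = [("the", "DET"), ("gold", "NOUN"), ("medal", "NOUN"), (".", "PUNCT")]
features = token_features(tagged, 2)
```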

9 Testing Framework
Accuracy: the ability to correctly identify the dominant function in 4 test corpora (1,000 words each):
1. Reuters Sports Newswires (1996)
2. Reuters Sports Newswires (2003)
3. Bio-informatics abstracts
4. Extract from Virginia Woolf's The Voyage Out
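Accuracy in this sense is simply the fraction of words whose predicted dominant function matches the labeled one. A minimal Python sketch follows, with a hypothetical `accuracy` helper and a made-up six-word example rather than data from the study:

```python
def accuracy(gold, predicted):
    """Fraction of words whose predicted dominant function matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty corpus")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)

# Illustrative labels only: one of six words is misclassified.
gold = ["Deictic", "Ordinative", "Quantitative", "Epithet", "Classifier", "Thing"]
predicted = ["Deictic", "Ordinative", "Quantitative", "Epithet", "Epithet", "Thing"]
score = accuracy(gold, predicted)
```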

10 Testing Framework
Gradational model of realization: calculated as the probability of a word realizing other functions, averaged across all clusters.
Finer layers of delicacy: manual analysis of clusters found within a function.
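One possible reading of this calculation is sketched below in Python. It assumes the learner gives, for a word, a membership probability for every cluster, and that each cluster belongs to exactly one function; the per-function score is then the mean over that function's clusters, renormalized to sum to 1. The exact averaging used in the study is not given in the slides, so the `gradational_profile` helper, the cluster names and the numbers are all illustrative assumptions.

```python
from collections import defaultdict

def gradational_profile(cluster_probs, cluster_function):
    """Average a word's cluster probabilities per function, then normalize.

    cluster_probs: {cluster_name: probability of the word belonging to it}
    cluster_function: {cluster_name: function that the cluster belongs to}
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cluster, prob in cluster_probs.items():
        function = cluster_function[cluster]
        totals[function] += prob
        counts[function] += 1
    means = {f: totals[f] / counts[f] for f in totals}
    norm = sum(means.values())
    if norm == 0:
        raise ValueError("all cluster probabilities are zero")
    return {f: m / norm for f, m in means.items()}

# Made-up memberships for the word "red" in "red wines".
cluster_function = {"classifier_1": "Classifier", "classifier_2": "Classifier",
                    "epithet_1": "Epithet"}
cluster_probs = {"classifier_1": 0.6, "classifier_2": 0.2, "epithet_1": 0.2}
profile = gradational_profile(cluster_probs, cluster_function)
```

With these numbers the word is mostly a Classifier but retains a sizeable Epithet component, which is the kind of gradation the model is meant to expose.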

11 Unmarked function
Unmarked function: function defined by only part-of-speech (POS) and word order, e.g. adjective = Epithet, non-final noun = Classifier.
Previous functional parsers have assumed that most instances are unmarked:
- POS taggers are almost 100% accurate
- word order is trivial
…so the problem is solved?
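The unmarked assumption amounts to a lookup on POS and position, which can be sketched in Python. Only the two rules "adjective = Epithet" and "non-final noun = Classifier" come from the slide; the other mappings, the coarse tag names and the `unmarked_functions` helper are assumptions for illustration, and the nominal group is assumed to be already delimited.

```python
# Hypothetical coarse POS tags mapped to their unmarked functions.
UNMARKED = {
    "DET": "Deictic",       # assumed mapping
    "ORD": "Ordinative",    # assumed mapping
    "NUM": "Quantitative",  # assumed mapping
    "ADJ": "Epithet",       # from the slide: adjective = Epithet
}

def unmarked_functions(group):
    """Label each (word, pos) pair in one nominal group by POS and word order only."""
    labels = []
    last = len(group) - 1
    for i, (word, pos) in enumerate(group):
        if pos == "NOUN":
            # From the slide: non-final noun = Classifier; final noun assumed to be Thing.
            labels.append("Thing" if i == last else "Classifier")
        else:
            labels.append(UNMARKED.get(pos, "Unknown"))
    return labels

group = [("the", "DET"), ("first", "ORD"), ("three", "NUM"),
         ("tasty", "ADJ"), ("red", "ADJ"), ("wines", "NOUN")]
labels = unmarked_functions(group)
```

On this group the baseline labels "red" as an Epithet because it is an adjective, although in "red wines" it realizes a Classifier. That is exactly the kind of marked instance the POS-only rule cannot capture.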

12 Unmarked function
This is a false assumption. Across the corpora:
- < 40% of non-final adjectives realized Epithets
- < 50% of Classifiers were nouns
- 44% of Classifiers were marked!

13 Unmarked function This task halved the classification error:

14 Gradational Realization
Nominal functions are typically represented deterministically:

Deictic  Ordin.  Quant.  Epith.  Class.  Thing
The      first   three   tasty   red     wines

although they are described as probabilistic, with relationships existing between all functions:
[Figure: the functions Deictic, Ordin., Quant., Epith., Class. and Thing]

15 Delicacy
[Figure: tree of more delicate nominal-group functions. Node labels: Deictic, Numerative, Epithet, Expansive, Thing, Demonstrative, Possessive, Ordinative, Quantitative, Hyponymic, Classifier, First Name, Intermediary, Last Name, non-Nom., Stated, Described, Discursive, Nominative, Named Entity, Group-Releasing, Nominal, Tabular]

16 Delicacy
More delicate functions for Classifiers (Matthiessen 1995):
- Hyponymic: describing a taxonomy or general type-of relationship, e.g. "red wine", "gold medal", "neural network architecture"
- Expansive: expands the description of the Head, e.g. "knee surgery", "optimization problems", "sprint champion"

17 Delicacy

18 Delicacy
More delicate descriptions can be found with:
- more features
- more instances / registers
- other algorithms / parameters
The methodology can be applied to:
- other parts of a grammar
- other languages

19 Conclusions
- Gradational modeling of functional realization is desirable.
- Sophisticated methods are necessary for computationally modeling functions: markedness is common.
- Machine learning is a useful tool and participant in linguistic analysis.

20 Thank you
Acknowledgments: Geoff Williams, Sanjay Chawla
The slides and extended paper will be published at: