Minimally Supervised Morphological Analysis by Multimodal Alignment. David Yarowsky and Richard Wicentowski.

Presentation transcript:

1 Minimally Supervised Morphological Analysis by Multimodal Alignment David Yarowsky and Richard Wicentowski

2 Introduction The algorithm is capable of inducing inflectional morphological analyses of both regular and highly irregular forms. It combines four original alignment models based on: Relative corpus frequency. Contextual similarity. Weighted string similarity. Incrementally retrained inflectional transduction probabilities.

3 Lecture's Subjects Task definition. Required and optional resources. The algorithm. Empirical evaluation.

4 Task Definition Consider this task as three steps: (1) Estimate a probabilistic alignment between inflected forms and root forms. (2) Train a supervised morphological analysis learner on a weighted subset of these aligned pairs. (3) Use the result of step 2 to iteratively refine the alignment from step 1.

5 Example (POS) Definitions:

6 Task Definition cont. The target output of step 1:

7 Required and Optional resources For the given language we need: A table of the inflectional parts of speech (POS). A list of the canonical suffixes. A large text corpus.

8 Required and Optional resources cont. A list of candidate noun, verb, and adjective roots (from a dictionary), plus any rough mechanism, not based on morphological analysis, for identifying the candidate POS of the remaining vocabulary. A list of the consonants and vowels.

9 Required and Optional resources cont. A list of common function words. Distance/similarity tables generated on previously studied languages. (Not essential; used if available.)

10 The Algorithm Combines four original alignment models: Alignment by Frequency Similarity. Alignment by Context Similarity. Alignment by Weighted Levenshtein Distance. Alignment by Morphological Transformation Probabilities.

11 Lemma Alignment by Frequency Similarity The motivating dilemma: sing → singed (VBD)? sing → sang (VBD)? take → taked (VBD)?

12 Lemma Alignment by Frequency Similarity cont. This Table is based on relative corpus frequency:

13 Lemma Alignment by Frequency Similarity cont.

14 Lemma Alignment by Frequency Similarity cont. A problem: the true alignments between inflections and roots are unknown in advance. A simplifying assumption: the frequency ratios between inflections and roots are not significantly different between regular and irregular morphological processes.

15 Lemma Alignment by Frequency Similarity cont. Similarity between regular and irregular forms:

16 Lemma Alignment by Frequency Similarity cont. The expected frequency should also be estimable from the frequency of any of the other inflectional variants. VBD/VBG and VBD/VBZ could also be used as estimators.
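The frequency-ratio idea on the slides above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the corpus counts, the expected mean and spread of the log ratio, and the Gaussian-shaped scoring function are all hypothetical choices made for the example.

```python
import math

def log_ratio(inflection_count, root_count):
    """Log of the inflection/root frequency ratio, e.g. log(VBD/VB)."""
    return math.log(inflection_count / root_count)

def frequency_score(candidate_log_ratio, expected_mean, expected_std):
    """Score in (0, 1]: how close a candidate pair's log ratio is to the
    ratio expected for this inflection type (Gaussian shape, an assumption)."""
    z = (candidate_log_ratio - expected_mean) / expected_std
    return math.exp(-0.5 * z * z)

# Made-up toy counts, for illustration only.
counts = {"sing": 1200, "sang": 1100, "singed": 10}

# Hypothetical expectation for log(VBD/VB), imagined as estimated from
# pairs that are already confidently aligned.
EXPECTED_MEAN, EXPECTED_STD = 0.0, 1.0

score_sang = frequency_score(
    log_ratio(counts["sang"], counts["sing"]), EXPECTED_MEAN, EXPECTED_STD)
score_singed = frequency_score(
    log_ratio(counts["singed"], counts["sing"]), EXPECTED_MEAN, EXPECTED_STD)

print(score_sang > score_singed)
```

Under these toy numbers, "sang" is a far more plausible past tense of "sing" than "singed", because its frequency ratio is close to the expected one.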

17 Lemma Alignment by Frequency Similarity cont.

18 Lemma Alignment by Context Similarity Based on contextual similarity of the candidate form. Computing similarity between vectors of weighted and filtered context features. Clustering inflectional variants of verbs (e.g. sipped, sipping, and sip).

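19 Lemma Alignment by Context Similarity cont. Example: the pattern CW_subj (AUX|NEG)* V_keyword DET? CW* CW_obj applied to "Shlomo is eating the apple" gives CW_subj = Shlomo, AUX = is, V_keyword = eating, DET = the, CW_obj = apple.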
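Comparing context vectors can be sketched with plain cosine similarity. The slides do not specify the weighting, filtering, or similarity measure, so the raw counts and the cosine measure below are assumptions, and the context words are invented toy data.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse context-count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Invented context counts (e.g. object and subject words seen near each form).
contexts = {
    "sipped": Counter({"tea": 3, "coffee": 2, "slowly": 1}),
    "sip":    Counter({"tea": 4, "coffee": 1, "water": 2}),
    "sit":    Counter({"chair": 5, "down": 3}),
}

sim_sip = cosine(contexts["sipped"], contexts["sip"])
sim_sit = cosine(contexts["sipped"], contexts["sit"])
print(sim_sip > sim_sit)
```

With this toy data, "sipped" shares context words with "sip" and none with "sit", so "sip" ranks higher as a candidate root.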

20 Lemma Alignment by Weighted Levenshtein Distance Consider the overall stem edit distance. A cost matrix holds the distance costs, initially set to (0.5, 0.6, 1.0, 0.98).
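A weighted edit distance of this kind might look like the sketch below. The slide does not say which edit types the four initial costs belong to; the mapping used here (vowel-vowel 0.5, consonant-consonant 1.0, consonant-vowel 0.98) is an assumption, the 0.6 cost is left unused, and the insertion/deletion cost of 1.0 and the English vowel set are also assumed.

```python
VOWELS = set("aeiou")  # assumed vowel list for the example

def sub_cost(a, b, vv=0.5, cc=1.0, cv=0.98):
    """Substitution cost by character class (class-to-cost mapping is assumed)."""
    if a == b:
        return 0.0
    a_vowel, b_vowel = a in VOWELS, b in VOWELS
    if a_vowel and b_vowel:
        return vv
    if not a_vowel and not b_vowel:
        return cc
    return cv

def weighted_levenshtein(s, t, indel=1.0):
    """Standard dynamic-programming edit distance with class-based costs."""
    prev = [j * indel for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [i * indel]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + indel,               # delete a
                           cur[j - 1] + indel,            # insert b
                           prev[j - 1] + sub_cost(a, b))) # substitute or match
        prev = cur
    return prev[-1]

print(weighted_levenshtein("sing", "sang"))
print(weighted_levenshtein("sing", "ring"))
```

Because a vowel change is cheaper than a consonant change under these costs, "sang" ends up closer to "sing" than "ring" does, which is the behavior wanted for stem-internal vowel alternations.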

21 Lemma Alignment by Morphological Transformation Probabilities The goal is to generalize a mapping function via a generative probabilistic model.

22 Lemma Alignment by Morphological Transformation Probabilities Result table:

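23 Lemma Alignment by Morphological Transformation Probabilities cont. root + stemchange + suffix → inflection, where the stem change is unique, so $P(\text{inflection} \mid \text{root}, \text{suffix}, \text{POS}) = P(\text{stemchange} \mid \text{root}, \text{suffix}, \text{POS})$.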

24 Lemma Alignment by Morphological Transformation Probabilities cont. Example:

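25 Lemma Alignment by Morphological Transformation Probabilities cont. Example:
$$P(\text{solidified} \mid \text{solidify}, +\text{ed}, \text{VBD}) = P(y \to i \mid \text{solidify}, +\text{ed}, \text{VBD})$$
$$\approx \lambda_1 P(y \to i \mid \text{ify}, +\text{ed}) + (1-\lambda_1)\Big(\lambda_2 P(y \to i \mid \text{fy}, +\text{ed}) + (1-\lambda_2)\big(\lambda_3 P(y \to i \mid \text{y}, +\text{ed}) + (1-\lambda_3)\big(\lambda_4 P(y \to i \mid +\text{ed}) + (1-\lambda_4) P(y \to i)\big)\big)\Big)$$
Note: the POS condition can be deleted.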
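The nested interpolation in the example above can be evaluated from the inside out. The sketch below assumes the component probabilities are already estimated and listed from the most specific context (ending "ify") to the unconditioned one; the function name and all numeric values are hypothetical.

```python
def backoff_probability(probs, weights):
    """Interpolated backoff.

    probs:   estimates ordered from most specific to least specific context,
             e.g. [P(.|ify,+ed), P(.|fy,+ed), P(.|y,+ed), P(.|+ed), P(.)].
    weights: interpolation weights, one per level except the last,
             so len(weights) == len(probs) - 1.
    """
    if len(weights) != len(probs) - 1:
        raise ValueError("need exactly one weight per non-final level")
    estimate = probs[-1]  # start from the least specific estimate
    for weight, p in zip(reversed(weights), reversed(probs[:-1])):
        estimate = weight * p + (1 - weight) * estimate
    return estimate

# Hypothetical values for the y -> i stem change.
probs = [0.98, 0.95, 0.90, 0.10, 0.02]
weights = [0.6, 0.5, 0.4, 0.3]
p_y_to_i = backoff_probability(probs, weights)
print(p_y_to_i)
```

Setting every weight to 1 returns the most specific estimate unchanged, and setting every weight to 0 falls all the way back to the unconditioned probability, which is a quick sanity check on the nesting.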

26 Lemma Alignment by Model Combination and the Pigeonhole Principle No single model is sufficiently effective on its own. The Frequency, Levenshtein and Context Similarity models retain equal relative weight. The Morphological Transformation Similarity model increases in relative weight.

27 Lemma Alignment by Model Combination and the Pigeonhole Principle Example:

28 Lemma Alignment by Model Combination and the Pigeonhole Principle cont. The final alignment is based on the pigeonhole principle: for a given POS, a root should not have more than one inflection, nor should multiple inflections in the same POS share the same root.
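One simple way to enforce this one-to-one constraint is a greedy pass over candidate pairs in order of combined score. The slides do not say how the constraint is enforced, so the greedy strategy, the function name, and the scores below are assumptions for illustration.

```python
def pigeonhole_align(scores):
    """Greedy one-to-one alignment for a single POS.

    scores: dict mapping (inflection, root) pairs to a combined score.
    Returns a dict inflection -> root in which each inflection and each
    root appears at most once.
    """
    used_inflections, used_roots, alignment = set(), set(), {}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for (inflection, root), _score in ranked:
        if inflection in used_inflections or root in used_roots:
            continue  # this slot is already filled by a better-scoring pair
        alignment[inflection] = root
        used_inflections.add(inflection)
        used_roots.add(root)
    return alignment

# Invented combined scores for past-tense (VBD) candidates.
toy_scores = {
    ("sang", "sing"): 0.9,
    ("singed", "sing"): 0.6,
    ("singed", "singe"): 0.5,
    ("sang", "singe"): 0.1,
}
aligned = pigeonhole_align(toy_scores)
print(aligned)
```

Here "singed" scores higher with "sing" than with "singe" when looked at alone, but once "sang" has claimed "sing" the constraint pushes "singed" to "singe".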

29 Empirical Evaluation Performance: