Seed Generation and Seeded Version Space Learning Version 0.02 Katharina Probst Feb 28, 2002


Seed Generation

Type of Information    Source of Information
SL, TL sentence        Informant
Alignment              Informant
Phrase Information     Elicitation corpus, same as SL on TL
SL POS sequence        English parse (c,f)
TL POS sequence        English parse, TL dictionary
X-side constraints     English parse (f)
Y-side constraints     English parse, list of projecting features, TL dictionary (later: feature detection)
XY constraints         ---

Clustering
► Seed rules are “clustered” into groups that warrant an attempt to merge
► Clustering criteria: POS sequences, phrase information, alignments
► Main reason for clustering: divide the large version space into a number of smaller version spaces and run the algorithm on each version space separately
► Possible danger: rules that should be considered together (such as “the man”, “men”) will not be
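As a rough illustration (not the actual AVENUE implementation), clustering by identical POS sequences, phrase information, and alignments amounts to a group-by on those three fields. The rule representation below (dicts with "pos_seq", "phrase", and "alignments" keys) is an assumption made for the sketch:

    from collections import defaultdict

    def cluster_seed_rules(seed_rules):
        # Group seed rules that share POS sequence, phrase information,
        # and alignments; each cluster defines one version space.
        clusters = defaultdict(list)
        for rule in seed_rules:
            key = (tuple(rule["pos_seq"]),          # e.g. ("DET", "N")
                   rule["phrase"],                  # e.g. "NP"
                   frozenset(rule["alignments"]))   # e.g. {(1, 1), (2, 2)}
            clusters[key].append(rule)
        return list(clusters.values())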

The Version Space
► A set of seed rules in a cluster defines a version space as follows: the seed rules form the specific boundary (S); a virtual rule with the same POS sequences, alignments, and phrase information, but no constraints, forms the general boundary (G).
► G boundary: virtual rule with no constraints
► S boundary: seed rules
► Between the two boundaries lie generalizations of the seed rules: more general than the seeds, but more specific than the rule in G
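In a minimal sketch (representation assumed, not AVENUE's), each rule's constraints can be held in a set, and the two boundaries then fall out directly:

    def version_space(cluster):
        # S boundary: the seed rules themselves (here, their constraint sets).
        # G boundary: one virtual rule with the same POS sequences, alignments,
        # and phrase information, but an empty constraint set.
        s_boundary = [frozenset(rule["constraints"]) for rule in cluster]
        g_boundary = frozenset()
        return s_boundary, g_boundary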

The partial ordering of rules in the version space
► A rule TR2 is said to be strictly more general than another rule TR1 if the set of f-structures that satisfy TR2 is a proper superset of the set of f-structures that satisfy TR1. It is said to be equivalent to TR1 if the set of f-structures that satisfy TR1 is the same as the set of f-structures that satisfy TR2.
► We have defined three operations that move a transfer rule to a strictly more general rule.
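Under the simplifying assumption that generalization only deletes constraints (Operations 1 and 2 on the next slide), the ordering reduces to set inclusion on constraint sets; Operation 3 rewrites constraints rather than deleting them, so this subset test is an approximation of the full definition, not a substitute for it:

    def strictly_more_general(tr2, tr1):
        # TR2 is strictly more general than TR1 if every f-structure that
        # satisfies TR1 also satisfies TR2, and at least one more does.
        # With deletion-only generalization this is a proper-subset test.
        return tr2 < tr1          # proper subset of constraint sets

    def equivalent(tr2, tr1):
        # Equivalent rules are satisfied by exactly the same f-structures.
        return tr2 == tr1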

Generalization operations
► Operation 1: delete a value constraint, e.g. ((X1 agr) = *3pl) → NULL
► Operation 2: delete an agreement constraint, e.g. ((X1 agr) = (X2 agr)) → NULL
► Operation 3: merge two value constraints into an agreement constraint, e.g. ((X1 agr) = *3pl), ((X2 agr) = *3pl) → ((X1 agr) = (X2 agr))
[Note: if the first index is an X index and the second a Y index, this operation should only be performed if the feature is in the list of projecting features]
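The three operations could be sketched as below. The constraint encoding is an assumption made for these sketches: a value constraint ((X1 agr) = *3pl) is the tuple ("val", ("x", 1, "agr"), "*3pl"), and an agreement constraint ((X1 agr) = (X2 agr)) is ("agr", ("x", 1, "agr"), ("x", 2, "agr")):

    def op1_delete_value(constraints, c):
        # Operation 1: ((X1 agr) = *3pl) -> NULL
        return constraints - {c}

    def op2_delete_agreement(constraints, c):
        # Operation 2: ((X1 agr) = (X2 agr)) -> NULL
        return constraints - {c}

    def op3_merge_values(constraints, c1, c2, projecting):
        # Operation 3: ((X1 agr) = *3pl), ((X2 agr) = *3pl)
        #              -> ((X1 agr) = (X2 agr))
        (_, p1, v1), (_, p2, v2) = c1, c2
        if v1 != v2:
            return constraints
        # Note from the slide: an X-index/Y-index merge is allowed only
        # for features in the list of projecting features.
        if p1[0] == "x" and p2[0] == "y" and p1[-1] not in projecting:
            return constraints
        return (constraints - {c1, c2}) | {("agr", p1, p2)}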

Merging two transfer rules
At the heart of the seeded version space learning algorithm is the merging of two transfer rules (TR1 and TR2) into a more general rule (TR3):
1. Insert into TR3 all constraints that appear in both TR1 and TR2, and remove them from TR1 and TR2.
2. Perform all instances of Operation 3 on TR1 and TR2 separately.
3. Repeat step 1.
[Note: Operations 1 and 2 are executed implicitly.]
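Reusing op3_merge_values from the previous sketch, the three steps translate roughly as follows (again a sketch over the assumed encoding, with rules as sets of constraint tuples):

    from itertools import combinations

    def all_op3(constraints, projecting):
        # Perform all instances of Operation 3 within one rule.
        cs = set(constraints)
        changed = True
        while changed:
            changed = False
            values = [c for c in cs if c[0] == "val"]
            for c1, c2 in combinations(values, 2):
                merged = op3_merge_values(cs, c1, c2, projecting)
                if merged != cs:
                    cs, changed = merged, True
                    break
        return cs

    def merge_rules(tr1, tr2, projecting):
        c1, c2 = set(tr1), set(tr2)
        tr3 = c1 & c2                     # step 1: shared constraints to TR3
        c1, c2 = c1 - tr3, c2 - tr3
        c1 = all_op3(c1, projecting)      # step 2: Operation 3 on TR1 ...
        c2 = all_op3(c2, projecting)      # ... and on TR2, separately
        tr3 |= c1 & c2                    # step 3: repeat step 1
        # Operations 1 and 2 happen implicitly: constraints still unshared
        # after step 3 are simply dropped from TR3.
        return tr3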

Seeded Version Space Algorithm
1. Remove duplicate rules from the S boundary.
2. Try to merge each pair of transfer rules.
3. A merge is successful only if the CSet of the merged rule is a superset of the union of the CSets of the two unmerged rules, where the CSet of a rule denotes the set of training sentences that are “covered”, i.e. translated correctly, by the rule.
4. Pick the successful merge that optimizes an evaluation criterion.
5. Repeat until no more merges are found.
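Putting the pieces together, the loop might look like the sketch below; covers(rule, sentence) is an assumed oracle checking whether a rule translates a training sentence correctly, and merge_rules is from the previous sketch:

    from itertools import combinations

    def seeded_version_space(seeds, training, covers, projecting):
        # 1. Remove duplicate rules from the S boundary.
        rules = list({frozenset(r) for r in seeds})

        def cset(rule):
            # CSet: training sentences the rule covers.
            return {s for s in training if covers(rule, s)}

        while True:                                       # 5. repeat ...
            best = None
            for i, j in combinations(range(len(rules)), 2):   # 2. each pair
                merged = frozenset(merge_rules(rules[i], rules[j], projecting))
                # 3. Successful only if no coverage is lost ...
                if cset(merged) >= cset(rules[i]) | cset(rules[j]):
                    # 4. ... so set size is the criterion, and any
                    # successful merge shrinks the set by one.
                    best = (i, j, merged)
            if best is None:
                return rules                              # ... until no merge found
            i, j, merged = best
            rules = [r for k, r in enumerate(rules) if k not in (i, j)]
            rules.append(merged)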

Evaluating a set of transfer rules
► Initial thought: evaluate a set based on the coverage of its rules (i.e. the union of their CSets) and the size of the rule set
► Goal: maximize coverage and minimize set size
► Currently: merges are only successful if there is no loss in coverage, so the size of the rule set is the only criterion used
► Future (1): coverage should be measured on a test set
► Future (2): relax the constraint that a successful merge cannot result in loss of coverage
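The criterion stated on this slide could be scored as below (a sketch; larger tuples are better under Python's lexicographic comparison, so coverage dominates and ties go to the smaller rule set):

    def score(rule_set, training, covers):
        # Coverage: the union of the rules' CSets over the training data.
        coverage = {s for r in rule_set for s in training if covers(r, s)}
        # Maximize coverage first, then minimize rule-set size.
        return (len(coverage), -len(rule_set))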

Next steps
► Compositionality, integration with the transfer engine
► Exploring the space below the seed rules
► Specializing: we do not want a merge to be a final decision; we want to allow a rule to be “lowered” to a more specific rule
► What is the right inductive bias?