
Weighted Tree Transducers: A Short Introduction. Cătălin Ionuţ Tîrnăucă, Research Group on Mathematical Linguistics, Rovira i Virgili University, Pl. Imperial Tarraco 1, 43005, Tarragona, Spain. 9th of January 2006, Seminar I.

Outline  History  What is a WTT?  Why WTT’s?  Where can we apply WTT’s?  Work plan  References

History
1960s: tree automata and tree languages emerged quite naturally from the view of finite automata as unary algebras (J. R. Büchi and J. B. Wright).
1970: tree transducers were introduced independently by Rounds and Thatcher as a generalization of finite-state transducers:
- top-down tree transducers (root-to-frontier);
- bottom-up tree transducers (frontier-to-root).
1970s: weighted transducers and rational power series were developed by M. P. Schützenberger, S. Eilenberg, A. Salomaa, W. Kuich, J. Berstel, M. Soittola, C. Reutenauer, M. Mohri.

History (II)
1999, 2001: tree series transducers were obtained by generalizing tree transducers to allow tree series as output rather than trees, where a tree series is a mapping from output trees to some semiring (W. Kuich, J. Engelfriet, Z. Fülöp, H. Vogler):
- the semantics is defined in an algebraic style.
2004: weighted tree transducers were introduced by Z. Fülöp and H. Vogler as an alternative approach to tree series transducers:
- the semantics is defined in an operational style.
MAIN RESULT: tree series transducers and weighted tree transducers are semantically equivalent in both the top-down and the bottom-up case.

What is a WTT? A tree transducer is a finite-state machine that computes a tree transformation: given an input tree over the input ranked alphabet, it computes a set of output trees over the output ranked alphabet. Informally, a weighted tree transducer is a tree transducer in which each (term rewriting) rule is associated with a weight taken from a semiring. Along a successful transformation the weights of the rules involved are multiplied and, for every pair of an input tree and an output tree, the weights of its successful transformations are summed up.
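The multiply-along-a-run, sum-over-runs semantics just described can be sketched in code. Below is a minimal illustration over the probability semiring; the states, symbols, rules and weights are all invented for this example, not taken from the slides:

```python
# Toy weighted top-down tree transducer over the probability semiring.
# Trees are tuples: (label, subtree1, subtree2, ...). Rule weights
# multiply along a run; weights of distinct runs producing the same
# output tree add up.

from collections import defaultdict
from itertools import product

# Rules: (state, input_label) -> list of (weight, output_builder, child_states).
# The output_builder receives the transformed children and builds the output.
RULES = {
    ("q", "f"): [
        # swap the two children with weight 0.6 ...
        (0.6, lambda l, r: ("g", r, l), ("q", "q")),
        # ... or keep their order with weight 0.4
        (0.4, lambda l, r: ("g", l, r), ("q", "q")),
    ],
    ("q", "a"): [(1.0, lambda: ("b",), ())],
    ("q", "c"): [(1.0, lambda: ("d",), ())],
}

def transform(state, tree):
    """Return {output_tree: total_weight} over all runs from `state` on `tree`."""
    label, children = tree[0], tree[1:]
    results = defaultdict(float)
    for weight, build, child_states in RULES.get((state, label), []):
        # transform each child independently, then combine every choice
        child_maps = [transform(s, c) for s, c in zip(child_states, children)]
        for combo in product(*(m.items() for m in child_maps)):
            outs = [t for t, _ in combo]
            w = weight
            for _, cw in combo:
                w *= cw                  # multiply weights along the run
            results[build(*outs)] += w   # sum weights over distinct runs
    return dict(results)

out = transform("q", ("f", ("a",), ("c",)))
# ("g", ("d",), ("b",)) gets weight 0.6; ("g", ("b",), ("d",)) gets 0.4
```

Each input tree is thus mapped to a finite set of (output tree, weight) pairs, exactly the behaviour described above.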

What is a WTT? (II)
So, we can say that a WTT combines two extremely powerful tools: tree transducers and weighted transducers.
[Diagram: a weighted tree transducer maps an input tree to pairs (output tree, weight).]

What is a WTT? (III)
Formally, a weighted tree transducer is a triple (M, A, wt) where:
- M is a tree transducer with R being the set of its term rewriting rules;
- A is a semiring;
- wt: R → A is a function which associates with each rule a weight in the semiring A.
There are two types (approaches): top-down and bottom-up.
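Because the weights live in an arbitrary semiring, changing the semiring changes what the computed weight means. A small sketch (the `Semiring` class and all weights are invented for illustration): the probability semiring sums over all runs, while the Viterbi semiring keeps only the best run.

```python
# The same rule weights read differently under different semirings.
# A semiring packages (plus, times, zero, one).

from dataclasses import dataclass
from typing import Callable

@dataclass
class Semiring:
    plus: Callable   # combines weights of alternative runs
    times: Callable  # combines weights along one run
    zero: float      # identity of plus
    one: float       # identity of times

# probability semiring: total probability over all runs
PROB = Semiring(plus=lambda x, y: x + y, times=lambda x, y: x * y,
                zero=0.0, one=1.0)

# Viterbi semiring: weight of the single best run
VITERBI = Semiring(plus=max, times=lambda x, y: x * y, zero=0.0, one=1.0)

def pair_weight(semiring, rule_weights_per_run):
    """Weight of an (input tree, output tree) pair: sum, over all runs,
    of the product of the rule weights used along each run."""
    total = semiring.zero
    for run in rule_weights_per_run:
        w = semiring.one
        for rw in run:
            w = semiring.times(w, rw)
        total = semiring.plus(total, w)
    return total

runs = [[0.6, 0.5], [0.4, 0.5]]   # two runs producing the same output tree
print(pair_weight(PROB, runs))     # 0.5  (0.3 + 0.2)
print(pair_weight(VITERBI, runs))  # 0.3  (best run only)
```

The transducer itself is unchanged; only the interpretation of wt through the semiring operations differs.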

Why WTT's?
- MOTIVATION: NLP (machine translation, speech recognition).
- TOOLS: finite-state automata and transducers.
- IDEA: these machines probabilistically transform input strings into output strings, and they can be quickly assembled to tackle new jobs via generic mathematical operations like composition and forward application.
- PROBLEM: these machines are a bad fit for many important problems that require syntax-sensitive transformations and large-scale reordering.
- SOLUTION: replace finite automata by more powerful tools like weighted (tree) automata, with trees taking the place of strings.
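The "assembled via composition" idea carries over to the weighted setting: when two weighted transductions are composed, weights multiply along each path and sum over the shared intermediate results. A toy sketch with finite relations standing in for transducers (all data below is invented):

```python
# Composing two weighted transductions represented as finite relations.
# R1 maps x -> {(y, w)}, R2 maps y -> {(z, w)}; in the composition the
# weights multiply along a path and sum over all intermediate values y.

from collections import defaultdict

def compose(r1, r2):
    result = defaultdict(lambda: defaultdict(float))
    for x, mids in r1.items():
        for y, w1 in mids.items():
            for z, w2 in r2.get(y, {}).items():
                result[x][z] += w1 * w2   # sum over intermediate y
    return {x: dict(zs) for x, zs in result.items()}

# a toy spelling model composed with a toy translation model
R1 = {"color": {"color": 0.9, "colour": 0.1}}
R2 = {"color": {"couleur": 1.0}, "colour": {"couleur": 1.0}}

print(compose(R1, R2))   # {'color': {'couleur': 1.0}}
```

The point of WTTs is that the same algebra of operations should work when the intermediate objects are trees rather than strings.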

Where can we apply WTT's?
- translation systems (the TREEWORLD project at ISI's Natural Language Group):
  - a more accurate language processing system;
  - a better understanding of how to model language translation more deeply and accurately;
  - syntactic and lexical translation knowledge can still be acquired fully automatically by the machine;
- computational biology;
- text recognition (compression, indexing, pattern matching);
- image processing (filters, image compression);
- speech recognition (speech synthesis, large-vocabulary);
- others…?

Work plan
MAIN GOAL: applications of WTT's in machine translation.
BACKGROUND:
- weighted transducers;
- tree transducers and tree transformations;
- tree series transducers;
- weighted tree transducers;
- analyse the algorithms developed so far with the above formal models.

Work plan (II)
THEORETICAL FOUNDATIONS:
- various types of weighted bottom-up and top-down tree transducers;
- compare weighted tree transformations defined by different types of such transducers;
- consider compositions of WTTs and closure properties of the various classes w.r.t. composition;
- consider decompositions of WTTs of a given type into compositions of WTTs of simpler types.

Work plan (III)
APPLICATIONS:
- design efficient algorithms for generic tree operations;
- design efficient machine learning algorithms for inducing tree automata, tree transducers and probabilities from linguistic data;
- use weighted tree automata and tree transducers to accurately model problems in automatic language processing.

References  Z. Fülöp. A Short Introduction To Tree Transducers. XIX Tarragona Seminar on Formal Syntax and Semantics, FS&S. (2004)  Z. Fülöp, H. Volger. Weighted Tree Transducers. Journal of Automata, Languages and Combinatorics, 9. (2004)  M. Mohri, F. C. N. Pereira, M. Riley. Weighted Finite-State Transducers in Speech Recognition. Computer Speech and Language, 16(1): (2002)  J. Engelfriet, Z. Fülöp, H. Volger. Bottom-up and Top-down Tree Series Transformations. Journal of Automata, Languages and Combinatorics, 7: (2002)  W Kuich. The transducers and formal tree series. Acta Cybernetica, 14(1): (1999)