C SC 620 Advanced Topics in Natural Language Processing Lecture 13 3/4.


Machine Translation
Readings in Machine Translation, Eds. Nirenburg, S. et al. MIT Press
Part 1: Historical Perspective
Reading list:
  – Introduction. Nirenburg, S.
  – 1. Translation. Weaver, W.
  – 3. The Mechanical Determination of Meaning. Reifler, E.
  – 5. A Framework for Syntactic Translation. Yngve, V.
  – 6. The Present Status of Automatic Translation of Languages. Bar-Hillel, Y.

Paper 3: The Mechanical Determination of Meaning. E. Reifler
MT Linguistics
  – MT linguist (vs. traditional linguist)
      Mostly concerned with differences in behavior between a given pair of languages
      Need not adhere strictly to the results of scientific language research
        – When they serve his purpose, he will consider them
        – He will ignore them when an arbitrary treatment of the language material better suits his purpose

Paper 3: The Mechanical Determination of Meaning. E. Reifler
MT Linguistics
  – MT linguist (vs. traditional linguist)
      Practicality is a consideration of the highest order
      First concern is source-target semantic agreement and intelligibility
        – Semantics: a poor relation of linguistics, re-directed to psychologists and philosophers

Paper 3: The Mechanical Determination of Meaning. E. Reifler
The Problem of Editing
  – Pre-editor
      Works with the input language
      Determines the intended nongrammatical meaning
      Annotates input, resolving ambiguity, specifying which lexeme to pick
  – Post-editor
      Works with the output language (only)
      Selects the preferred translation based on output context

Paper 3: The Mechanical Determination of Meaning. E. Reifler
No Editor
  – Fully automatic
  – Or a pre-editor who “instructs the operator of the machine to press a special key, with the result that a mechanical memory selects only output equivalents characteristic of that branch of knowledge”
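
The "special key" idea can be pictured as a lookup restricted by subject field. The following is only a minimal sketch under stated assumptions: the `EQUIVALENTS` table, the domain names and the function `equivalents` are hypothetical and do not come from Reifler's paper.

```python
# Hypothetical memory: each source form lists output equivalents per branch of knowledge.
EQUIVALENTS = {
    "Zug": {"railways": ["train"], "chess": ["move"]},
}

def equivalents(form, domain=None):
    """Return output equivalents of `form`, restricted to `domain` if a key was pressed."""
    senses = EQUIVALENTS.get(form, {})
    if domain is not None:
        # Domain key pressed: only equivalents characteristic of that field survive.
        return list(senses.get(domain, []))
    # No key pressed: every stored equivalent is offered.
    return [e for group in senses.values() for e in group]
```

Calling `equivalents("Zug", "chess")` narrows the choice to the chess sense, while `equivalents("Zug")` leaves the ambiguity for a later stage.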

Paper 3: The Mechanical Determination of Meaning. E. Reifler
Compound Forms
  – The mechanical dissection of complexes and their identification via the identification of their constituents means that practically no complex form, all of whose constituents are prolific and/or productive, needs to be coded into the mechanical memory. Only the prolific and productive constituents need be coded. The increase in the number of mechanical operations which such an arrangement implies will be amply compensated for by a reduction in the size of the memory
  – Examples:
      sea- in seaside, seaboard, seaway
      -s in seas, boards, ways
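
A small sketch of the dissection idea, using the slide's own examples. The sets `STEMS` and `SUFFIXES` and the function `dissect` are hypothetical names chosen for illustration; the longest-prefix-first search is an assumption, not a procedure given in the paper.

```python
# Hypothetical constituent memory: only productive parts are stored, never whole compounds.
STEMS = {"sea", "side", "board", "way"}
SUFFIXES = {"s"}

def dissect(word):
    """Return one split of `word` into stored constituents, or None if none exists."""
    if not word:
        return []
    if word in SUFFIXES:
        # A suffix may only close the word.
        return [word]
    # Try the longest stored stem first, then recurse on the remainder.
    for end in range(len(word), 0, -1):
        head, rest = word[:end], word[end:]
        if head in STEMS:
            tail = dissect(rest)
            if tail is not None:
                return [head] + tail
    return None
```

Four stems and one suffix are enough to identify seaside, seaboard, seaway, seas, boards and ways, which is the memory saving the slide describes; the price is the extra search steps.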

Paper 3: The Mechanical Determination of Meaning. E. Reifler
Compound Forms
  – Three difficulties in extending this analysis
      Meaning of a compound often cannot be inferred from its components
      X-factor: a letter or letter sequence could be part of the preceding as well as the following constituent
        – Example (Russian):
            » Ryb|o|lovu: to a fisherman
            » *Ryb|olovu: to the tin of fishes
      Extemporized, i.e. unpredictable, compounds
        – Examples:
            » Holdability
            » (German) Mit|gift: with/poison; Mitgift: dowry
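
The X-factor difficulty shows up as soon as every possible dissection is enumerated rather than just the first one found. This is a sketch only: the constituent set `PARTS` (transliterated pieces of the Russian example) and the function `all_splits` are hypothetical.

```python
# Hypothetical stored constituents for the Russian example (transliterated).
PARTS = {"ryb", "o", "lovu", "olovu"}

def all_splits(word, parts):
    """Return every way to cut `word` into a sequence of stored constituents."""
    if not word:
        return [[]]
    results = []
    for end in range(1, len(word) + 1):
        head = word[:end]
        if head in parts:
            # Keep every continuation, so competing analyses are all reported.
            for tail in all_splits(word[end:], parts):
                results.append([head] + tail)
    return results
```

For `rybolovu` the function returns two analyses, one treating `o` as a linking element and one attaching it to the following constituent. The memory alone cannot say which reading is intended.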

Paper 3: The Mechanical Determination of Meaning. E. Reifler
The Mechanical Determination of Grammatical Meaning
  – Steps:
      Meaning of each source form in isolation
      Determination of semantic coincidences exhibited by syntactically correlated co-occurrences in the input text
  – Example (German) of grammatical meaning:
      den (acc masc sg/dat pl) Männern (dat pl)
  – Example (German) of nongrammatical meaning:
      Er bestand die Prüfung/he passed the exam
        » bestand -> passed
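
The two steps can be read as: list the readings of each form in isolation, then keep only the readings that coincide across syntactically correlated forms. The sketch below assumes a hypothetical `READINGS` table and `agree` function, and drops gender from the readings for brevity.

```python
# Hypothetical readings of each form in isolation, as (case, number) pairs.
READINGS = {
    "den": {("acc", "sg"), ("dat", "pl")},
    "Männern": {("dat", "pl")},
}

def agree(*forms):
    """Return the readings shared by all the co-occurring forms."""
    common = set(READINGS[forms[0]])
    for form in forms[1:]:
        # Intersection keeps only the coinciding grammatical meanings.
        common &= READINGS[form]
    return common
```

`agree("den", "Männern")` leaves only the dative plural reading, which is how the co-occurrence resolves the ambiguity of `den`.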

Paper 3: The Mechanical Determination of Meaning. E. Reifler
The Mechanical Determination of Grammatical Meaning
  – Substantives that can also occur as proper names
      Can only be resolved by pre-editor
      Examples:
        – Bauer -> farmer
        – Gerber -> tanner
  – The “Pinpointing” of Composite Intended Meanings
      Monogenetic vs. polygenetic meaning
        – Pinpointer and pinpointee

Paper 3: The Mechanical Determination of Meaning. E. Reifler
Two Groups of Form Classes
  – Form Classes with a Very Large Membership
      Substantives
      Attributive adjectives
      Principal verbs
      Invariable attributive adjectives derived from substantives by suffix -er
      Predicative adjectives
      Adverbs of adjectival origin
      Cardinal numbers

Paper 3: The Mechanical Determination of Meaning. E. Reifler
Two Groups of Form Classes
  – Form Classes with a Comparatively Very Small Membership
      Determiners
      Pro-substantives
      Prepositions
      Verbs that take predicate complements: auxiliaries etc.
      Separated verb prefixes
      Adverbs
      Conjunctions
      Interjections
  – Total membership: < 2000

Paper 3: The Mechanical Determination of Meaning. E. Reifler
Memory Systems
  – Large-Drum System
      4 units
        – Capital memory for substantives
        – Attributive adjective memory
        – Principal verb memory
        – Predicate adjective memory
  – Small-Drum System
      Individual memory for each operational form class (10-15)
  – Memory sections
      Memory equivalents of all low-frequency forms may be grouped according to the number of their component alphabetic and/or non-alphabetic minimal symbols
        – I.e. use N-symbol sections
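
Grouping forms into N-symbol sections means that a lookup only has to search among forms of the same length. A minimal sketch, assuming hypothetical helper names `build_sections` and `look_up` and a toy entry table:

```python
from collections import defaultdict

def build_sections(entries):
    """Group a {form: equivalent} memory into sections keyed by symbol count."""
    sections = defaultdict(dict)
    for form, equivalent in entries.items():
        sections[len(form)][form] = equivalent
    return sections

def look_up(sections, form):
    """Search only the section whose forms have as many symbols as `form`."""
    return sections.get(len(form), {}).get(form)

# Toy memory; the entries are illustrative, not taken from the paper.
sections = build_sections({"Bauer": "farmer", "Gerber": "tanner", "den": "the"})
```

Here `Bauer` lands in the 5-symbol section and `Gerber` in the 6-symbol section, so a six-symbol input never has to be compared with five-symbol entries.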

Paper 3: The Mechanical Determination of Meaning. E. Reifler
Operational Form-Class Filter System
  – Steps:
      1. All free initial capital forms directed to capital memory
      2. Input of the initial letter of all other free forms activates the small-drum system
      3. All source forms which are members of small operational form classes are identified and processed in the small-drum system
      4. The moment a signal has been fed in which occurs in a sequence position not existing in the small-drum system, the latter is disconnected and the large-drum system is connected
      5. Forms thus rejected by the small-drum system are first directed to the capital memory

Paper 3: The Mechanical Determination of Meaning. E. Reifler
Operational Form-Class Filter System
  – Steps:
      6. All forms identified in the capital memory are processed there. Free source forms rejected by the capital memory are, in a fixed sequence, redirected to the other memories
      7. They are first directed to the attributive adjective memory
      8. Of forms not identified in 7, the pronominal forms are redirected to the small-drum system
      9. All other free forms rejected are directed to the principal verb memory
           V + separable prefix processed by co-occurrence
      10. All forms rejected in 9 are redirected to the memory for predicate adjectives and adverbs of adjectival and numeral origin
      11. All source forms not identified so far are forwarded to the output side in their original symbols
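
The eleven steps amount to a fixed routing order through the memories. The sketch below is a simplification under stated assumptions: the memory contents and the function `route` are hypothetical, matching is by exact spelling, and step 8 (pronominal forms sent back to the small-drum system) and the separable-prefix handling in step 9 are left out.

```python
# Hypothetical memory contents, for illustration only.
SMALL_DRUM = {"den", "er", "die", "und"}
CAPITAL = {"Prüfung", "Männern", "Bauer"}
ATTRIBUTIVE_ADJ = {"gute"}
PRINCIPAL_VERB = {"bestand"}
PREDICATE_ADJ = {"gut"}

# Fixed sequence of the large-drum memories (steps 5 to 10).
LARGE_DRUM_SEQUENCE = [
    ("capital", CAPITAL),
    ("attributive adjective", ATTRIBUTIVE_ADJ),
    ("principal verb", PRINCIPAL_VERB),
    ("predicate adjective", PREDICATE_ADJ),
]

def route(form):
    """Return the name of the memory that identifies `form`, or 'unidentified'."""
    # Steps 2-3: forms without an initial capital try the small-drum system first.
    if not form[:1].isupper() and form in SMALL_DRUM:
        return "small drum"
    # Steps 1 and 5-10: capital memory first, then the other memories in order.
    for name, memory in LARGE_DRUM_SEQUENCE:
        if form in memory:
            return name
    # Step 11: the form is passed to the output in its original symbols.
    return "unidentified"
```

For example, `route("den")` stops at the small-drum system, `route("bestand")` is rejected by the small drum, the capital memory and the attributive adjective memory before the principal verb memory accepts it, and an unknown form falls through to step 11.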

Paper 3: The Mechanical Determination of Meaning. E. Reifler
Conclusion
  – More details needed for pinpointers and pinpointees
  – But the operational form-class filtering system described here, together with the mechanical determination of the constituents of substantive compounds, amply demonstrate the feasibility of a mechanization of the work of a human pre-editor whose intervention had previously been held to be necessary. Nor does it appear from present indication that a human post-editor will be necessary