C SC 620 Advanced Topics in Natural Language Processing Lecture 17 3/25.

Slides:



Advertisements
Similar presentations
Leonid Iomdin Institute for Information Transmission Problems, Russian Academy of Sciences
Advertisements

Units of specialized knowledge* “A unit of specialized knowledge (SKU) is a unit that represents specialized knowledge at the content level, and communicates.
C SC 620 Advanced Topics in Natural Language Processing Lecture 22 4/15.
C SC 620 Advanced Topics in Natural Language Processing Lecture 20 4/8.
Introduction to Computational Linguistics Lecture 2.
C SC 620 Advanced Topics in Natural Language Processing 3/11 Lecture 15.
C SC 620 Advanced Topics in Natural Language Processing Lecture 19 4/6.
C SC 620 Advanced Topics in Natural Language Processing Lecture 16 3/23.
C SC 620 Advanced Topics in Natural Language Processing Lecture 24 4/22.
C SC 620 Advanced Topics in Natural Language Processing 3/9 Lecture 14.
C SC 620 Advanced Topics in Natural Language Processing Lecture 10 2/19.
C SC 620 Advanced Topics in Natural Language Processing Lecture 13 3/4.
Models of Generative Grammar Smriti Singh. Generative Grammar  A Generative Grammar is a set of formal rules that can generate an infinite set of sentences.
Lecture 1 Introduction: Linguistic Theory and Theories
Creation of a Russian-English Translation Program Karen Shiells.
Machine Translation History of Machine Translation Difficulties in Machine Translation Structure of Machine Translation System Research methods for Machine.
Lecture 1, 7/21/2005Natural Language Processing1 CS60057 Speech &Natural Language Processing Autumn 2005 Lecture 1 21 July 2005.
Terminology and documentation*  Object of the study of terminology:  analysis and description of the units representing specialized knowledge in specialized.
9/8/20151 Natural Language Processing Lecture Notes 1.
Lemmatization Tagging LELA /20 Lemmatization Basic form of annotation involving identification of underlying lemmas (lexemes) of the words in.
CSCI-383 Object-Oriented Programming & Design Lecture 4.
Lecture 12: 22/6/1435 Natural language processing Lecturer/ Kawther Abas 363CS – Artificial Intelligence.
Automata, Computability, and Complexity Lecture 1 Section 0.1 Wed, Aug 22, 2007.
IV. SYNTAX. 1.1 What is syntax? Syntax is the study of how sentences are structured, or in other words, it tries to state what words can be combined with.
Jennie Ning Zheng Linda Melchor Ferhat Omur. Contents Introduction WordNet Application – WordNet Data Structure - WordNet FrameNet Application – FrameNet.
WORD SENSE DISAMBIGUATION STUDY ON WORD NET ONTOLOGY Akilan Velmurugan Computer Networks – CS 790G.
Postgraduate Diploma in Translation Lecture 1 Computers and Language.
Depth of Knowledge Assessments (D.O.K.) Roseville City School District Leadership Team.
Algorithms, Part 1 of 3 Topics  Definition of an Algorithm  Algorithm Examples  Syntax versus Semantics Reading  Sections
Terminology and documentation*  Object of the study of terminology:  analysis and description of the units representing specialized knowledge in specialized.
Approaches to Machine Translation CSC 5930 Machine Translation Fall 2012 Dr. Tom Way.
1 CSI 5180: Topics in AI: Natural Language Processing, A Statistical Approach Instructor: Nathalie Japkowicz Objectives of.
NLP ? Natural Language is one of fundamental aspects of human behaviors. One of the final aim of human-computer communication. Provide easy interaction.
Introduction to Linguistics Ms. Suha Jawabreh Lecture # 2.
Detection of Links between Words in the Task of Syntactic-Semantic Analysis of Russian Texts. Dmitry V. Merkuryev Saint-Petersburg State University, Russia.
Semantic Construction lecture 2. Semantic Construction Is there a systematic way of constructing semantic representation from a sentence of English? This.
Computational linguistics A brief overview. Computational Linguistics might be considered as a synonym of automatic processing of natural language, since.
CSA2050 Introduction to Computational Linguistics Lecture 1 What is Computational Linguistics?
Topic #1: Introduction EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
Programming Languages and Design Lecture 3 Semantic Specifications of Programming Languages Instructor: Li Ma Department of Computer Science Texas Southern.
Artificial Intelligence: Natural Language
Data Structures and Algorithms Dr. Tehseen Zia Assistant Professor Dept. Computer Science and IT University of Sargodha Lecture 1.
Terminology and documentation*  Object of the study of terminology:  analysis and description of the units representing specialized knowledge in specialized.
1 Introduction to Computational Linguistics Eleni Miltsakaki AUTH Spring 2006-Lecture 2.
Leonid Iomdin Institute for Information Transmission Problems, Russian Academy of Sciences
SYNTAX.
Software Requirements Specification (SRS)
SIMS 296a-4 Text Data Mining Marti Hearst UC Berkeley SIMS.
NATURAL LANGUAGE PROCESSING
COURSE AND SYLLABUS DESIGN
A Simple English-to-Punjabi Translation System By : Shailendra Singh.
King Faisal University جامعة الملك فيصل Deanship of E-Learning and Distance Education عمادة التعلم الإلكتروني والتعليم عن بعد [ ] 1 جامعة الملك فيصل عمادة.
Jan 2012MT Architectures1 Human Language Technology Machine Translation Architectures Direct MT Transfer MT Interlingual MT.
10/31/00 1 Introduction to Cognitive Science Linguistics Component Topic: Formal Grammars: Generating and Parsing Lecturer: Dr Bodomo.
What is cognitive psychology?
Algorithms, Part 1 of 3 Topics Definition of an Algorithm
Approaches to Machine Translation
Distributional analysis
Statistical NLP: Lecture 3
Algorithms I: An Introduction to Algorithms
Mathematical Competencies A Framework for Mathematics Curricula in Engineering Education SEFI MWG Steering committee Burkhard ALPERS, Marie DEMLOVÁ,
2008/09/17: Lecture 4 CMSC 104, Section 0101 John Y. Park
BBI 3212 ENGLISH SYNTAX AND MORPHOLOGY
CSc4730/6730 Scientific Visualization
Approaches to Machine Translation
Algorithms, Part 1 of 3 Topics Definition of an Algorithm
Generative Transformation
Structure of a Lexicon Debasri Chakrabarti 13-May-19.
Algorithms, Part 1 of 3 Topics Definition of an Algorithm
Presentation transcript:

C SC 620 Advanced Topics in Natural Language Processing Lecture 17 3/25

Reading List Readings in Machine Translation, Eds. Nirenburg, S. et al. MIT Press Reading list: –12. Correlational Analysis and Mechanical Translation. Ceccato, S. –13. Automatic Translation: Some Theoretical Aspects and the Design of a Translation System. Kulagina, O. and I. Mel’cuk –16. Automatic Translation and the Concept of Sublanguage. Lehrberger, J. –17. The Proper Place of Men and Machines in Language Translation. Kay, M.

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk 1. The Place of Automatic Translation (AT) Among Problems of Wider Range Observation: –Too broad: quite naturally broken down into a number of simpler tasks which are to be solved autonomously (first) –Too narrow: quite naturally included into broader problems which dominate AT Presuppositions: –Knowledge of the language pairs –Understanding the context –Knowing how to accumulate translation experience to gradually raise the quality

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk 1.1 The Linguistic Problem –Knowledge of Language means ability to do Analysis:T (text) -> M (meaning), and Synthesis:M -> T Notation for specifying meaning (Semantic Language) –Example (invariance of meaning under translation): »We fulfilled your task easily »What you had set us as a task was done by us with ease »It was easy for us to fulfill your rask »Fulfilling your task turned out to be easy for us

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Broadness –The three AT tasks are also tasks of general linguistics, moreover cardinal problems of any serious theory of language If linguistics had more or less complete solutions to offer here, only some minor (tech) problems would have to be solved to make practical AT possible (Failure of linguistics) –Also important for other applications of language information processing e.g. information retrieval, automatic editing and abstracting (summarization), man-machine communication

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Conclusion 1 –Any serious progress in AT depends on progress in linguistics on the three tasks –Progress in linguistics possible only if linguistics is transformed on the basis of new approaches and conceptions, in close connection with mathematics

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk 1.2 The Gnostical Problem –Knowledge of language does not guarantee good translation. Knowledge of situational context also needed. A. Different Meanings Correspond to the Same Situation –The largest city of the USSR –The capital of the USSR

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk B. The Same Meaning Corresponds to Different Situations –To this purpose he used the book –To do this he made use of the book –Situations: Read a book to get information or divert oneself Put a book on a ream of sheets to prevent the wind from scattering them Throw a book at a dog to drive the animal away

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Knowledge of Situation needed (1) Multiple meanings, each of which refers to a certain situation, all of them different –Examples: The box is in the pen(Bar-Hillel) –Pen: enclosure –Pen: writing instrument Slow neutrons and protons (Bar-Hillel) –Wide and narrow scope for slow

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Knowledge of Situation needed (1) Single meaning, unique situation (a knock at the door), language-particular –Example: Come in! (Russian) Forward! (Italian)

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Conclusion 2 –Progress in AT dependent on progress in the study of human thinking and cognition

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk 1.3 The Problem of Automating Researchers’ Activity AT System: –Algorithms T->M, M->T, M->S (situation), S->M –Data (for each language)- dynamic Lexical Syntactic Stylistic Distribution and functioning of all items in the whole range of possible contexts Rules of correspondence between these items Encyclopedia

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Start with imperfect system Need to organize algorithms and data and have maintenance devices that accept man- made corrections and learn by itself Need systems to automatically collect and classify language data

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Conclusion 3 –Practical solution of AT depends on our ability to automate the scientific activities of humans

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk 2 Principal Components of an AT System 2.1 Analysis Algorithm Lexico-morphological Analysis –“Morphs” –Word form -> Information (distribution and syntactic functions, semantic information) Syntactic Analysis –Sentence -> syntactic tree(s) –Morphological ambiguities may be resolved here

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Semantic Analysis –Syntactic tree -> semantic structure (SEMS) –Possibly disambiguate syntactic trees here –Representation Example: He drinks warm tea

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk Synthesis –(Situation level excluded) –Replace semantic nodes 1-to-1 Several nodes -> 1 node 1 node -> several nodes –‘Rush along’ -> very/great + fast + move Syntactic node -> single/several semantic nodes –Semantic items to syntactic items Success + great degree -> dramatic success Staff -> staff [lab], personnel [hospital], crew [tank or ship], team [football], troupe [theater]

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk 2.2 Semantic Dictionary Text -> meaning: simplification Basic (English) –A few hundred items (plus technical items) –Other words must be expressible in Basic by means of non-ambiguous and readily understandable paraphrases Merge two Basics into one –Semantic Language, AT Interlingua Multiple stages: Russian of degree N

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk 2.3 Synthesis Algorithm –(Exclude Semantic Synthesis) –Syntactic Synthesis By Primitive Word Groups (PWG) –Head and dependents –Verb, noun, adjective and adverb groups Assemble PWGs into Definitive (Terminal) Word Groups (DWG) –Look at master and place PWG –Finite verb, subject, object, circumstantial complements, adverb and nominal/infinitive complement groups Arrange DWGs to ensure acceptable word order –Preference rules at work

Paper 13. Automatic Translation. Kulagina, O. and I. Mel’cuk