Knowledge Representation for Natural Language Understanding


Knowledge Representation for Natural Language Understanding
Chengqing ZONG
Institute of Automation, Chinese Academy of Sciences
cqzong@nlpr.ia.ac.cn

Outline
- CASIA and NLPR
- Introduction
- Some Linguistic Knowledge Bases
- Approaches to NLU
- Proposal
NLPR, CAS-IA 2019/4/15

CASIA
Institute of Automation (IA), Chinese Academy of Sciences (CAS)
Founded in 1956

Personnel
- Faculty members: 320, including 38 full-time professors
- Postdoctoral research fellows: 30
- Students (Ph.D. and M.Sc.): 600
- Visiting researchers: 40+

NLPR: National Laboratory of Pattern Recognition
- Staff: 29
- Ph.D. candidates: 140
- M.Sc. students: 120
- Postdoctoral fellows: 7

NLPR
[Organization chart]
- Directors
- Academic Committee
- Management Committee
- General Office
- Visual Information Processing Group
- Biometric Information Processing Group
- Pattern Recognition and its Cognitive Mechanisms Group
- Speech and Language Technology Group
June 20, 2003

1. Introduction
Natural language understanding is a typical knowledge-processing task.
[Diagram: text or speech -> Processor, drawing on a knowledge base (K. B.) -> text or speech]

1. Introduction
Different tasks and different approaches call for different knowledge representations.
For example, document summarization and information extraction require knowledge for discourse analysis and topic understanding.
[Figure labels: Title, Time]

1. Introduction
Machine translation (MT) requires knowledge for sentence analysis and translation.
e.g., I saw a man with a telescope.
  NP → Det NN
  NP → NP PP
  NP → ……
Rule-based MT:
  I saw [a man with a telescope].
  I [saw a man] with a telescope.
Statistical MT:
  我用望远镜看见一个男孩。 (I saw a boy through a telescope.)
  我看见一个带望远镜的男孩。 (I saw a boy who had a telescope.)
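
The two bracketings of "I saw a man with a telescope" can be reproduced with a small chart parser. The following is a minimal sketch, not the method of any particular MT system: only the rules NP → Det NN and NP → NP PP come from the slide, while the remaining rules, the lexicon, and the function name cky_parses are assumptions added so that the toy grammar covers the sentence.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form: (parent, left child, right child).
# Only "NP -> Det NN" and "NP -> NP PP" appear in the slide; the other
# rules and the lexicon below are assumptions made for this illustration.
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "NP", "PP"),
    ("NP", "Det", "NN"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "I": ["NP"], "saw": ["V"], "a": ["Det"], "man": ["NN"],
    "with": ["P"], "telescope": ["NN"],
}


def cky_parses(words):
    """Return every bracketed parse of `words` rooted in S (CKY chart parsing)."""
    n = len(words)
    # chart[(start, end, label)] holds all bracketed analyses of that span.
    chart = defaultdict(list)
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[(i, i + 1, tag)].append(f"({tag} {word})")
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    for l_tree in chart.get((start, mid, left), []):
                        for r_tree in chart.get((mid, end, right), []):
                            chart[(start, end, parent)].append(
                                f"({parent} {l_tree} {r_tree})"
                            )
    return chart.get((0, n, "S"), [])


parses = cky_parses("I saw a man with a telescope".split())
for tree in parses:
    print(tree)
```

Under this toy grammar the chart contains two S analyses: one attaches the PP to the noun phrase ("a man with a telescope") and the other attaches it to the verb phrase. A rule-based system needs additional knowledge to choose between them.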

1. Introduction
Questions:
- What is the state of current linguistic knowledge bases?
- Is an algorithm designed according to the K. B., or is the representation designed for an algorithm?

2. Some Linguistic K. B.
2.1 WordNet (http://wordnet.princeton.edu)
Three basic preconditions:
- Separability hypothesis
- Patterning hypothesis
- Comprehensiveness hypothesis
The synset is taken as the building block.
Relationships: synonymy / antonymy / hypernymy / hyponymy / meronymy / entailment
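
To make the synset idea concrete, here is a minimal sketch of synsets as nodes linked by a few of the relations listed above. The class, the function names, and every entry in the table are hand-made assumptions for illustration; they are not actual WordNet data and not WordNet's API.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of synonymous lemmas plus links (by name) to related synsets."""
    lemmas: list
    hypernyms: list = field(default_factory=list)  # more general synsets
    meronyms: list = field(default_factory=list)   # part-of links


# Hand-made toy entries for illustration only; NOT actual WordNet content.
SYNSETS = {
    "car": Synset(["car", "automobile"], hypernyms=["vehicle"], meronyms=["wheel"]),
    "vehicle": Synset(["vehicle"], hypernyms=["artifact"]),
    "wheel": Synset(["wheel"], hypernyms=["artifact"]),
    "artifact": Synset(["artifact"]),
}


def are_synonyms(a, b):
    """Synonymy: two lemmas are synonyms if some synset contains both."""
    return any(a in s.lemmas and b in s.lemmas for s in SYNSETS.values())


def hypernym_chain(name):
    """Hypernymy: follow the first hypernym link upward until the root."""
    chain = []
    current = SYNSETS[name]
    while current.hypernyms:
        parent = current.hypernyms[0]
        chain.append(parent)
        current = SYNSETS[parent]
    return chain


def hyponyms(name):
    """Hyponymy is the inverse of hypernymy: find synsets pointing at `name`."""
    return sorted(k for k, s in SYNSETS.items() if name in s.hypernyms)


print(are_synonyms("car", "automobile"))
print(hypernym_chain("car"))
print(hyponyms("artifact"))
```

Storing only the hypernym links and deriving hyponyms by inversion keeps the toy table small; a real lexical database would index both directions.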

2. Some Linguistic K. B.
2.2 HowNet (http://www.keenage.com)
- Knowledge, specifically the computer-operable form of knowledge, is a system encompassing the varied relations among concepts as well as those among the attributes of concepts. As one acquires more concepts, or rather captures more relations among concepts and more links between the attributes attached to them, one simply becomes more knowledgeable.
- To create a knowledge base, a common-sense knowledge base constituting a knowledge system should be constructed first. This knowledge base should describe general concepts and map out the relations among them.

2. Some Linguistic K. B.
Some concepts and relationships are defined.

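2. Some Linguistic K. B.
2.3 UPenn TreeBank
http://www.cis.upenn.edu/~treebank/home.html
Example (tokens with part-of-speech tags): 他/PN 还/AD 提出/VV 一/CD 系列/M 具体/JJ 措施/NN 和/CC 策略/NN 要点/NN 。/PU
(Gloss: "He also put forward a series of concrete measures and key strategy points.")
[Figure: syntactic tree over this sentence, with phrase labels IP, NP-SBJ, ADVP, VP, NP-OBJ, QP, CLP, NP]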
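
Treebank annotations of this kind are commonly stored as bracketed strings. The sketch below reads such a string into nested tuples and recovers the tagged tokens. The function names are hypothetical, and the bracketing used in the example is an assumption: it is a fragment consistent with the QP, CLP, CD, and M labels in the slide's tree, not a verified excerpt from the treebank.

```python
def parse_brackets(text):
    """Parse a Penn-style bracketed string into (label, children) tuples.

    Children are either nested (label, children) tuples or plain word strings.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError(f"expected '(' at token {pos}")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return read_node()


def tagged_leaves(tree):
    """Return (word, tag) pairs in left-to-right order."""
    label, children = tree
    pairs = []
    for child in children:
        if isinstance(child, tuple):
            pairs.extend(tagged_leaves(child))
        else:
            pairs.append((child, label))
    return pairs


# Assumed bracketing of the quantifier phrase "一 系列" ("a series of").
fragment = parse_brackets("(QP (CD 一) (CLP (M 系列)))")
print(fragment)
print(tagged_leaves(fragment))
```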

2. Some Linguistic K. B.
2.4 FrameNet and Others
- FrameNet (frame semantics): http://framenet.icsi.berkeley.edu
- PropBank, NomBank: http://nlp.cs.nyu.edu/meyers/NomBank.html

2. Some Linguistic K. B.
Summary:
- All the representations mentioned above are human-made and human-defined.
- Different knowledge bases are built at different levels and granularities: some work at the lexical level and tag lexicons, some work at the sentence level and annotate syntactic structure, and so on.

2. Some Linguistic K. B.
- Generally, these knowledge bases are developed as general-purpose resources, and each one expresses a single kind of linguistic knowledge.
- However, are these representations sufficient, or even complete, for a natural language processing system?

3. Approaches to NLU
Three methods:
- Rationalistic
- Empirical

3. Approaches to NLU
Take MT as an example.
[Figure: levels of transfer between source language (SL) and target language (TL): Word, Phrase, Chunk, Syntactic-Tree, Semantic-Tree, Logical-Form, Inter-lingual]
Translation units:
- Word-to-Word
- Phrase-to-Phrase
- Chunk-to-Chunk
- Chunk-to-String
- Tree-to-Tree (Learned, Syntactic or Semantic)
- Tree-to-String
- Logical-Form-to-Logical-Form
p(t|s) vs. p(s|t)×p(t)
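
The slide contrasts p(t|s) with p(s|t)×p(t). Assuming that s denotes the source sentence and t a candidate target sentence (the slide does not define the symbols), the two formulations are linked by Bayes' rule. A short derivation under that assumption:

```latex
\begin{align*}
\hat{t} &= \arg\max_{t} \; p(t \mid s) \\
        &= \arg\max_{t} \; \frac{p(s \mid t)\, p(t)}{p(s)} \\
        &= \arg\max_{t} \; p(s \mid t)\, p(t)
\end{align*}
```

The denominator p(s) can be dropped because it does not depend on t. The first line models the translation directly, while the last line splits it into a translation model p(s|t) and a target language model p(t), the usual noisy-channel reading.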

3. Approaches to NLU
[Figure: Performance plotted against Years; labels: Rule base, Dictionary, + Machine Learning, Corpus base]
More data is better data.

3. Approaches to NLU
Many hard nuts still remain to be cracked:
- Word sense disambiguation
- Syntactic disambiguation
- Semantic analysis and translation
- Automatic evaluation of translation
- ……

3. Approaches to NLU
- The number of webpages is increasing exponentially.
- The highest accuracy of Chinese information retrieval (webpage search) in 2006 was only about 36.7% (from the 863 report).
[Figure: Increasing Number of Chinese Webpages. The data are from the China Internet Network Information Center.]

3. Approaches to NLU
What is the problem?

3. Approaches to NLU
“One should build the rocket, instead of climbing the tree, if he wants to reach the moon”, Martin Kay
- Are we building the rocket or climbing the tree?
- Are we currently taking the right way to build the rocket?

3. Approaches to NLU
How does a human brain work when it translates a sentence?
[Diagram labels: Input: Speech, Text; Affective Computing; Semantic Perception; Dynamic; Vision; K. B.; Output; Static]

3. Approaches to NLU
- A human can infer an unknown word sense, sentence structure, etc. from common sense (limited knowledge), but a system cannot.
- A human can dynamically and jointly use multiple knowledge sources (lexical, syntactic, semantic, pragmatic) to process a specific language phenomenon, and can easily determine which knowledge is necessary and which is not; a system usually cannot.

3. Approaches to NLU
- A human can easily acquire new knowledge and update memory, which is usually difficult for a system.
However, a computer can memorize a large number of words and phrases, compute very fast, and so on, which a human cannot.
Currently, models for NLU mainly exploit computing power, but rarely if ever simulate the human cognitive process.

4. Proposal
- For a specific NLU task, such as word sense disambiguation, syntactic parsing, or translation, we need to model the cognitive process of the human brain.
- A task-oriented knowledge base should then be built according to these models.

4. Proposal
e.g., for speech-to-speech (S2S) translation in a specific domain, the following aspects are addressed:
- Investigate the effects of rhythm, tone, and accent.
- Model translation in combination with a language model, a speech model, a common-sense model, etc.
- Build a knowledge base describing language, semantics, speech, emotion, and domain-related common sense, all oriented to S2S translation and based on the needs of the translation model.

Thanks!