Lecture 8 Applications and demos

Building applications
Previous lectures have discussed stages in processing: the algorithms covered have each addressed aspects of language modelling.
All but the simplest applications combine multiple components.
Issues: suitability for the application, interoperability, evaluation, etc.
Avoiding error multiplication: each module needs to be robust to imperfections in the output of prior modules.
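
To make the error multiplication point concrete, here is a minimal sketch. It assumes, purely for illustration, that module errors are independent and that any upstream error ruins the final result; the per-module accuracies are hypothetical numbers, not measurements from the lecture.

```python
from math import prod

def pipeline_accuracy(module_accuracies):
    """Estimate end-to-end accuracy as the product of per-module accuracies.

    Assumes independent errors and no recovery downstream (a simplification).
    """
    return prod(module_accuracies)

# Hypothetical accuracies for four pipeline stages,
# e.g. tokeniser, POS tagger, parser, semantic module.
stages = [0.99, 0.97, 0.90, 0.85]
overall = pipeline_accuracy(stages)
print(f"{overall:.3f}")  # about 0.735
```

Under these assumptions, four individually respectable modules give an end-to-end figure well below that of the weakest one. That is why downstream modules have to tolerate imperfect input.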

Demos
Limited domain systems
–CHAT-80
–BusTUC
OSCAR: named entity recognition for chemistry
DELPH-IN: parsing and generation
Automatic construction of research web pages
Rhetorical structure: Argumentative Zoning of scientific text
Note also: demo systems mentioned in exercises.

CHAT-80
CHAT-80: a micro-world system implemented in Prolog in 1980
CHAT-80 demo
–What is the population of India?
–which(X:exists(X:(isa(X,population) and of(X,india))))
–have(india,(population=574))
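
The question-to-query-to-answer flow in the demo can be imitated with a toy sketch. This is not CHAT-80's Prolog implementation. The fact table, the `which` and `answer` functions and the single hard-coded question pattern are all hypothetical stand-ins, and the only data point is the value shown in the demo output.

```python
# Toy fact base: (attribute, entity) -> value. The 574 is the value
# that appears in the demo output above; everything else is illustrative.
FACTS = {("population", "india"): 574}

def which(attribute, entity, facts=FACTS):
    """Return every X that is an `attribute` of `entity`, loosely mirroring
    which(X: isa(X, attribute) and of(X, entity))."""
    return [value for (attr, ent), value in facts.items()
            if attr == attribute and ent == entity]

def answer(question):
    """Map one fixed question pattern to a query (a stand-in for a real parser)."""
    prefix, suffix = "what is the ", "?"
    q = question.lower().strip()
    if not (q.startswith(prefix) and q.endswith(suffix)):
        return None  # outside the micro-world's tiny "grammar"
    attribute, _, entity = q[len(prefix):-len(suffix)].partition(" of ")
    return which(attribute, entity)

result = answer("What is the population of India?")
print(result)  # [574]
```

A real limited-domain system replaces the string pattern with a grammar and the dictionary with a database, but the division of labour is the same: analyse to a logical form, then evaluate it.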

Bus Route Oracle
Query bus departures in Trondheim, Norway; built by students and faculty at NTNU.
–42 bus lines, 590 stops, 60,000 entries in database
–Norwegian and English
–in daily use: half a million logged queries
Prolog-based: the parser analyses input into a query language, which is mapped to the bus timetable database
BusTUC demo
–When is the earliest bus to the airport?
–When is the next bus from Dragvoll to the centre?

Chemistry named entity recognition
SciBorg: the OSCAR 3 system recognises chemistry named entities in documents
–(e.g. 2,4-dinitrotoluene; citric acid)
Series of classifiers using n-grams, affixes and context, plus external dictionaries
Used in RSC Project Prospect
Also used as a preprocessor for full parsing
Precision/recall balance differs for different uses
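
As a rough illustration of the kinds of evidence listed above (character n-grams, affixes, context, dictionary lookup), the sketch below extracts features for a single token. It is not OSCAR's actual feature set. The function names, the feature names and the tiny dictionary are assumptions made for the example.

```python
def char_ngrams(token, n=3):
    """Character n-grams of a token, with ^ and $ marking the word boundaries."""
    padded = f"^{token}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def token_features(token, prev_token, dictionary):
    """Features a classifier could use to decide if `token` is a chemical name."""
    lowered = token.lower()
    return {
        "ngrams": char_ngrams(lowered),
        "prefix3": lowered[:3],            # affix evidence
        "suffix3": lowered[-3:],
        "prev": prev_token.lower(),        # one word of left context
        "in_dictionary": lowered in dictionary,
    }

# Hypothetical external dictionary of known chemical terms.
chem_dictionary = {"toluene", "ethanol", "acid"}
features = token_features("Toluene", "in", chem_dictionary)
print(features["suffix3"], features["in_dictionary"])  # ene True
```

Suffixes such as "ene" or "ol" are informative for chemical names even when the whole word is not in any dictionary, which is one reason to combine sub-word features with lookup.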

Enhanced browsing of chemistry documents: RSC using OSCAR

Precision and recall in OSCAR: from Corbett and Copestake (2008)
Modest precision, high recall: text preprocessing
High precision, modest recall: text viewing
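
The trade-off can be seen by moving a confidence threshold over scored candidates. The following sketch uses invented scores and an invented gold standard, not figures from Corbett and Copestake (2008).

```python
def precision_recall(scored, gold, threshold):
    """Precision and recall of the candidates scoring at or above `threshold`."""
    predicted = {item for item, score in scored.items() if score >= threshold}
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 1.0
    recall = true_pos / len(gold) if gold else 1.0
    return precision, recall

# Hypothetical classifier confidences for candidate chemical names.
scored = {"citric acid": 0.95, "toluene": 0.80, "lead": 0.40,
          "Figure": 0.35, "ethanol": 0.30}
gold = {"citric acid", "toluene", "lead", "ethanol"}

low = precision_recall(scored, gold, threshold=0.3)   # (0.8, 1.0)
high = precision_recall(scored, gold, threshold=0.7)  # (1.0, 0.5)
print(low, high)
```

A low threshold suits preprocessing, where a missed entity damages later parsing. A high threshold suits text viewing, where a wrongly highlighted word is visible to the reader.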

DELPH-IN
DELPH-IN: informal consortium of 18 groups (EU, Asia, US) that develops multilingual resources for deep language processing
–hand-written grammars in feature structure formalism, plus statistical ranking
–English Resource Grammar (ERG): approx 90% coverage of edited text
ERG demo
–Metal reagents are compounds often utilized in synthesis.

Some uses of the ERG
Automatic response (YY Corp, commercial use)
Machine Translation
–LOGON research project: Norwegian to English
–smaller-scale MT with other language pairs
Semantic search
–SciBorg (chemistry, research)
–WeSearch (Wikipedia, University of Oslo, new research)
English teaching (EPGY, Stanford: 20,000 users)
Smaller-scale projects in question answering, information extraction, paraphrase...

[Diagram: DELPH-IN tools, divided into application- and domain-independent components and application- (and maybe domain-) specific components]

Automatic web page generation
Using publication lists to find links between people and to construct summaries
–a title such as ‘Generating research websites using summarisation techniques’ gives NPs like ‘summarisation techniques’
–cluster these terms
–locate co-authors, summarise collaborations
Web page generation demo

Collaboration summaries
Lawrence C Paulson collaborated with Cristiano Longo and Giampaolo Bella from 1997 to 2003 on ‘formal verification’, ‘industrial payment and nonrepudiation protocol’, ‘kerberos authentication system’ and ‘secrecy goals’ and in 2006 on ‘cardholder registration in Set’ and ‘accountability protocols’.
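
A collaboration summary of this kind could be assembled from a publication list roughly as follows. This is only a sketch: the record format, the author names and the pre-extracted `terms` field are hypothetical. The real system extracts noun phrases from titles and clusters them, which is not attempted here.

```python
from collections import defaultdict

# Hypothetical publication records; `terms` stands in for NPs extracted from titles.
publications = [
    {"year": 1997, "authors": ["P. Person", "A. Author"], "terms": ["formal verification"]},
    {"year": 2003, "authors": ["P. Person", "A. Author", "B. Writer"], "terms": ["secrecy goals"]},
    {"year": 2006, "authors": ["P. Person", "B. Writer"], "terms": ["accountability protocols"]},
    {"year": 2005, "authors": ["C. Other"], "terms": ["unrelated topic"]},
]

def collaborations(person, pubs):
    """For each co-author of `person`: (first year, last year, sorted shared terms)."""
    summary = defaultdict(lambda: {"years": set(), "terms": set()})
    for pub in pubs:
        if person not in pub["authors"]:
            continue
        for coauthor in pub["authors"]:
            if coauthor != person:
                summary[coauthor]["years"].add(pub["year"])
                summary[coauthor]["terms"].update(pub["terms"])
    return {name: (min(d["years"]), max(d["years"]), sorted(d["terms"]))
            for name, d in summary.items()}

links = collaborations("P. Person", publications)
print(links["A. Author"])  # (1997, 2003, ['formal verification', 'secrecy goals'])
```

Turning such tuples into a sentence like the one above is then a template-filling step.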

Argumentative Zoning
Finding rhetorical structure in scientific texts automatically
–Research goals
–Criticism and contrast
–Intellectual ancestry
Robust Argumentative Zoning demo
–input text (ASCII via Acrobat)
Usages: search, bibliometrics, reviewing support, training new researchers

NLP Course conclusions
Theme: ambiguity
–levels: morphology, syntax, semantic, lexical, discourse
–resolution: local ambiguity, syntax as filter for morphology, selectional restrictions
–ranking: parse ranking, WSD, anaphora resolution
–processing efficiency: chart parsing

Theme: evaluation
–training data and test data
–reproducibility
–baseline
–ceiling
–module evaluation vs application evaluation
–nothing is perfect!
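
As a reminder of what a baseline looks like in practice, here is a minimal sketch of a most-frequent-label baseline, chosen on training data and scored on separate test data. The label sequences are invented for illustration.

```python
from collections import Counter

def most_frequent_baseline(train_labels):
    """Pick the label seen most often in the training data."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(predicted, gold):
    """Fraction of positions where the prediction matches the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

# Hypothetical POS labels; training and test data are kept apart.
train = ["N", "N", "V", "N", "ADJ"]
test = ["N", "V", "N", "N"]

label = most_frequent_baseline(train)
baseline_acc = accuracy([label] * len(test), test)
print(label, baseline_acc)  # N 0.75
```

Any proposed module should be compared against a figure like this at the low end and against a ceiling such as human agreement at the high end.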

Modules and algorithms
different processing modules
different applications blend modules differently
many different styles of algorithm:
–FSAs and FSTs
–Markov models and HMMs
–CFG (and probabilistic CFGs)
–constraint-based frameworks
–inheritance hierarchies (WordNet), decision trees (WSD)
–classifiers (Naive Bayes)

More about language and speech processing...
Information Retrieval course
MPhil in Advanced Computer Science:
–language and speech modules
–in collaboration with speech group from Engineering