Lecture 8 Applications and demos. Building applications Previous lectures have discussed stages in processing: algorithms have addressed aspects of language.


Lecture 8 Applications and demos

Building applications
Previous lectures have discussed stages in processing: the algorithms covered each address one aspect of language modelling.
All but the simplest applications combine multiple components.
Issues: suitability for the application, interoperability, evaluation etc.
Avoiding error multiplication: each module needs robustness to imperfections in prior modules.
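
To make "error multiplication" concrete, here is a minimal sketch of how per-module accuracy compounds along a pipeline. It assumes that module errors are independent and that any upstream error spoils the final result. Both are simplifications, and the stage names and accuracy figures are hypothetical, not measurements of any real system.

```python
from math import prod


def pipeline_accuracy(module_accuracies):
    """Probability that every module is right, assuming independent errors."""
    return prod(module_accuracies)


# Hypothetical per-module accuracies, chosen only for illustration.
stages = {
    "tokeniser": 0.99,
    "pos_tagger": 0.97,
    "parser": 0.90,
    "wsd": 0.80,
}

overall = pipeline_accuracy(stages.values())
print(f"overall accuracy if errors simply multiply: {overall:.3f}")
```

Under these assumptions four individually respectable modules give an end-to-end figure of roughly 0.69, which is why later modules are usually designed to tolerate imperfect input.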

Demos
Limited domain systems
– CHAT-80
– BusTUC
OSCAR: Named entity recognition for Chemistry
DELPH-IN: Parsing and generation
Automatic construction of research web pages
Rhetorical structure: Argumentative Zoning of scientific text
Note also: demo systems mentioned in exercises.

CHAT-80
CHAT-80: a micro-world system implemented in Prolog in 1980
CHAT-80 demo
– What is the population of India?
– which(X:exists(X:(isa(X,population) and of(X,india))))
– have(india,(population=574))
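
The three demo lines show a question, its logical form, and the database answer. The sketch below imitates that flow in Python for this one question pattern only. It is not how CHAT-80 works: the real system is written in Prolog and uses a grammar, whereas here a regular expression stands in for the parser. The names `parse`, `to_logical_form`, `answer` and the `FACTS` table are hypothetical. The population figure is the one shown on the slide.

```python
import re

# Toy fact base: (entity, attribute) -> value. Only the slide's example is included.
FACTS = {("india", "population"): 574}

# A single hard-coded question pattern; a stand-in for a real grammar.
QUESTION = re.compile(r"what is the (\w+) of (\w+)\??$", re.IGNORECASE)


def parse(question):
    """Return (attribute, entity) for 'What is the A of E?', else None."""
    match = QUESTION.match(question.strip())
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


def to_logical_form(attribute, entity):
    """Render the query in the notation used on the slide."""
    return f"which(X:exists(X:(isa(X,{attribute}) and of(X,{entity}))))"


def answer(attribute, entity):
    """Look the query up in the fact base; None if nothing is known."""
    value = FACTS.get((entity, attribute))
    if value is None:
        return None
    return f"have({entity},({attribute}={value}))"


parsed = parse("What is the population of India?")
print(to_logical_form(*parsed))
print(answer(*parsed))
```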

Bus Route Oracle
Query bus departures in Trondheim, Norway, built by students and faculty at NTNU.
– 42 bus lines, 590 stops, 60,000 entries in database
– Norwegian and English
– in daily use: half a million logged queries
Prolog-based: the parser analyses input into a query language, which is mapped to the bus timetable database
BusTUC demo
– When is the earliest bus to the airport?
– When is the next bus from Dragvoll to the centre?

Chemistry named entity recognition
OSCAR 3 system: recognises chemistry named entities in documents
– (e.g. 2,4-dinitrotoluene; citric acid)
Series of classifiers using n-grams, affixes, context plus external dictionaries
Used in RSC Project Prospect
– Example: DNA templated...
Also used as preprocessor for full parsing
Precision/recall balance for different uses
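
The precision/recall balance in the last point can be made concrete with the standard definitions. The sketch below scores a set of predicted entity strings against a gold set. The two sets are hypothetical examples, not OSCAR output, and exact string match is an assumed (and fairly strict) matching criterion.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted entities against gold entities by exact match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations for one document.
gold = {"2,4-dinitrotoluene", "citric acid", "toluene"}
predicted = {"2,4-dinitrotoluene", "citric acid", "acid", "DNA"}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

A recogniser used as a parser preprocessor might favour recall, while one used to mark up published text might favour precision. The same counts support either trade-off.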

DELPH-IN
DELPH-IN: informal consortium of 16 groups (EU, Asia, US) that develops multilingual resources for deep language processing
– hand-written grammars in feature structure formalism, plus statistical ranking
– English Resource Grammar (ERG): approx 90% coverage of edited text
ERG demo
– Metal reagents are compounds often utilized in synthesis.

Automatic web page generation
Using publication lists to find links between people and to construct summaries
– a publication title such as ‘Generating research websites using summarisation techniques’ gives NPs like ‘summarisation techniques’
– cluster these terms
– locate co-authors, summarise collaborations
Web page generation demo

Collaboration summaries
Lawrence C Paulson collaborated with Cristiano Longo and Giampaolo Bella from 1997 to 2003 on ‘formal verification’, ‘industrial payment and nonrepudiation protocol’, ‘kerberos authentication system’ and ‘secrecy goals’ and in 2006 on ‘cardholder registration in Set’ and ‘accountability protocols’.

Argumentative Zoning
Finding rhetorical structure in scientific texts automatically
– Research goals
– Criticism and contrast
– Intellectual ancestry
Robust Argumentative Zoning demo
– input text (ASCII via Acrobat)
Usages: search, bibliometrics, reviewing support, training new researchers

Theme: ambiguity
levels: morphology, syntax, semantic, lexical, discourse
resolution: local ambiguity, syntax as filter for morphology, selectional restrictions
ranking: parse ranking, WSD, anaphora resolution
processing efficiency: chart parsing
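
As an illustration of the last point, here is a minimal CKY-style chart recogniser. It is one common form of chart parsing, not necessarily the variant used in the course. The chart stores each analysed span once, so ambiguous substrings are not re-analysed. The toy grammar (in Chomsky normal form) and the lexicon are hypothetical.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form: (left, right) -> possible parent categories.
BINARY = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("Det", "N"): {"NP"},
}
# Toy lexicon: word -> possible categories.
LEXICON = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}


def cky_recognise(words, start="S"):
    """Return True if the grammar derives the word sequence from `start`."""
    n = len(words)
    chart = defaultdict(set)  # chart[(i, j)] holds categories spanning words[i:j]
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(LEXICON.get(word, ()))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # every split point of the span
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY.get((left, right), set())
    return start in chart[(0, n)]


print(cky_recognise("the dog saw the cat".split()))
print(cky_recognise("dog the saw".split()))
```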

Theme: evaluation
training data and test data
reproducibility
baseline
ceiling
module evaluation vs application evaluation
nothing is perfect!
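
One simple baseline is to assign every test item the label seen most often in the training data. The sketch below assumes that choice of baseline (others are possible) and uses hypothetical label sequences. The label is picked from the training data only, and accuracy is measured on separate test data.

```python
from collections import Counter


def majority_baseline(train_labels, test_labels):
    """Pick the most frequent training label and score it on the test labels."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == most_common_label)
    return most_common_label, correct / len(test_labels)


# Hypothetical part-of-speech labels.
train = ["N", "N", "V", "N", "Det", "N"]
test = ["N", "V", "N", "Det"]

label, accuracy = majority_baseline(train, test)
print(f"always predict {label!r}: accuracy {accuracy:.2f}")
```

A system's score is only informative relative to such a baseline and to a ceiling, such as agreement between human annotators.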

Modules and algorithms
different processing modules
different applications blend modules differently
many different styles of algorithm:
– FSAs and FSTs
– Markov models and HMMs
– CFG (and probabilistic CFGs)
– constraint-based frameworks
– inheritance hierarchies (WordNet), decision trees (WSD)
– classifiers (Naive Bayes)
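
As one example from this list, here is a minimal sketch of a Naive Bayes classifier with add-one smoothing. Applying it to choosing a sense of "bank" is an illustrative choice made here. The training examples, sense labels and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """examples: iterable of (tokens, label). Returns the counts the classifier needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def classify_nb(tokens, model):
    """Return the label maximising log P(label) + sum of log P(token | label)."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_examples)
        # Add-one smoothing: every vocabulary word gets one extra count.
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokens:
            if token in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical context words for two senses of "bank".
examples = [
    (["river", "water", "bank"], "bank/river"),
    (["money", "loan", "bank"], "bank/finance"),
    (["deposit", "money"], "bank/finance"),
]
model = train_nb(examples)
print(classify_nb(["loan", "money"], model))
print(classify_nb(["water"], model))
```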

More about language and speech processing...
Information Retrieval course
MPhil in Advanced Computer Science:
– language and speech modules
– in collaboration with speech group from Engineering
– more info soon on ACS pages