October 2005, CSA3180 NLP

CSA3180 Natural Language Processing
Introduction and Course Overview

Acknowledgement
Material for some of these slides is taken from J Nivre, University of Gothenburg, Sweden

Why Language and Computers
Engineering
  – NLP is concerned with the design and implementation of effective NL input and output components for computational systems (Robert Dale 2000)
Scientific
  – The use of computers for linguistic research and applications

NLP is Interdisciplinary
Linguistics
  – Theoretical
  – Applied
Computer Science
  – Algorithms
  – Compiling Techniques
Artificial Intelligence
  – Understanding, reasoning
  – Intelligent Action

Uszkoreit’s (2000) Five Points
Solving the human language puzzle
  – by implementing complex theories directly
Teaching computers to communicate with people
  – by exploiting natural modes of communication
Friendly software should listen and speak
  – through development of multimodal communication
Machines can help people communicate with each other
  – by developing multilingual applications
Language is the fabric of the web
  – through language technology for knowledge management

Application Areas
Document Processing
  – Classification
  – Summarisation
  – Information Extraction
Question Answering
  – Information Retrieval
  – Dialogue
Multilinguality
  – Machine Translation
  – Translation tools
Multimodality
  – Speech
  – Intonation
  – Image

Basic Problems
Analysis
  – Conversion of NL input to internal representations
Generation
  – Conversion of internal representations to NL output
Issues
  – What kind of input/output/representations
  – Evaluation
  – Learning

Levels of Linguistic Knowledge
Phonetics/Phonology: sound structure
Morphology: word structure
Syntax: sentence structure
Semantics: meanings
Pragmatics: use of language in context
Discourse: paragraphs, texts, dialogues

Ambiguity
Morpho-Syntactic
  – We saw her duck
Lexical Semantic
  – They went to the bank
Structural Semantic
  – Young men and women
Referential
  – She did it
Pragmatic
  – Can you pass the salt
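
The morpho-syntactic example can be made concrete with a small chart parser. The sketch below counts how many parse trees a toy grammar assigns to "We saw her duck". The grammar, the category names (including the small-clause label SC) and the function count_parses are illustrative assumptions, not part of the course material; the counting itself is a standard CKY-style dynamic programme over binary rules.

```python
from collections import defaultdict

# Toy lexicon (an assumption for illustration): "her" is either a
# determiner or a full noun phrase, "duck" is either a noun or a bare verb.
LEXICON = {
    "we": {"NP"},
    "saw": {"V"},
    "her": {"Det", "NP"},
    "duck": {"N", "VP"},
}

# Binary rules as (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),   # saw [her duck]
    ("VP", "V", "SC"),   # saw [her duck] as a small clause
    ("NP", "Det", "N"),
    ("SC", "NP", "VP"),
]


def count_parses(words, root="S"):
    """Return the number of distinct parse trees rooted in `root`."""
    n = len(words)
    # chart[(i, j)][A] = number of trees with root A spanning words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICON.get(word.lower(), ()):
            chart[(i, i + 1)][category] += 1
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                left, right = chart[(start, mid)], chart[(mid, end)]
                for parent, b, c in BINARY_RULES:
                    if b in left and c in right:
                        chart[(start, end)][parent] += left[b] * right[c]
    return chart[(0, n)][root]


if __name__ == "__main__":
    print(count_parses("We saw her duck".split()))
```

With this toy grammar the call returns 2: one tree in which "her duck" is a noun phrase, and one in which "her" is the object and "duck" is a verb.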

Ways of Studying NLP
By Application
  – MT, IE, IR etc.
By Approach
  – rational vs. empirical
By Linguistic Level
  – morphology, syntax etc.
By Algorithm

Algorithms
State Machines
  – automata and transducers
Rule Systems
  – regular and context free grammars
Search
  – top-down/bottom-up parsing
Probabilistic algorithms
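
As a minimal illustration of the state-machine family, the sketch below implements a deterministic finite-state automaton that recognises the toy language b a a+ ! (a "b", at least two "a"s, then "!"). The language, the state numbering and the function name accepts are assumptions chosen for the example, not something the slides specify.

```python
# Transition table of a deterministic finite-state automaton.
# Keys are (current state, input symbol); a missing key means "reject".
TRANSITIONS = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,  # loop: any number of further a's
    (3, "!"): 4,
}
START_STATE = 0
ACCEPTING_STATES = {4}


def accepts(string):
    """Return True if the automaton ends in an accepting state."""
    state = START_STATE
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition defined for this symbol
            return False
    return state in ACCEPTING_STATES


if __name__ == "__main__":
    for candidate in ["baa!", "baaaa!", "ba!", "baa"]:
        print(candidate, accepts(candidate))
```

A transducer extends the same table so that each transition also emits an output symbol; the recognition loop stays the same.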

Approach in this Course
Part I – Algorithms
Words [3]
  – Finite State Algorithms
  – Morphological Processing
Sentences [3]
  – Parsing
  – (Generation)
Texts [3]
  – Tagging
  – Chunking
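
To give a feel for what tagging and chunking involve, here is a minimal sketch: a dictionary-lookup tagger followed by a chunker that groups an optional determiner, any adjectives and one or more nouns into a noun-phrase chunk. The tiny lexicon, the Penn-Treebank-style tag names, the default tag and both function names are assumptions made for illustration; practical taggers are usually statistical and trained on annotated text.

```python
# Tiny hand-written tag lexicon (an assumption for this sketch).
TAG_LEXICON = {
    "the": "DT", "a": "DT",
    "little": "JJ", "yellow": "JJ",
    "dog": "NN", "cat": "NN",
    "barked": "VBD",
    "at": "IN",
}


def tag(words, default="NN"):
    """Tag each word by lexicon lookup; unknown words get `default`."""
    return [(word, TAG_LEXICON.get(word.lower(), default)) for word in words]


def np_chunks(tagged):
    """Return noun-phrase chunks matching the tag pattern DT? JJ* NN+."""
    chunks = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        if tagged[j][1] == "DT":                 # optional determiner
            j += 1
        while j < n and tagged[j][1] == "JJ":    # any number of adjectives
            j += 1
        k = j
        while k < n and tagged[k][1] == "NN":    # one or more nouns
            k += 1
        if k > j:                                # found at least one noun
            chunks.append([word for word, _ in tagged[i:k]])
            i = k
        else:                                    # no chunk starts here
            i += 1
    return chunks


if __name__ == "__main__":
    sentence = "the little yellow dog barked at the cat".split()
    print(np_chunks(tag(sentence)))
```

For the sample sentence this yields two chunks, "the little yellow dog" and "the cat", leaving the verb and preposition outside any chunk.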

Approach in this Course
Part II – Topics and Tools
Semantics [6]
Statistics [6]
Information Extraction [6]
Machine Translation [4]
Information Retrieval [3]

Course Information
Course Website
Reference Text
  – Jurafsky and Martin
Tools
  – Prolog: SWI-Prolog
  – NLTK: nltk.sourceforge.net