Natural Language Processing [05 hours/week, 09 Credits] [Theory]
Eighth Semester: Computer Science & Engineering
Dr. M. B. Chandak
Course Contents
The course is divided into the following major components:
- Basics of Natural Language Processing and language modeling techniques
- Syntactic and semantic parsing
- NLP applications: information extraction and machine translation
Total units: 6
- Units 1 and 2: Basics and modeling techniques
- Units 3 and 4: Syntactic and semantic parsing
- Units 5 and 6: Information extraction and machine translation
Course Pre-requisites
- Basic knowledge of English grammar
- Theoretical Foundations of Computer Science [TOFCS]
- Extension of language processing
- Python and open-source tools
- Active class participation and regularity
Unitized course
Unit-I: Introduction
NLP tasks in syntax, semantics, and pragmatics. Key issues and applications such as information extraction, question answering, and machine translation. The problem of ambiguity. The role of machine learning. Brief history of the field.
Unit-II: N-gram Language Models
Role of language models. Simple N-gram models. Estimating parameters and smoothing. Evaluating language models. Part-of-speech tagging and sequence labeling: lexical syntax, Hidden Markov Models, Maximum Entropy models.
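As a preview of the N-gram material in Unit II, the core idea can be sketched in a few lines. The snippet below is a minimal, illustrative bigram model with add-one (Laplace) smoothing over a tiny made-up corpus; it is not course material, just a sketch of the estimation formula P(w2 | w1) = (count(w1, w2) + 1) / (count(w1) + V).

```python
# Minimal sketch of a bigram language model with add-one (Laplace)
# smoothing. The corpus is a toy example chosen for illustration.
from collections import Counter

corpus = "the duck swam the duck quacked".split()
V = len(set(corpus))  # vocabulary size, used by Laplace smoothing

bigrams = Counter(zip(corpus, corpus[1:]))  # counts of adjacent word pairs
unigrams = Counter(corpus)                  # counts of single words

def p_bigram(w1: str, w2: str) -> float:
    """Laplace-smoothed P(w2 | w1) = (count(w1,w2) + 1) / (count(w1) + V)."""
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

print(p_bigram("the", "duck"))  # (2 + 1) / (2 + 4) = 0.5
```

Smoothing ensures that word pairs never seen in training still receive a small nonzero probability, which is essential when evaluating a language model on new text.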
Unitized course
Unit-III: Syntactic Parsing
Grammar formalisms and treebanks. Efficient parsing for context-free grammars (CFGs). Statistical parsing and probabilistic CFGs (PCFGs). Lexicalized PCFGs.
Unit-IV: Semantic Parsing
Lexical semantics and word-sense disambiguation. Compositional semantics. Semantic role labeling and semantic parsing.
Unit-V: Information Extraction
Named entity recognition and relation extraction. IE using sequence labeling. Automatic summarization. Subjectivity and sentiment analysis.
Unit-VI: Machine Translation
Basic issues in MT. Statistical translation, word alignment, phrase-based translation, and synchronous grammars.
Text and Reference Books
1. D. Jurafsky and J. Martin; Speech and Language Processing; 2nd edition, Pearson Education, 2009.
2. J. Allen; Natural Language Understanding; 2nd edition, Benjamin/Cummings.
3. E. Charniak; Statistical Language Learning; MIT Press, 1993.
Web resources
Course Outcomes
CO | Course Outcome | Unit
1 | Ability to differentiate various NLP tasks and understand the problem of ambiguity. | Unit 1
2 | Ability to model and preprocess language. | Unit 2
3 | Ability to perform syntactic parsing using different grammars. | Unit 3
4 | Ability to perform semantic parsing and word-sense disambiguation. | Unit 4
5 | Ability to perform information extraction and machine translation. | Units 5, 6
Grading Scheme: Internal Examination
Total: 40 marks
- Three tests; best two counted [15 x 2 = 30 marks]. The third test is generally more demanding.
- Distribution of the remaining 10 marks:
  (i) Class participation: 3 marks [may include attendance]
  (ii) Assignment 1: 4 marks [Design/Coding] {after T1}
  (iii) Assignment 2: 3 marks [Objective/Coding] {after T2}
  (iv) Challenging problems [individual]: 7 marks
Introduction: Basics
Natural Language Processing (NLP) is the study of the computational treatment of natural (human) language; in other words, teaching computers how to understand (and generate) human language. It is a field at the intersection of computer science, artificial intelligence, and computational linguistics. NLP systems take strings of words (sentences) as their input and produce structured representations capturing the meaning of those strings as their output. The nature of this output depends heavily on the task at hand.
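The idea of mapping a raw string to a structured representation can be made concrete with a deliberately tiny sketch. The lexicon and tag names below are hypothetical, hand-made for illustration; a real system would use a trained tagger or parser rather than dictionary lookup.

```python
# Toy NLP "pipeline" (hypothetical lexicon and tags): map a raw sentence
# string to a structured representation -- here, (token, part-of-speech)
# pairs. Real systems replace this lookup with statistical models.

LEXICON = {
    "i": "PRON", "made": "VERB", "her": "PRON",
    "duck": "NOUN",  # 'duck' is actually ambiguous (NOUN/VERB); we pick one naively
}

def analyze(sentence: str) -> list[tuple[str, str]]:
    """Tokenize on whitespace and look each token up in the lexicon."""
    tokens = sentence.lower().replace(".", "").split()
    return [(tok, LEXICON.get(tok, "UNK")) for tok in tokens]

print(analyze("I made her duck"))
# [('i', 'PRON'), ('made', 'VERB'), ('her', 'PRON'), ('duck', 'NOUN')]
```

Even this toy version shows the input/output contract: unstructured text in, structured analysis out.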
Introduction: NLP tasks
Processing language is a complex task, so a modular approach is followed.
Conferences: ACL/NAACL, EMNLP, SIGIR, AAAI/IJCAI, Coling, HLT, EACL, AMTA/MT Summit, ICSLP/Eurospeech
Journals: Computational Linguistics, TACL, Natural Language Engineering, Information Retrieval, Information Processing and Management, ACM Transactions on Information Systems, ACM TALIP, ACM TSLP
University centers: Berkeley, Columbia, Stanford, CMU, JHU, Brown, UMass, MIT, UPenn, USC/ISI, Illinois, Michigan, UW, Maryland, Toronto, Edinburgh, Cambridge, Sheffield, Saarland, Trento, Prague, QCRI, NUS, and many others
Industrial research sites: Google, MSR, Yahoo!, FB, IBM, SRI, BBN, MITRE, AT&T Labs
Archives: The ACL Anthology; The ACL Anthology Network (AAN)
Why NLP is complex
Natural language is extremely rich in form and structure, and very ambiguous: how should meaning be represented, and which structures map to which meaning structures? One input can mean many different things, and ambiguity can arise at different levels:
- Lexical (word-level) ambiguity: different meanings of words
- Syntactic ambiguity: different ways to parse the sentence
- Interpreting partial information: how to interpret pronouns
- Contextual information: the context of a sentence may affect its meaning
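Lexical ambiguity in particular is easy to demonstrate: one surface word maps to several candidate analyses. The sense inventory below is a toy, hand-written stand-in (a real system would consult a resource such as WordNet).

```python
# Toy sense inventory (hypothetical glosses, not a real lexical resource):
# lexical ambiguity means one word form has multiple candidate readings.

SENSES = {
    "duck": [("NOUN", "waterfowl"), ("VERB", "lower the head quickly")],
    "make": [("VERB", "create"), ("VERB", "cook")],
    "bank": [("NOUN", "financial institution"), ("NOUN", "river edge")],
}

def candidate_analyses(word: str) -> list[tuple[str, str]]:
    """Return every (POS, gloss) reading a word could take in isolation."""
    return SENSES.get(word.lower(), [("UNK", "unknown")])

for pos, gloss in candidate_analyses("duck"):
    print(f"duck/{pos}: {gloss}")
```

Out of context, every reading is live; resolving which one is intended is precisely the disambiguation problem the following slides illustrate.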
Example: Ambiguity
Consider the sentence: "I made her duck."
There are various levels of ambiguity:
- How many different interpretations does this sentence have?
- What are the reasons for the ambiguity?
- The categories of knowledge of language can be thought of as ambiguity-resolving components.
- How can each ambiguous piece be resolved?
- Does speech input make the sentence even more ambiguous? Yes: word boundaries must also be decided.
Example: Ambiguity
Some interpretations of "I made her duck":
- I cooked duck for her.
- I cooked the duck belonging to her.
- I created a toy duck which she owns.
- I caused her to quickly lower her head or body.
- I used magic and turned her into a duck.
Sources of the ambiguity:
- duck: morphologically and syntactically ambiguous (noun or verb)
- her: syntactically ambiguous (dative or possessive)
- make: semantically ambiguous (cook or create)
- make: syntactically ambiguous (transitive, or taking a direct object and a verb)
Example: Ambiguity Resolution
Ambiguity resolution is possible by modeling language. For example:
- Part-of-speech tagging: deciding whether duck is a verb or a noun.
- Word-sense disambiguation: deciding whether make means create or cook.
- Lexical disambiguation: resolving part-of-speech and word-sense ambiguities are two important kinds of lexical disambiguation.
- Syntactic ambiguity: her duck is an example of syntactic ambiguity, which can be addressed by probabilistic parsing.
Language: Knowledge components
- Phonology: concerns how words are related to the sounds that realize them.
- Morphology: concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language.
- Syntax: concerns how words can be put together to form correct sentences, what structural role each word plays in the sentence, and which phrases are subparts of other phrases.
- Semantics: concerns what words mean and how these meanings combine in sentences to form sentence meanings; the study of context-independent meaning.
Language: Knowledge components
- Pragmatics: concerns how sentences are used in different situations and how use affects the interpretation of the sentence.
- Discourse: concerns how the immediately preceding sentences affect the interpretation of the next sentence; for example, interpreting pronouns and the temporal aspects of the information.
- World knowledge: includes general knowledge about the world, including what each language user must know about the other's beliefs and goals.