Natural Language Processing
NLP Intro
Language is meant for communicating about the world. By studying language, we can come to understand more about the world. If we can succeed at building computational models of language, we will have a powerful tool for communicating about the world. We look at how we can exploit knowledge about the world, in combination with linguistic facts, to build computational natural language systems. The NLP problem can be divided into two tasks: processing written text, using lexical, syntactic and semantic knowledge of the language as well as the required real-world information; and processing spoken language, using all of the above plus additional knowledge about phonology and enough added information to handle the further ambiguities that arise in speech.
NLP Intro
The Problem: English sentences are incomplete descriptions of the information that they are intended to convey. “Some dogs are outside” is incomplete – it can mean: “Some dogs are on the lawn.” “Three dogs are on the lawn.” “Moti, Hira & Buzo are on the lawn.”
The good side: Language allows speakers to be as vague or as precise as they like. It also allows speakers to leave out things that the hearers already know.
NLP Intro
The Problem: The same expression means different things in different contexts. “Where’s the water?” (in a lab, it must be pure) “Where’s the water?” (when you are thirsty, it must be potable, i.e., drinkable) “Where’s the water?” (dealing with a leaky roof, it can be filthy)
The good side: Language lets us communicate about an infinite world using a finite number of symbols.
The Problem: There are lots of ways to say the same thing: “Mary was born on October 11.” “Mary’s birthday is October 11.”
The good side: When you know a lot, facts imply each other. Language is intended to be used by agents who know a lot.
Steps in NLP
1. Morphological Analysis: Individual words are analyzed into their components, and non-word tokens such as punctuation are separated from the words. This step tries to extract the root word from a declined or inflectional form after removing suffixes and prefixes. Ex: getting the root “push” from the declined forms pushes, pushed, pushing, etc. (a toy sketch of this step follows below). It also assigns appropriate syntactic categories such as noun, verb, adjective, etc. to all words in the sentence.
2. Syntactic Analysis: Uses the result of morphological analysis to build a structural description of the sentence based on grammatical rules. This step is called parsing. Creating a parse tree is the first step towards understanding a sentence.
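Below is a minimal Python sketch of the morphological step described above: it strips a few common suffixes to recover a root and looks the root up in a tiny hand-built lexicon to assign a syntactic category. The suffix list and lexicon are illustrative placeholders, not a real morphological analyzer.

```python
# Toy morphological analysis: strip a suffix, recover the root, and
# assign a syntactic category from a tiny hand-built lexicon.
SUFFIXES = ["ing", "ed", "es", "s"]              # checked in this order
LEXICON = {"push": "verb", "dog": "noun", "apple": "noun"}

def stem(word):
    """Strip the first suffix whose removal leaves a known root."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in LEXICON:
            return word[:-len(suffix)]
    return word

def analyze(word):
    root = stem(word)
    return root, LEXICON.get(root, "unknown")

for form in ["pushes", "pushed", "pushing", "push"]:
    print(form, "->", analyze(form))             # all yield ('push', 'verb')
```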
Steps in NLP
3. Semantic Analysis: The structures created by the syntactic analyzer are assigned meanings. This step maps individual words to the corresponding objects in the knowledge base and combines the words with each other using semantic rules.
4. Discourse Integration: The meaning of an individual sentence may depend on the sentences that precede it and may influence the meanings of the sentences that follow it. Ex: the word “it” in the sentence “John wanted it” depends upon the prior discourse context.
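As a rough illustration of the word-to-knowledge-base mapping, the sketch below attaches knowledge-base objects to the words of a parsed sentence; the tiny knowledge base and its entries are hypothetical placeholders.

```python
# Toy semantic analysis: map words from a parsed sentence onto objects
# in a (hypothetical) knowledge base.
KNOWLEDGE_BASE = {
    "john": {"type": "person", "name": "John"},
    "apple": {"type": "fruit", "colour": "red"},
}

def attach_meanings(words):
    """Replace each word with its knowledge-base object where one exists."""
    return [KNOWLEDGE_BASE.get(w.lower(), {"type": "unknown", "word": w})
            for w in words]

print(attach_meanings(["John", "ate", "the", "apple"]))
```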
Steps in NLP
5. Pragmatic Analysis: This step deals with the intended meaning of sentences as used in different contexts; the context affects the interpretation of the sentence. Ex: “John saw Mike in the garden with a cat.” The structure representing what was said is reinterpreted to determine what was actually meant. Ex: “Do you know what time it is?” should be interpreted as a request to be told the time.
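The reinterpretation step can be illustrated with a toy rule set: the sketch below matches a few hand-written patterns and maps the literal utterance to the speech act that was probably intended. The patterns and act labels are illustrative only.

```python
# Toy pragmatic reinterpretation: map the literal form of an utterance
# to the speech act that was probably intended.
import re

INDIRECT_REQUESTS = [
    (re.compile(r"do you know what time it is", re.I), "REQUEST: tell the time"),
    (re.compile(r"can you pass the salt", re.I), "REQUEST: pass the salt"),
]

def intended_act(utterance):
    for pattern, act in INDIRECT_REQUESTS:
        if pattern.search(utterance):
            return act
    return "LITERAL: " + utterance

print(intended_act("Do you know what time it is?"))  # REQUEST: tell the time
```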
Summary
Results of each of the main processes combine to form a natural language system. All of the processes are important in a complete natural language understanding system. Not all programs are written with exactly these components; sometimes two or more of them are collapsed.
Syntactic Processing
Syntactic processing is the step in which a flat input sentence is converted into a hierarchical structure that corresponds to the units of meaning in the sentence. This process is called parsing. It plays an important role in natural language understanding systems for two reasons: Semantic processing must operate on sentence constituents. If there is no syntactic parsing step, then the semantic system must decide on its own constituents; if parsing is done, on the other hand, it constrains the number of constituents that semantics can consider. Syntactic parsing is computationally less expensive than semantic processing, so it can play a significant role in reducing overall system complexity.
Syntactic Processing
Although it is often possible to extract the meaning of a sentence without using grammatical facts, it is not always possible to do so. Consider the examples: “The satellite orbited Mars” versus “Mars orbited the satellite.” Almost all systems that are actually used have two main components: a declarative representation, called a grammar, of the syntactic facts about the language; and a procedure, called a parser, that compares the grammar against input sentences to produce parsed structures.
Grammars and Parsers
In the context of NLP, parsing means analyzing a sentence syntactically to assign syntactic tags (subject, verb, object, etc.), to provide constituent structure (noun phrase, verb phrase, etc.), or to characterize the syntactic relations between two words. Parsing techniques are broadly divided into rule-based parsing and statistical parsing.
Rule-based parsing: the syntactic structure of the language is provided in the form of linguistic rules, which can be coded as production rules similar to context-free rules. Production rules are defined using non-terminal and terminal symbols.
Statistical parsing: requires large corpora; linguistic knowledge is represented as statistical parameters or probabilities.
Grammars and Parsers
Once the grammar rules are defined, a sentence is parsed using the grammar and a tree-like structure is built if the sentence is syntactically correct. This tree is called a parse tree. Parsing can be done in two ways: top-down parsing and bottom-up parsing.
Bottom-up parsing: we start with the words in the sentence and apply grammar rules in the backward direction until a single tree is produced whose root matches the start symbol.
Top-down parsing: we start with the start symbol and apply grammar rules in the forward direction until the terminal symbols of the parse tree correspond to the words in the sentence (a small top-down sketch follows below).
The choice between these two approaches is similar to the choice between forward and backward reasoning in other problem-solving tasks. The most important consideration is the branching factor: is it greater going backward or forward? Sometimes the two approaches are combined into a single method called “bottom-up parsing with top-down filtering”.
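The following is a minimal sketch of top-down parsing, assuming a small hand-written grammar (a toy fragment, not the one on the next slide): a recursive-descent recognizer that expands rules from the start symbol S and matches the words of the sentence left to right.

```python
# Toy top-down (recursive-descent) recognizer for a tiny grammar.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "Noun"], ["Name"]],
    "VP": [["Verb", "NP"], ["Verb"]],
}
LEXICON = {"Det": {"the", "an"}, "Noun": {"apple", "boy"},
           "Name": {"John"}, "Verb": {"ate", "eats"}}

def parse(symbol, words, pos):
    """Yield every position reachable after matching `symbol` at `pos`."""
    if symbol in LEXICON:                        # terminal category
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            yield pos + 1
        return
    for rule in GRAMMAR[symbol]:                 # expand the non-terminal
        positions = [pos]
        for part in rule:
            positions = [q for p in positions for q in parse(part, words, p)]
        yield from positions

def recognize(sentence):
    words = sentence.split()
    return len(words) in parse("S", words, 0)

print(recognize("John ate the apple"))           # True
print(recognize("the boy eats"))                 # True
print(recognize("apple the John"))               # False
```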
Grammars and Parsers
Consider a simple context-free grammar for English. The symbol -> is used for ‘defined as’; the vertical bar | separates alternative definitions (OR); S stands for sentence, NP for noun phrase, and VP for verb phrase. Simple sentences recognized by this grammar are: a) “the boy eats an apple”  b) “the cute girl sings a song”
Rules:
<S> -> <NP><VP>
<NP> -> <Det><Noun>
<NP> -> <Det><Adj><Noun>
<NP> -> <Adj><Noun>
<VP> -> <Verb>
<VP> -> <Verb><NP>
Dictionary Words:
<Det> -> a | an | the
<Noun> -> girl | apple | song
<Adj> -> cute | smart
<Verb> -> sings | eats | ate
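This grammar can be written out almost verbatim for a chart parser. The sketch below assumes NLTK is installed; note that 'boy' is added to the noun list so that example sentence (a) parses, since the slide's dictionary does not list it.

```python
# The slide's grammar, written for NLTK's chart parser (assumes NLTK is
# installed; 'boy' is added because the slide's dictionary omits it).
import nltk

grammar = nltk.CFG.fromstring("""
S    -> NP VP
NP   -> Det Noun | Det Adj Noun | Adj Noun
VP   -> Verb | Verb NP
Det  -> 'a' | 'an' | 'the'
Noun -> 'girl' | 'apple' | 'song' | 'boy'
Adj  -> 'cute' | 'smart'
Verb -> 'sings' | 'eats' | 'ate'
""")

parser = nltk.ChartParser(grammar)
for sentence in ["the boy eats an apple", "the cute girl sings a song"]:
    for tree in parser.parse(sentence.split()):
        tree.pretty_print()                      # prints the parse tree
```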
A parse tree
“John ate the apple.”
Grammar rules:
S -> NP VP
VP -> V NP
NP -> NAME
NP -> ART N
NAME -> John
V -> ate
ART -> the
N -> apple
Parse tree:
(S (NP (NAME John)) (VP (V ate) (NP (ART the) (N apple))))
Parse tree
“The cute girl ate an apple” – the same tree can be built top-down (starting from S) or bottom-up (starting from the words):
(S (NP (Det The) (Adj cute) (Noun girl)) (VP (Verb ate) (NP (Det an) (Noun apple))))
Grammars and Parsers
The first rule can be read as “a sentence is composed of a noun phrase followed by a verb phrase”; the vertical bar means OR; ε represents the empty string. Symbols that are further expanded by rules are called non-terminal symbols. Symbols that correspond directly to strings that must be found in an input sentence are called terminal symbols. Pure context-free grammars are not effective for describing natural languages, so natural language processing systems have less in common with computer language processing systems such as compilers than might be expected.
Semantic Analysis
Definite Clause Grammar (DCG)
Advantages of Semantic Grammars
When the parse is complete, the result can be used immediately, without the additional stage of processing that would be required if a semantic interpretation had not already been performed during the parse. Many ambiguities that would arise during a strictly syntactic parse can be avoided, since some of the interpretations do not make sense semantically and thus cannot be generated by a semantic grammar. Syntactic issues that do not affect the semantics can be ignored.
The drawbacks of using semantic grammars are: the number of rules required can become very large, since many syntactic generalizations are missed; and because the number of grammar rules may be very large, the parsing process may be expensive.
Case Grammars
Case grammars provide a different approach to the problem of how syntactic and semantic interpretation can be combined. Grammar rules are written to describe syntactic rather than semantic regularities, but the structures that the rules produce correspond to semantic relations rather than to strictly syntactic ones. Consider two sentences: “Susan printed the file.” and “The file was printed by Susan.” The case grammar interpretation of both sentences would be: (printed (agent Susan) (object file))
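A toy sketch of how both surface forms could be reduced to the same case frame is shown below; the (predicate / agent / object) layout follows the slide, while the extraction rules are hypothetical patterns that only handle these two sentence shapes.

```python
# Toy case-frame extraction: both the active and the passive sentence
# reduce to the same (predicate, agent, object) frame.
def case_frame(sentence):
    words = sentence.lower().rstrip(".").split()
    if "by" in words:                            # passive: "X was printed by Y"
        return {"predicate": "printed",
                "agent": words[words.index("by") + 1],
                "object": " ".join(words[:words.index("was")])}
    return {"predicate": "printed",              # active: "Y printed X"
            "agent": words[0],
            "object": " ".join(words[2:])}

print(case_frame("Susan printed the file."))
print(case_frame("The file was printed by Susan."))
# both print {'predicate': 'printed', 'agent': 'susan', 'object': 'the file'}
```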
Conceptual Parsing
Conceptual parsing is a strategy for finding both the structure and the meaning of a sentence in one step. It is driven by a dictionary that describes the meanings of words as conceptual dependency (CD) structures. The approach is similar to case grammar, but CD usually provides a greater degree of predictive power.
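As a rough illustration, the sketch below builds a CD-style frame for “John ate the apple” using Schank's INGEST primitive; the lexicon entries and frame layout are simplified placeholders.

```python
# Toy conceptual dependency structure for "John ate the apple",
# using Schank's INGEST primitive for eating.
CD_LEXICON = {
    "ate": {"primitive": "INGEST", "tense": "past"},
}

def cd_structure(actor, verb, obj):
    """Build a CD-style frame by looking the verb up in the CD lexicon."""
    frame = dict(CD_LEXICON[verb])
    frame.update({"actor": actor, "object": obj})
    return frame

print(cd_structure("John", "ate", "apple"))
# {'primitive': 'INGEST', 'tense': 'past', 'actor': 'John', 'object': 'apple'}
```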
Discourse and Pragmatic Processing
There are a number of important relationships that may hold between phrases and parts of their discourse contexts, including:
Identical entities. Consider the text: “Bill had a red balloon. John wanted it.” The word “it” should be identified as referring to the red balloon. This type of reference is called anaphora (a naive resolution sketch follows below).
Parts of entities. Consider the text: “Sue opened the book she just bought. The title page was torn.” The phrase “title page” should be recognized as part of the book that was just bought.
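A naive sketch of anaphora resolution is shown below: it resolves “it” to the most recently mentioned noun from a small hand-listed candidate set. Real resolvers also use gender and number agreement, syntactic constraints, and salience.

```python
# Naive anaphora resolution: "it" is resolved to the most recently
# mentioned noun from a small hand-listed candidate set.
NOUNS = {"balloon", "book", "page", "dog"}

def resolve_it(discourse):
    referent = None
    for word in discourse.lower().replace(".", " ").split():
        if word in NOUNS:
            referent = word                      # remember the latest noun
        elif word == "it" and referent:
            print("'it' ->", referent)
    return referent

resolve_it("Bill had a red balloon. John wanted it.")   # 'it' -> balloon
```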
Discourse and Pragmatic Processing
Parts of actions. Consider the text: “John went on a business trip to New York. He left on an early morning flight.” Taking a flight should be recognized as part of going on a trip.
Entities involved in actions. Consider the text: “My house was broken into last week. They took the TV and the stereo.” The pronoun “they” should be recognized as referring to the burglars who broke into the house.
Elements of sets. Consider the text: “The decals we have in stock are stars, the moon, item and a flag. I’ll take two moons.” “Moons” here means moon decals.
Discourse and Pragmatic Processing
Names of individuals: “Dave went to the movies.”
Causal chains: “There was a big snow storm yesterday. The schools were closed today.”
Planning sequences: “Sally wanted a new car. She decided to get a job.”
Implicit presuppositions: “Did Joe fail CSE402?”
Discourse and Pragmatic Processing
We focus on using the following kinds of knowledge: the current focus of the dialogue; a model of each participant’s current beliefs; the goal-driven character of dialogue; and the rules of conversation shared by all participants.
Thank You!!!