Natural Language Processing

According to research at an Elingsh uinervtisy, it deosn’t mttaer in what oredr the ltteers in a wrod are, the olny iprmoetnt tihng is that the frist and lsat ltteer are in the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae we do not raed ervey lteter by islelf but the wrod as a wlohe. Ceehiro. That’ll sutff the splelchekcer. (Excerpt from Sideswipe, NZ Herald, 22/09/03)
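Text like the passage above can be produced mechanically. The following is a minimal Python sketch, assuming that only the interior letters of each word are shuffled and that words of three letters or fewer are left alone; the function names `scramble_word` and `scramble_text` are hypothetical and not taken from the slides.

```python
import random
import re


def scramble_word(word, rng):
    """Shuffle the interior letters of a word, keeping first and last in place."""
    if len(word) <= 3:
        # Nothing to shuffle: at most one interior letter.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text, seed=0):
    """Scramble every alphabetic run in the text; punctuation and spaces are untouched."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same output
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(), rng), text)


if __name__ == "__main__":
    print(scramble_text("Reading scrambled words is surprisingly easy."))
```

Because the shuffle is random, a word may occasionally come out unchanged; the sketch makes no attempt to avoid that.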

1970s Example: SHRDLU
Person: PICK UP A BIG RED BLOCK.
Computer: OK. (does it)
Person: GRASP THE PYRAMID.
Computer: I DON'T UNDERSTAND WHICH PYRAMID YOU MEAN.
Person: FIND A BLOCK WHICH IS TALLER THAN THE ONE YOU ARE HOLDING AND PUT IT INTO THE BOX.
Computer: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING.
Computer: OK. (does it)
Person: WHAT DOES THE BOX CONTAIN?
Computer: THE BLUE PYRAMID AND THE BLUE BLOCK.
Person: WHAT IS THE PYRAMID SUPPORTED BY?
Computer: THE BOX.
Person: HOW MANY BLOCKS ARE NOT IN THE BOX?
Computer: FOUR OF THEM.
Person: IS AT LEAST ONE OF THEM NARROWER THAN THE ONE WHICH I TOLD YOU TO PICK UP?
Computer: YES, THE RED CUBE.
Terry Winograd, MIT Ph.D. thesis.

Pomegranate

Natural language processing (NLP), also called Human Language Technology (HLT) or Natural Language Engineering (NLE), is considered a sub-field of artificial intelligence and has significant overlap with the field of computational linguistics. It is concerned with the interactions between computers and human (natural) languages. Natural language generation systems convert information from computer databases into readable human language. Natural language understanding systems convert human language into representations that are easier for computer programs to manipulate. The term natural language is used to distinguish human languages (e.g. English, Persian, Swedish) from formal or computer languages (e.g. C++, Prolog). NLP encompasses both text and speech, but work on speech processing has evolved into a separate field.

Where does it fit in the CS taxonomy?
Computers
  Artificial Intelligence
    Robotics
    Search
    Natural Language Processing
      Information Retrieval
      Machine Translation
      Language Analysis
        Semantics
        Parsing
        …
      …
  Algorithms
  Databases
  Networking

Applications
Yahoo, Google, Microsoft: Information Retrieval
Monster.com, HotJobs.com (job finders): Information Extraction & Information Retrieval
Systran powers Babelfish, Google: Machine Translation
Ask Jeeves: Question Answering
Myspace, Facebook, Blogspot: Processing of User-Generated Content
Tools for “business intelligence”
All “Big Companies” have (several) strong NLP research labs: IBM, Microsoft, AT&T, Xerox, Sun, etc.
Academia: research in a university environment

What is NLP? Combination of computational linguistics, artificial intelligence & cognitive science. Concentrates on interpreting text using a combination of lexical, syntactic, semantic and real world knowledge. Applications include intelligent translators, speech recognition software, information management tools and other types of communication software.

Grammar The grammar of a language is a description of the structure of that language. Grammars provide a scheme for specifying the structure of sentences and rules for combining words into correct phrases and clauses.

English Grammar English word order follows a Subject-Verb-Object (SVO) linguistic typology. The subject of a verb is the “doer” of the verb, and the object is the “doee”.
The cat | is drinking | the milk.
Subject | Verb | Object

Syntax Syntax is the study of the rules, or patterns, that govern the way the words in a sentence come together. Syntax deals with how words, which are categorised into “parts of speech” (nouns, adjectives, verbs, etc.), are combined into clauses or phrases, which in turn combine into sentences.

Syntactic Analysis Syntactic analysis involves breaking a sentence down into a hierarchical structure of phrases, allowing the study of its constituents. For example, the sentence “the big cat is drinking milk” can be broken up into the following constituents:

Syntactic Analysis
The big cat is drinking milk
  Noun Phrase
    Determiner: The
    Adjective Phrase: big
    Noun: cat
  Verb Phrase
    Auxiliary: is
    Verb: drinking
    Noun Phrase: milk
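The constituent structure on this slide can be written down as a nested data structure. The following is a minimal Python sketch, assuming each node is a tuple whose first element is the category label and whose remaining elements are either child nodes or a single word; the names `TREE`, `leaves` and `show` are hypothetical.

```python
# Each node is (label, child, child, ...); a leaf node is (label, word).
TREE = (
    "Sentence",
    ("Noun Phrase",
     ("Determiner", "The"),
     ("Adjective Phrase", "big"),
     ("Noun", "cat")),
    ("Verb Phrase",
     ("Auxiliary", "is"),
     ("Verb", "drinking"),
     ("Noun Phrase", "milk")),
)


def leaves(node):
    """Return the words of the tree in left-to-right order."""
    _label, *children = node
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


def show(node, depth=0):
    """Render the tree as indented text, one constituent per line."""
    label, *children = node
    indent = "  " * depth
    if len(children) == 1 and isinstance(children[0], str):
        return f"{indent}{label}: {children[0]}"
    lines = [indent + label]
    lines.extend(show(child, depth + 1) for child in children)
    return "\n".join(lines)


if __name__ == "__main__":
    print(show(TREE))
    print(" ".join(leaves(TREE)))
```

Reading the leaves from left to right recovers the original sentence, which is a quick check that the tree covers every word exactly once.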

A Grammar for a very small fragment of English
Implementation: Prolog
sentence --> noun_phrase, verb_phrase.
noun_phrase --> determiner, noun.
noun_phrase --> proper_noun.
determiner --> [the].
determiner --> [a].
proper_noun --> [pedro].
noun --> [man].
noun --> [apple].
verb_phrase --> verb, noun_phrase.
verb_phrase --> verb.
verb --> [eats].
verb --> [sings].

?- phrase(sentence, [the, man, eats]).
yes
?- phrase(sentence, [the, man, eats, the, apple]).
yes
?- phrase(sentence, [the, apple, eats, a, man]).
yes
?- phrase(sentence, [pedro, sings, the, pedro]).
no
?- phrase(sentence, [eats, apple, man]).
no
?- phrase(sentence, L).

L = [the, man, eats, the, man] ;
L = [the, man, eats, the, apple] ;
L = [the, man, eats, a, man] ;
L = [the, man, eats, a, apple] ;
L = [the, man, eats, pedro] ;
L = [the, man, sings, the, man] ;
L = [the, man, sings, the, apple] ;
L = [the, man, sings, a, man] ;
L = [the, man, sings, a, apple] ;
L = [the, man, sings, pedro] ;
L = [the, man, eats] ;
L = [the, man, sings] ;
L = [the, apple, eats, the, man] ;
L = [the, apple, eats, the, apple] ;
L = [the, apple, eats, a, man] ;
L = [the, apple, eats, a, apple] ;
L = [the, apple, eats, pedro] ;
L = [the, apple, sings, the, man] ;
L = [the, apple, sings, the, apple] ;
L = [the, apple, sings, a, man] ;
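To show how the same rules can both recognise and generate sentences outside Prolog, here is a minimal Python sketch of the grammar above. It is an illustration only, not the slides' implementation: the `GRAMMAR` dictionary and the functions `expand` and `recognise` are hypothetical names, and the depth-first expansion is chosen so that sentences come out in the same order as the Prolog answers. It relies on this grammar being non-recursive; a recursive grammar would make the enumeration infinite.

```python
from itertools import islice

# Each non-terminal maps to its alternative rule bodies, in the order
# the rules appear in the Prolog grammar. Anything not listed as a key
# is treated as a terminal (a word).
GRAMMAR = {
    "sentence": [["noun_phrase", "verb_phrase"]],
    "noun_phrase": [["determiner", "noun"], ["proper_noun"]],
    "determiner": [["the"], ["a"]],
    "proper_noun": [["pedro"]],
    "noun": [["man"], ["apple"]],
    "verb_phrase": [["verb", "noun_phrase"], ["verb"]],
    "verb": [["eats"], ["sings"]],
}


def expand(symbols):
    """Yield every word list derivable from a sequence of symbols, depth-first."""
    if not symbols:
        yield []
        return
    head, rest = symbols[0], symbols[1:]
    if head not in GRAMMAR:
        # Terminal: keep the word and expand whatever follows it.
        for tail in expand(rest):
            yield [head] + tail
        return
    for body in GRAMMAR[head]:
        # Non-terminal: try each rule body in turn.
        yield from expand(body + rest)


def recognise(words):
    """Return True if the word list is a sentence of the grammar."""
    return any(candidate == words for candidate in expand(["sentence"]))


if __name__ == "__main__":
    for sentence in islice(expand(["sentence"]), 5):
        print(" ".join(sentence))
    print(recognise("the apple eats a man".split()))
```

As in the Prolog session, “the apple eats a man” is accepted: the grammar only checks structure, not whether the sentence makes sense.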

Issues in Syntax
“the dog ate my homework” - Who did what?
Identify the part of speech (POS)
–Dog = noun ; ate = verb ; homework = noun
–English POS tagging
Identify collocations
–mother in law, hot dog
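The two steps named on this slide, part-of-speech tagging and collocation detection, can be illustrated with a tiny dictionary lookup. This is a minimal Python sketch under strong assumptions: the `LEXICON` is hand-written for the example sentence, the tag names are made up, and real taggers resolve ambiguity from context rather than by lookup.

```python
# Hypothetical hand-written lexicon; multi-word keys are collocations.
LEXICON = {
    "the": "DET",
    "my": "DET",
    "dog": "NOUN",
    "homework": "NOUN",
    "ate": "VERB",
    "hot": "ADJ",
    "hot dog": "NOUN",
    "mother in law": "NOUN",
}

# Collocations as word lists, longest first so longer matches win.
COLLOCATIONS = sorted(
    (entry.split() for entry in LEXICON if " " in entry),
    key=len,
    reverse=True,
)


def tag(sentence):
    """Return (token, tag) pairs, treating known collocations as single tokens."""
    words = sentence.lower().split()
    tagged = []
    i = 0
    while i < len(words):
        for colloc in COLLOCATIONS:
            if words[i:i + len(colloc)] == colloc:
                phrase = " ".join(colloc)
                tagged.append((phrase, LEXICON[phrase]))
                i += len(colloc)
                break
        else:
            # No collocation starts here: tag the single word.
            tagged.append((words[i], LEXICON.get(words[i], "UNKNOWN")))
            i += 1
    return tagged


if __name__ == "__main__":
    print(tag("the dog ate my homework"))
    print(tag("the dog ate my hot dog"))
```

Matching collocations before single words is what keeps “hot dog” from being tagged as an adjective followed by a noun.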

Chomsky’s Grammars Chomsky introduced transformational grammars (also called transformational generative grammars or generative grammars). He introduced the idea of “deep structures” which provide a syntactic base of language and consist of:

Chomsky’s Grammars
–a series of phrase-structure (rewrite) rules
–a series of (possibly universal) rules that generates the underlying phrase-structure of a sentence
–a series of transformations that act upon the phrase-structure, producing more complex sentences
–a series of morphophonemic rules controlling pronunciation.

Chomsky’s Lexicon The lexicon, which can be thought of as a dictionary of the language in a particular form, lists all of the vocabulary words in the language and associates them with their syntactic, semantic and phonological information. This information is represented in terms of “features”.

Chomsky’s Feature Terms For example, the entry for “cat” might have the following syntactic features: Cat: [+ Noun], [+ Count], [+ Common], [+ Animate] These features are used to fill “slots” in a set of phrase markers. For example, a phrase marker requiring an animate noun ([+ Animate]) would find “cat” eligible for lexical substitution into that slot, as it fulfils the requirement of being an animate noun.
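The slot-filling check described here amounts to a subset test on feature sets. The following is a minimal Python sketch, assuming features are stored as plain strings; the entry for “cat” follows the slide, while the entry for “rock” and the function name `eligible` are hypothetical additions for contrast.

```python
LEXICON = {
    # Features for "cat" as listed on the slide.
    "cat": {"Noun", "Count", "Common", "Animate"},
    # Hypothetical contrasting entry: a common count noun that is not animate.
    "rock": {"Noun", "Count", "Common"},
}


def eligible(word, required):
    """Return True if the word's lexical entry carries every required feature."""
    return word in LEXICON and required <= LEXICON[word]


if __name__ == "__main__":
    slot = {"Noun", "Animate"}  # a slot that needs an animate noun
    for word in LEXICON:
        print(word, eligible(word, slot))
```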

Syntactics vs Semantics One of the most controversial topics in the development of transformational grammar is the relationship between syntax and semantics. There is a considerable degree of interdependence between the two, and the problem is how to formalise this relationship.

Phrase Structure Grammars Phrase-structure rules are used to describe a given language's syntax by attempting to break language down into its constituent parts (also known as syntactic categories), namely phrasal categories and lexical categories (parts of speech). There are many kinds of phrase-structure rules, which themselves can be combined to generate additional phrase-structure rules.

Phrase Structure Grammars In particular, phrase-structure rules must account for the following characteristics:
1. All languages combine nouns (N) and verbs (V) to express ideas about the universe.
2. All languages have rules determining how these are combined into meaningful units.

Phrase Structure Grammars
3. All languages have recursion, i.e. at least one rule that can be repeated ad infinitum:
–An example of this is the English use of "and", which can link any series of two or more nouns or two or more verbs:
"His and hers and theirs and Mary's and John's... etc."
"He ran and jumped and played and skipped and danced and... etc."

Phrase Structure Grammar –This would be described in Transformational Grammar as: A noun phrase (NP) consists of a N or NP, the word ‘and’, and another N or NP. A verb phrase (VP) consists of a V or VP, the word ‘and’, and another V or VP.
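The recursive ‘and’ rule can be checked with a short recursive function. This is a minimal Python sketch, assuming a phrase is either a single word from a given word list or two smaller phrases joined by ‘and’; the word sets are taken from the examples on the previous slide, and the function name `is_coordinated` is hypothetical.

```python
# Words from the slide's examples, lower-cased.
NOUNS = {"his", "hers", "theirs", "mary's", "john's"}
VERBS = {"ran", "jumped", "played", "skipped", "danced"}


def is_coordinated(tokens, words):
    """Return True if tokens form X or 'X and X', where X is defined recursively."""
    if len(tokens) == 1:
        return tokens[0] in words
    # Try every 'and' as the split point between two smaller phrases.
    return any(
        tokens[i] == "and"
        and is_coordinated(tokens[:i], words)
        and is_coordinated(tokens[i + 1:], words)
        for i in range(1, len(tokens) - 1)
    )


if __name__ == "__main__":
    print(is_coordinated("his and hers and theirs".split(), NOUNS))
    print(is_coordinated("ran and jumped and played".split(), VERBS))
    print(is_coordinated("his and and hers".split(), NOUNS))
```

Because the rule refers to itself, the same few lines accept a chain of any length, which is the sense of “repeated ad infinitum”. Trying every split point is simple but slow for long chains; it is meant to mirror the rule, not to be an efficient parser.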

Phrase Structure Tree
Sentence
  Noun Phrase
    Determiner: A
    Noun: monkey
  Verb Phrase
    Verb: climbs
    Noun Phrase
      Determiner: the
      Noun: trees

Problems with Traditional Grammars They are grammar-based, whereas natural language isn’t strictly grammar-based. Most don’t take into account language variations and dialects. Humans have a built-in natural language processor that can handle things machine natural language processors cannot.

Yoda
“When 900 years old you reach, look as good you will not.”
“With you the force is.”
“A brave man your Father was.”
Yoda (typically) uses the Object-Subject-Verb (OSV) linguistic typology, which is characteristic of some of the Brazilian languages.
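The difference between word-order typologies can be made concrete by rearranging the same three constituents. The following is a minimal Python sketch, assuming a clause has already been split into subject (S), verb (V) and object (O); the function name `linearise` is hypothetical, and “a brave man” is strictly a complement of “was” rather than an object, so it is placed in the O slot only for illustration.

```python
def linearise(constituents, order):
    """Arrange subject (S), verb (V) and object (O) in the given order, e.g. 'OSV'."""
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError("order must use each of S, V and O exactly once")
    return " ".join(constituents[role] for role in order)


if __name__ == "__main__":
    clause = {"S": "your father", "V": "was", "O": "a brave man"}
    print(linearise(clause, "SVO"))  # usual English order
    print(linearise(clause, "OSV"))  # Yoda's order
```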

Inherent Complexity To understand a sentence you must do more than combine the dictionary meanings of its constituents. A large amount of human knowledge is assumed, and communication takes place between complex agents in complex environments.

Statistical approach Statistical Machine Translation