Machine Learning: Natural Language Processing
Jeff Howbert, Introduction to Machine Learning, Winter 2014


IBM's Watson computer
• Watson is an open-domain question-answering system.
• In 2011 Watson beat the two highest-ranked human players in a nationally televised two-game Jeopardy! match.

Modeling document similarity
• Vector space models
• Latent semantic indexing

Vector space models
• Vector of features for each document
  – Word frequencies (usually weighted)
  – Meta attributes, e.g. title, URL, PageRank
• Can use the vector as a document descriptor for classification tasks
• Can measure document relatedness by cosine similarity of vectors
  – Useful for clustering and ranking tasks
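Cosine similarity here is just the normalized dot product of two feature vectors. A minimal sketch (the toy vectors below are invented for illustration):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy document vectors (e.g. weighted word frequencies)
doc1 = [2.0, 1.0, 0.0, 3.0]
doc2 = [1.0, 0.0, 0.0, 2.0]
print(cosine_similarity(doc1, doc2))  # ~0.956 -> closely related
```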

Vector space models
• Term frequency-inverse document frequency (tf-idf)
  – Very widely used model
  – Each feature represents a single word (term)
  – Feature value is the product of:
    - tf = proportion of counts of that term relative to all terms in the document
    - idf = log( total number of documents / number of documents that contain the term )

Vector space models
• Example of tf-idf vectors for a collection of documents
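A minimal sketch of the tf-idf computation defined on the previous slide, run on a toy corpus (the documents are invented for illustration):

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "dogs and cats are pets".split(),
]

n_docs = len(docs)
df = Counter()            # document frequency: how many docs contain each term
for doc in docs:
    df.update(set(doc))

def tfidf(doc):
    counts = Counter(doc)
    total = len(doc)
    # Terms occurring in every document get idf = log(1) = 0
    return {t: (c / total) * math.log(n_docs / df[t])
            for t, c in counts.items()}

for vec in map(tfidf, docs):
    print(vec)
```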

Vector space models
• Vector space models cannot detect:
  – Synonymy: multiple words with the same meaning
  – Polysemy: a single word with multiple meanings (e.g. play, table)

Latent semantic indexing
• Aggregate document term vectors into matrix X
  – Rows represent terms
  – Columns represent documents
• Factorize X using singular value decomposition (SVD)
  – X = T · S · Dᵀ

Latent semantic indexing
• Factorize X using singular value decomposition (SVD)
  – X = T · S · Dᵀ
    - Columns of T are orthogonal and contain eigenvectors of XXᵀ
    - Columns of D are orthogonal and contain eigenvectors of XᵀX
    - S is diagonal and contains the singular values (analogous to eigenvalues)
    - Rows of T represent terms in a new orthogonal space
    - Rows of D represent documents in a new orthogonal space
  – In the new orthogonal spaces of T and D, the correlations originally present in X are captured in separate dimensions
    - Better exposes relationships among data items
    - Identifies the dimensions of greatest variation within the data

Latent semantic indexing
• Dimensionality reduction with SVD
  – Select the k largest singular values and the corresponding columns from T and D
  – Xₖ = Tₖ · Sₖ · Dₖᵀ is a reduced-rank reconstruction of the full X
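A minimal numpy sketch of the factorization and the rank-k truncation (the term-document matrix below is a toy example; note that numpy returns Dᵀ directly as `Vt`):

```python
import numpy as np

# Toy term-document matrix X: rows = terms, columns = documents
X = np.array([
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 2.0, 1.0],
    [1.0, 0.0, 2.0],
])

# Thin SVD: X = T @ diag(s) @ Dt, singular values sorted in decreasing order
T, s, Dt = np.linalg.svd(X, full_matrices=False)
print(np.allclose(X, T @ np.diag(s) @ Dt))   # True: exact reconstruction

# Rank-k reconstruction: keep only the k largest singular values
k = 2
Xk = T[:, :k] @ np.diag(s[:k]) @ Dt[:k, :]
print(np.round(Xk, 2))                       # best rank-2 approximation of X
```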

Latent semantic indexing
• Dimensionality reduction with SVD
  – Reduced-rank representations of terms and documents are referred to as "latent concepts"
  – Compare documents in the lower-dimensional space
    - classification, clustering, matching, ranking
  – Compare terms in the lower-dimensional space
    - synonymy, polysemy, other cross-term semantic relationships
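Continuing the sketch above, documents can be compared by cosine similarity of their coordinates in the k-dimensional latent space. Scaling the document coordinates by the singular values, as done here, is one common convention rather than the only option:

```python
import numpy as np

X = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [1.0, 0.0, 2.0]])
T, s, Dt = np.linalg.svd(X, full_matrices=False)

k = 2
# Each column of Dt[:k] is a document in the latent space; scale by S_k
docs_k = (np.diag(s[:k]) @ Dt[:k, :]).T      # shape: (n_docs, k)

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cos(docs_k[0], docs_k[1]))   # similarity of documents 0 and 1
```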

Latent semantic indexing
• Unsupervised
• May not learn a matching score that works well for a task of interest

Major tasks in NLP (Wikipedia)
• For most of the tasks on the following slides, there are:
  – A well-defined problem setting
  – A large volume of research
  – A standard metric for evaluating the task
  – Standard corpora on which to evaluate
  – Competitions devoted to the specific task

Machine translation
• Automatically translate text from one human language to another.
  – This is one of the most difficult problems, and is a member of a class of problems colloquially termed "AI-complete", i.e. requiring all of the different types of knowledge that humans possess (grammar, semantics, facts about the real world, etc.) in order to solve properly.

Parse tree (grammatical analysis)
• The grammar for natural languages is ambiguous, and typical sentences have multiple possible analyses. For a typical sentence there may be thousands of potential parses (most of which will seem completely nonsensical to a human).

Part-of-speech tagging
• Given a sentence, determine the part of speech for each word. Many words, especially common ones, can serve as multiple parts of speech. For example, "book" can be a noun or a verb, and "out" can be any of at least five different parts of speech.

Speech recognition
• Given a sound clip of a person or people speaking, determine the textual representation of the speech.
  – Another extremely difficult problem, also regarded as "AI-complete".
  – In natural speech there are hardly any pauses between successive words, and thus speech segmentation (separation into words) is a necessary subtask of speech recognition.
  – In most spoken languages, the sounds representing successive letters blend into each other in a process termed coarticulation, so the conversion of the analog signal to discrete characters can be a very difficult process.

Speech recognition
• Hidden Markov model (HMM) for phoneme extraction
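The original slide shows an HMM diagram. As a hedged illustration of the machinery behind it, here is a minimal Viterbi decoder that recovers the most likely hidden-state sequence; the states, symbols, and probabilities are invented toy values, not the slide's phoneme model:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for an observation sequence."""
    # V[t][s] = (best probability of reaching s at time t, predecessor state)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        V.append({s: max(
            (V[-1][prev][0] * trans_p[prev][s] * emit_p[s][o], prev)
            for prev in states) for s in states})
    # Backtrack from the best final state
    path = [max(states, key=lambda s: V[-1][s][0])]
    for row in reversed(V[1:]):
        path.append(row[path[-1]][1])
    return list(reversed(path))

# Toy 2-phoneme model over 2 acoustic symbols (all values invented)
states  = ["ph_a", "ph_b"]
start_p = {"ph_a": 0.6, "ph_b": 0.4}
trans_p = {"ph_a": {"ph_a": 0.7, "ph_b": 0.3},
           "ph_b": {"ph_a": 0.4, "ph_b": 0.6}}
emit_p  = {"ph_a": {"x": 0.9, "y": 0.1},
           "ph_b": {"x": 0.2, "y": 0.8}}
print(viterbi(["x", "y", "y"], states, start_p, trans_p, emit_p))
```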

Sentiment analysis
• Extract subjective information, usually from a set of documents like online reviews, to determine "polarity" about specific objects.
  – Especially useful for identifying trends of public opinion in social media, for the purpose of marketing.

Information extraction (IE)
• Concerned with the extraction of semantic information from text.
  – Named entity recognition
  – Coreference resolution
  – Relationship extraction
  – Word sense disambiguation
  – Automatic summarization
  – etc.

Named entity recognition
• Given a stream of text, determine which items in the text map to proper names, such as people or places, and what the type of each such name is (e.g. person, location, organization).
  – In English, capitalization can aid in recognizing named entities, but cannot aid in determining the type of named entity, and in any case is often insufficient. For example, the first word of a sentence is also capitalized, and named entities often span several words.
  – German capitalizes all nouns.
  – French and Spanish do not capitalize names that serve as adjectives.
  – Many languages (e.g. Chinese or Arabic) do not have capitalization at all.

Coreference resolution
• Given a chunk of text, determine which words ("mentions") refer to the same objects ("entities").
  – Anaphora resolution is a specific example of this task, concerned with matching up pronouns with the nouns or names that they refer to.
  – The more general task of coreference resolution also includes identifying so-called "bridging relationships" involving referring expressions.
    - For example, in a sentence such as "He entered John's house through the front door", "the front door" is a referring expression, and the bridging relationship to be identified is the fact that the door being referred to is the front door of John's house (rather than of some other structure that might also be referred to).

Relationship extraction
• Given a chunk of text, identify the relationships among named entities (e.g. who is the wife of whom).

Word sense disambiguation
• Many words have more than one meaning; we have to select the meaning that makes the most sense in context.
  – For this problem, we are typically given a list of words and associated word senses, e.g. from a dictionary or from an online resource such as WordNet.
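A classic baseline given such a sense inventory is the simplified Lesk algorithm: choose the sense whose dictionary gloss shares the most words with the target word's context. A minimal sketch, with invented glosses standing in for a real resource like WordNet:

```python
def lesk(context_words, senses):
    """Pick the sense whose gloss overlaps most with the context."""
    context = set(w.lower() for w in context_words)
    def overlap(gloss):
        return len(context & set(gloss.lower().split()))
    return max(senses, key=lambda sense: overlap(senses[sense]))

# Invented mini sense inventory for "bank" (a real system would use WordNet)
senses = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river":   "sloping land beside a body of water such as a river",
}
sentence = "he sat on the bank of the river and watched the water".split()
print(lesk(sentence, senses))   # -> bank/river
```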

Automatic summarization
• Produce a readable summary of a chunk of text. Often used to provide summaries of text of a known type, such as articles in the financial section of a newspaper.

Natural language generation
• Convert information from computer databases into readable human language.

Natural language understanding
• Convert chunks of text into more formal representations, such as first-order logic structures, that are easier for computer programs to manipulate.
  – Natural language understanding involves identifying the intended semantics from the multiple possible semantics that can be derived from a natural language expression, which usually takes the form of organized notations of natural language concepts. Introducing a language metamodel and an ontology is an efficient, though empirical, solution. An explicit formalization of natural language semantics, free of confusion with implicit assumptions such as the closed-world assumption (CWA) vs. the open-world assumption, or subjective yes/no vs. objective true/false, is needed as a basis for formalizing semantics.

Optical character recognition (OCR)
• Given an image representing printed text, determine the corresponding text.

Word segmentation
• Separate a chunk of continuous text into separate words.
  – For English this is fairly trivial, since words are usually separated by spaces. However, some written languages like Chinese, Japanese, and Thai do not mark word boundaries in such a fashion, and in those languages text segmentation is a significant task requiring knowledge of the vocabulary and morphology of words in the language.
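A classic dictionary-based baseline for such languages is greedy maximum matching: repeatedly take the longest dictionary word that prefixes the remaining text. A minimal sketch on English with the spaces removed; the toy lexicon is invented, and a real system needs a full vocabulary plus handling for out-of-vocabulary material:

```python
def max_match(text, lexicon, max_len=6):
    """Greedy longest-prefix-first segmentation against a word list."""
    words = []
    i = 0
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            # Take the longest known word; fall back to a single character
            if text[i:j] in lexicon or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

lexicon = {"the", "cat", "sat", "on", "mat"}
print(max_match("thecatsatonthemat", lexicon))
# -> ['the', 'cat', 'sat', 'on', 'the', 'mat']
```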

Speech processing
• Speech recognition
• Text-to-speech

Automated essay scoring