
1 Latent Semantic Mapping: Dimensionality Reduction via Globally Optimal Continuous Parameter Modeling Jerome R. Bellegarda

2 Outline
– Introduction
– LSM
– Applications
– Conclusions

3 Introduction
LSA in IR:
– Words of queries and documents
– Recall and precision
Assumption: there is some underlying latent semantic structure in the data
– Latent structure is conveyed by correlation patterns
– Documents: bag-of-words model
LSA improves separability among different topics

4 Introduction

5 Success of LSA:
– Word clustering
– Document clustering
– Language modeling
– Automated call routing
– Semantic inference for spoken interface control
These solutions all leverage LSA's ability to expose global relationships in context and meaning

6 Introduction
Three unique factors of LSA:
– The mapping of discrete entities
– The dimensionality reduction
– The intrinsically global outlook
The terminology is changed to latent semantic mapping (LSM) to convey increased reliance on these general properties

7 Latent Semantic Mapping
LSM defines a mapping between the discrete sets M, N and a continuous vector space L:
– M: an inventory of M individual units, such as words
– N: a collection of N meaningful compositions of units, such as documents
– L: a continuous vector space
– r_i: a unit in M
– c_j: a composition in N

8 Feature Extraction
Construction of a matrix W of co-occurrences between units and compositions.
The (i, j) cell of W weights the count of unit r_i in composition c_j by the indexing power of r_i:
w_ij = (1 - ε_i) c_ij / n_j
where c_ij is the number of times r_i occurs in c_j, n_j is the total number of units present in c_j, and ε_i is the normalized entropy of r_i in the collection.

9 Feature Extraction
The normalized entropy of r_i:
ε_i = -(1 / log N) Σ_j (c_ij / t_i) log (c_ij / t_i), where t_i = Σ_j c_ij is the total count of r_i in the collection
A value of ε_i close to 0 means that the unit is present only in a few specific compositions. The global weight 1 - ε_i is therefore a measure of the indexing power of the unit r_i.
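As a concrete illustration of this weighting, the matrix W can be assembled with NumPy as sketched below; the toy corpus, tokenization, and variable names are assumptions made for the sketch, not material from the presentation.

```python
import numpy as np

# Toy collection: three "compositions" (documents) over a small vocabulary.
docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "stocks fell sharply on monday"]
compositions = [d.split() for d in docs]
vocab = sorted({w for c in compositions for w in c})
M, N = len(vocab), len(compositions)
idx = {w: i for i, w in enumerate(vocab)}

# Raw counts c_ij: occurrences of unit r_i in composition c_j.
C = np.zeros((M, N))
for j, comp in enumerate(compositions):
    for w in comp:
        C[idx[w], j] += 1

# Normalized entropy eps_i of each unit across the collection.
t = C.sum(axis=1, keepdims=True)     # total count t_i of unit r_i (> 0 by construction)
p = C / t
plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
eps = -plogp.sum(axis=1) / np.log(N)

# Entropy-weighted, length-normalized cells: w_ij = (1 - eps_i) * c_ij / n_j.
n = C.sum(axis=0, keepdims=True)     # number of units n_j in composition c_j
W = (1.0 - eps)[:, None] * C / n
```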

10 Singular Value Decomposition
The MxN unit-composition matrix W defines two vector representations for the units and the compositions:
– r_i: a row vector of dimension N
– c_j: a column vector of dimension M
This direct representation is impractical:
– M, N can be extremely large
– the vectors r_i, c_j are typically sparse
– the two spaces are distinct from each other

11 Singular Value Decomposition
Employ the truncated SVD: W ≈ Ŵ = U S V^T, where
– U: MxR left singular matrix with row vectors u_i
– S: RxR diagonal matrix of singular values
– V: NxR right singular matrix with row vectors v_j
– U, V are column-orthonormal: U^T U = V^T V = I_R
– R < min(M, N)
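A minimal sketch of this decomposition, continuing from the toy matrix W built earlier (R = 2 is an arbitrary choice for such small data):

```python
import numpy as np

R = 2                                  # illustrative latent dimension
U_full, s_full, Vt_full = np.linalg.svd(W, full_matrices=False)
U = U_full[:, :R]                      # MxR, rows u_i represent the units
S = np.diag(s_full[:R])                # RxR diagonal matrix of singular values
V = Vt_full[:R, :].T                   # NxR, rows v_j represent the compositions

W_hat = U @ S @ V.T                    # rank-R approximation of W
```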

12 Singular Value Decomposition

13 Singular Value Decomposition
The rank-R approximation Ŵ captures the major structural associations in W and ignores higher-order effects.
The closeness of vectors in L supports three kinds of comparison:
– Unit-unit comparison
– Composition-composition comparison
– Unit-composition comparison

14 Closeness Measure
W W^T: characterizes co-occurrences between units
W^T W: characterizes co-occurrences between compositions
r_i, r_j: units which have a similar pattern of occurrence across the compositions
c_i, c_j: compositions which have a similar pattern of occurrence across the units

15 Closeness Measure
Unit-unit comparisons:
Cosine measure: K(r_i, r_j) = cos(u_i S, u_j S) = u_i S^2 u_j^T / (||u_i S|| ||u_j S||)
Distance: the associated angle arccos K(r_i, r_j), in [0, π]

16 Unit-Unit Comparisons

17 Closeness Measure
Composition-composition comparisons:
Cosine measure: K(c_i, c_j) = cos(v_i S, v_j S) = v_i S^2 v_j^T / (||v_i S|| ||v_j S||)
Distance: the associated angle arccos K(c_i, c_j), in [0, π]

18 Closeness Measure
Unit-composition comparisons:
Cosine measure: K(r_i, c_j) = cos(u_i S^(1/2), v_j S^(1/2)) = u_i S v_j^T / (||u_i S^(1/2)|| ||v_j S^(1/2)||)
Distance: the associated angle arccos K(r_i, c_j), in [0, π]
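The three comparisons can be written directly in terms of the U, S, V factors from the sketch above; the helper names below are illustrative, not from the presentation.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

S_half = np.sqrt(S)                    # S^(1/2), S being diagonal

def unit_unit(i, j):
    # cos(u_i S, u_j S): units with similar occurrence patterns score high.
    return cosine(U[i] @ S, U[j] @ S)

def comp_comp(i, j):
    # cos(v_i S, v_j S): compositions with similar occurrence patterns score high.
    return cosine(V[i] @ S, V[j] @ S)

def unit_comp(i, j):
    # cos(u_i S^(1/2), v_j S^(1/2)): unit-composition comparison.
    return cosine(U[i] @ S_half, V[j] @ S_half)

def distance(cos_val):
    # Angular distance in [0, pi] associated with any of the cosines above.
    return float(np.arccos(np.clip(cos_val, -1.0, 1.0)))
```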

19 LSM Framework Extension
Observe a new composition c̃_p, p > N; the tilde reflects the fact that the composition was not part of the original collection.
c̃_p, a column vector of dimension M, can be thought of as an additional column of the matrix W.
Assuming U and S do not change: c̃_p = U S ṽ_p^T

20 LSM Framework Extension
c̃_p: pseudo-composition
ṽ_p = c̃_p^T U S^(-1): pseudo-composition vector
If the addition of c̃_p causes the major structural associations in W to shift in some substantial manner, the singular vectors will become inadequate.
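A sketch of this fold-in step, reusing idx, eps, U and S from the earlier sketches; the function name and the exact weighting details are assumptions made for illustration.

```python
import numpy as np

def fold_in(new_text, vocab_index, eps, U, S):
    """Map an unseen composition to its pseudo-composition vector
    v_p = c_p^T U S^(-1), reusing the existing U and S."""
    c_p = np.zeros(U.shape[0])
    tokens = new_text.split()
    for w in tokens:
        i = vocab_index.get(w)
        if i is not None:
            c_p[i] += 1
    # Same entropy weighting and length normalization as for the columns of W.
    c_p = (1.0 - eps) * c_p / max(len(tokens), 1)
    return c_p @ U @ np.linalg.inv(S)

v_p = fold_in("the cat sat on the log", idx, eps, U, S)
```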

21 LSM Framework Extension
In that case it would be necessary to re-compute the SVD to find a proper representation for c̃_p.

22 Salient Characteristics of LSM
– A single vector embedding for both units and compositions in the same continuous vector space L
– A relatively low dimensionality, which makes operations such as clustering meaningful and practical
– An underlying structure reflecting globally meaningful relationships, with natural similarity metrics to measure the distance between units, between compositions, or between units and compositions in L

23 Applications
– Semantic classification
– Multi-span language modeling
– Junk filtering
– Pronunciation modeling
– TTS unit selection

24 Semantic Classification
Semantic classification refers to determining which one of a set of predefined topics a given document is most closely aligned with.
The centroid of each cluster can be viewed as the semantic representation of that topic in the LSM space:
– Semantic anchor
A newly observed word sequence is classified by computing the distance between its representation and each semantic anchor, and picking the minimum.
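A sketch of nearest-anchor classification over the toy space built earlier (V and S from the SVD sketch, v_p from the fold-in sketch); the topic labels and helper names are invented for illustration.

```python
import numpy as np

def build_anchors(V, labels):
    """Semantic anchor = centroid of the composition vectors v_j of each topic."""
    anchors = {}
    for topic in set(labels):
        rows = [V[j] for j, t in enumerate(labels) if t == topic]
        anchors[topic] = np.mean(rows, axis=0)
    return anchors

def classify(v_p, anchors, S):
    """Assign the topic whose anchor is closest under the composition cosine."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(anchors, key=lambda topic: cos(v_p @ S, anchors[topic] @ S))

labels = ["pets", "pets", "finance"]   # one (invented) label per training composition
anchors = build_anchors(V, labels)
print(classify(v_p, anchors, S))
```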

25 Semantic Classification
Domain knowledge is automatically encapsulated in the LSM space in a data-driven fashion.
For desktop interface control:
– Semantic inference

26 Semantic Inference

27 Multi-Span Language Modeling
In a standard n-gram, the history is the string w_{q-1} w_{q-2} ... w_{q-n+1}.
In LSM language modeling, the history is the current document up to word w_{q-1}.
Pseudo-document d̃_{q-1}:
– Continually updated as q increases

28 Multi-Span Language Modeling
An integrated n-gram + LSM formulation for the overall language model probability:
P(w_q | H_{q-1}) = P(w_q | w_{q-1} ... w_{q-n+1}) P(d̃_{q-1} | w_q) / Σ_{w_i} P(w_i | w_{q-1} ... w_{q-n+1}) P(d̃_{q-1} | w_i)
– Different syntactic constructs can be used to carry the same meaning (content words)

29 Multi-Span Language Modeling
Assume that the probability of the document history given the current word is not affected by the immediate context preceding it.
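Under that assumption, the integrated probability can be computed as sketched below; the two component models are passed in as callables because the presentation does not specify their implementation.

```python
def integrated_prob(word, ngram_history, doc_vec, vocab,
                    p_ngram, p_lsm_doc_given_word):
    """P(w_q | H_{q-1}) is proportional to P_ngram(w_q | h) * P_lsm(d | w_q),
    normalized over the vocabulary; both components are assumed callables."""
    num = p_ngram(word, ngram_history) * p_lsm_doc_given_word(doc_vec, word)
    den = sum(p_ngram(w, ngram_history) * p_lsm_doc_given_word(doc_vec, w)
              for w in vocab)
    return num / den if den > 0 else 0.0
```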

30 Multi-Span Language Modeling

31 Junk Filtering
Junk filtering can be viewed as a degenerate case of semantic classification with two categories:
– Legitimate
– Junk
M: an inventory of words and symbols
N: a collection of messages from the two categories
Two semantic anchors, one per category

32 Pronunciation Modeling
Also called grapheme-to-phoneme conversion (GPC).
Orthographic anchors:
– One for each in-vocabulary word
Orthographic neighborhood:
– The in-vocabulary words with high closeness to an out-of-vocabulary word
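A simplified, self-contained sketch of orthographic anchors and neighborhoods: letter trigrams play the role of units, in-vocabulary words play the role of compositions, and an out-of-vocabulary word is folded into the space. The choice of trigrams, the toy word list, and all names are assumptions for illustration.

```python
import numpy as np

def trigrams(word):
    padded = f"#{word}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

words = ["nation", "station", "ration", "motion"]   # toy in-vocabulary words
grams = sorted({g for w in words for g in trigrams(w)})
gidx = {g: i for i, g in enumerate(grams)}

# Trigram-by-word count matrix (entropy weighting omitted for brevity).
W_o = np.zeros((len(grams), len(words)))
for j, w in enumerate(words):
    for g in trigrams(w):
        W_o[gidx[g], j] += 1

U_o, s_o, Vt_o = np.linalg.svd(W_o, full_matrices=False)
R = 3
U_o, S_o, V_o = U_o[:, :R], np.diag(s_o[:R]), Vt_o[:R, :].T   # rows of V_o = orthographic anchors

def neighborhood(oov_word, k=3):
    """Fold the out-of-vocabulary word into the space and return the k closest anchors."""
    c = np.zeros(len(grams))
    for g in trigrams(oov_word):
        if g in gidx:
            c[gidx[g]] += 1
    v = c @ U_o @ np.linalg.inv(S_o)
    sims = [float((v @ S_o) @ (a @ S_o)) /
            (np.linalg.norm(v @ S_o) * np.linalg.norm(a @ S_o)) for a in V_o]
    return [words[i] for i in np.argsort(sims)[::-1][:k]]

print(neighborhood("notion"))
```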

33 Pronunciation Modeling

34 Conclusions
Descriptive power
– Forgoing local constraints is not acceptable in some situations
Domain sensitivity
– Depends on the quality of the training data
– Polysemy
Updating the LSM space
– Re-computing the SVD on the fly is not practical
The success of LSM rests on its three salient characteristics.