Semantic Processing with Context Analysis


Semantic Processing with Context Analysis
Qiaozhu Mei, 2006.10.4

Motivating Examples
"Find the gene names similar to foraging in fly literature."
"Find the entities related to 'egg laying' about honeybees."
Each query names target units (gene names, entities), a source unit (foraging, "egg laying"), a semantic relation (similar to, related to), and context conditions (fly literature, honeybees); answering it is a ranking problem.

Semantic Processing
"Find the gene names similar to foraging in fly literature."
"Find the sentences similar to aspect 1 about gene A."
"Find the themes related to 'nursing' in bee literature."
"Find the terms related to 'queen' in document A."
...
These problems are similar in form: can we model them in a general way?

What We Can Do with Semantic Processing
The user selects the source unit type and the target unit type, and specifies the source unit and the context conditions.
End-user applications:
- Retrieval
- Categorization / taxonomy / encoding
- Annotation
- Summarization
- Synonym extraction

Context Analysis
"You shall know a word by the company it keeps." (Firth, 1957)
"We report the molecular characterization of the spineless (ss) gene of Drosophila, and present evidence that it plays a central role in defining the distal regions of both the antenna and leg. ss encodes the closest known homolog of the mammalian dioxin receptor, a transcription factor of the bHLH-PAS family." (biosis:199800271084)
"Originally reported by Bridges in 1914, spineless plays a central role in defining the distal regions of both the antenna and leg. spineless encodes the closest known homolog of the mammalian dioxin receptor, a transcription factor of the bHLH-PAS family." (http://www.sdbonline.org/fly/dbzhnsky/spinles1.htm)
Units that are replaceable across such contexts (spineless and ss) are semantically similar; units that co-occur within a context (spineless and antenna, leg) are semantically related.

Two Basic Types of Semantic Relations
- Semantically related: co-occurring in the same context, e.g., spineless → antenna, leg
- Semantically similar: replaceable in the same context, e.g., spineless → ss
Other types of semantic relations lie in between.

Dimensions of the Semantic Processing
Units (source and target): words, phrases, entities, concepts, groups of words (clusters), sentences, themes, documents, etc.

Dimensions of the Semantic Processing (II)
Contexts:
- Natural contexts: sentences, documents, etc.
- Local contexts: 2-grams, n-grams, windows of size k
- Conditional contexts: e.g., context conditional on "Drosophila"
(Illustrated on the spineless passage above: local contexts with w = 3 and w = 5, and the natural sentence context.)
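To make the local-context option concrete, here is a minimal Python sketch of window-based context selection; the tokenizer, the window size, and the abbreviated example sentence are illustrative assumptions, not part of the original slides.

```python
def local_context(tokens, target, window=3):
    """Collect the tokens within +/- `window` positions of each occurrence of `target`."""
    context = []
    for i, tok in enumerate(tokens):
        if tok == target:
            context.extend(tokens[max(0, i - window):i])   # left window
            context.extend(tokens[i + 1:i + 1 + window])   # right window
    return context

sentence = ("we report the molecular characterization of the "
            "spineless ss gene of Drosophila").split()
print(local_context(sentence, "spineless", window=3))
# ['characterization', 'of', 'the', 'ss', 'gene', 'of']
```

A natural (sentence-level) context would instead collect every other token in the sentence, and a conditional context would additionally require, say, "Drosophila" to occur in the same document.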

Dimensions of the Semantic Processing (III)
Context representation: sets, vectors, graphs, language models, etc.

Dimensions of the Semantic Processing (IV)
Context unit weighting:
- Frequency based: co-occurrence counts, tf-idf, conditional probability, etc.
- Mutual-information based: mutual information, pointwise mutual information, conditional mutual information, etc.
- Hypothesis-testing based: chi-square test, t-test, etc.
These weights can also be used directly to measure semantic relevance.
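For example, pointwise mutual information can be estimated from sentence-level co-occurrence counts. A minimal sketch, with a small invented corpus of word sets standing in for sentences:

```python
import math

def pmi(sentences, x, y):
    """PMI(x, y) = log [ p(x, y) / (p(x) p(y)) ], estimated over sentence contexts."""
    n = len(sentences)
    px = sum(1 for s in sentences if x in s) / n
    py = sum(1 for s in sentences if y in s) / n
    pxy = sum(1 for s in sentences if x in s and y in s) / n
    return math.log(pxy / (px * py)) if pxy > 0 else float("-inf")

corpus = [{"spineless", "antenna", "leg"},
          {"spineless", "antenna", "receptor"},
          {"dioxin", "receptor"},
          {"queen", "egg", "laying"}]
print(pmi(corpus, "spineless", "antenna"))   # log 2: the two words always co-occur here
```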

Dimensions of the Semantic Processing (Cont.)
- Semantic relations: semantically related; semantically similar
- Similarity functions: Jaccard, cosine, Pearson coefficient, KL divergence, etc. (a short sketch of three of these follows)
- Semantic problems: ranking, classification, clustering
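The similarity functions apply to different context representations. Below is a hedged sketch of three of them (Jaccard over set contexts, cosine over sparse weight vectors, and KL divergence over language models); the inputs are made up for illustration.

```python
import math

def jaccard(a, b):
    """Set representation: overlap of two context sets."""
    return len(a & b) / len(a | b)

def cosine(u, v):
    """Vector representation: u and v map context units to weights."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def kl_divergence(p, q, eps=1e-9):
    """Language-model representation: p and q map words to probabilities."""
    return sum(pw * math.log(pw / q.get(w, eps)) for w, pw in p.items() if pw > 0)

print(jaccard({"antenna", "leg"}, {"antenna", "wing"}))                        # 0.33...
print(cosine({"antenna": 2, "leg": 1}, {"antenna": 1, "wing": 1}))             # ~0.63
print(kl_divergence({"antenna": 0.7, "leg": 0.3}, {"antenna": 0.5, "wing": 0.5}))
```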

The Space of Semantic Processing
A specific method is a path through the following dimensions:
- Units: words, phrases, entities, concepts, themes, sentences, ...
- Context selection: local contexts (w = 2, w = 4, ...), natural contexts (sentence, documents), conditional contexts, tight contexts
- Context representation: set, vector, graph, language model
- Context weighting: co-occurrence, tf-idf, mutual information, conditional MI, chi-square, t-score (frequency based: semantic relevance only; information based: semantic similarity)
- Context similarity: N/A, Jaccard (sets), cosine, Pearson, KL divergence, others

A General Procedure for Semantic Processing Problems
- Input: what are the source units, and what are the target units?
- Formalization: a ranking problem, a classification problem, or a clustering problem?
- Solution (for a ranking problem, for example): select an appropriate context; select a representation of the context; select a weighting for the units in the context; select a similarity measure to compare contexts; rank based on the weight or similarity. (A minimal code sketch of this pipeline follows.)
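A schematic sketch of the ranking case, where every helper is a hypothetical placeholder for one choice along the dimensions above; the co-occurrence context vector plus the cosine function from the earlier sketch can serve as a first instantiation.

```python
from collections import Counter

def context_vector(unit, sentences):
    """One possible instantiation: sentence-level co-occurrence counts for `unit`."""
    vec = Counter()
    for s in sentences:
        if unit in s:
            vec.update(w for w in s if w != unit)
    return vec

def rank_targets(source, candidates, sentences, similarity):
    """Generic ranking step: compare the source context against each candidate context."""
    src = context_vector(source, sentences)
    scored = [(c, similarity(src, context_vector(c, sentences))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# e.g. rank_targets("spineless", ["ss", "queen"], sentences, cosine)
# with the cosine function from the sketch above.
```

Classification and clustering versions would replace the final sort with a classifier over, or a clustering of, the same context representations.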

Example: Query Expansion
- Problem: ranking
- Source units: term; target units: terms
- Context: document
- Representation: vector
- Context unit weight: tf-idf, MI
- Context similarity: N/A (candidates are ranked directly by their weight in the source term's context)
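A toy sketch of this configuration, assuming an invented mini-corpus: the query term's context is the set of documents containing it, and candidate expansion terms are ranked by their tf-idf weight within that context, with no context-to-context similarity.

```python
import math
from collections import Counter

def expand(query_term, documents, k=3):
    n = len(documents)
    df = Counter()
    for d in documents:
        df.update(set(d))                       # document frequency of every term
    context = [d for d in documents if query_term in d]
    tf = Counter(w for d in context for w in d if w != query_term)
    scores = {t: f * math.log(n / df[t]) for t, f in tf.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

docs = [["gene", "expression", "fly", "antenna"],
        ["gene", "fly", "foraging", "behavior"],
        ["honeybee", "foraging", "behavior", "dance"]]
print(expand("foraging", docs))   # top-3 expansion candidates for "foraging"
```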

Example: N-gram Clustering
- Problem: clustering
- Source units: term; target units: group of terms
- Context: local, w = 3
- Representation: vector
- Context unit weight: MI
- Context similarity: drop in total MI if merged

Example: Synonym Extraction (Mei et al. 06)
- Problem: ranking
- Source units: entity; target units: entities
- Context: natural, sentences
- Representation: vector
- Context unit weight: co-occurrence, MI
- Context similarity: cosine
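A minimal sketch of this configuration with invented sentences: each entity is represented by a co-occurrence vector over its sentence contexts, and candidates are ranked by cosine similarity to the source entity (the helpers mirror the earlier sketches).

```python
import math
from collections import Counter

def context_vector(unit, sentences):
    vec = Counter()
    for s in sentences:
        if unit in s:
            vec.update(w for w in s if w != unit)
    return vec

def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

sentences = [["spineless", "defines", "antenna", "leg"],
             ["ss", "defines", "antenna", "leg"],
             ["ss", "encodes", "dioxin", "receptor"],
             ["queen", "lays", "egg"]]

source = context_vector("spineless", sentences)
candidates = ["ss", "queen", "dioxin"]
print(sorted(candidates,
             key=lambda c: cosine(source, context_vector(c, sentences)),
             reverse=True))
# "ss" ranks first: it is the most replaceable in context, i.e., semantically similar
```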

Example: Concept Generation (Brad's)
- Problem: clustering
- Source units: group of words
- Context: documents
- Representation: graph
- Context unit weight: MI
- Context similarity: utility function

Example: Theme Labeling
- Source units: themes; target units: phrases
- Two steps: phrase generation → label selection

Theme Labeling: Phrase Generation
- Problem: ranking
- Source units: words; target units: phrases
- Context: 2-grams (adjacent pairs)
- Context unit weight: MI, t-score, etc.
- Context similarity: N/A
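As one hedged instantiation of this step, adjacent word pairs can be scored with a t-score computed from 2-gram and unigram counts; the toy token stream below is invented.

```python
import math
from collections import Counter

tokens = ("gene expression data gene expression analysis "
          "protein gene expression data analysis").split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
n_bigrams = len(tokens) - 1

def t_score(w1, w2):
    """t = (observed - expected) / sqrt(observed / N), with probabilities over bigrams."""
    observed = bigrams[(w1, w2)] / n_bigrams
    expected = (unigrams[w1] / len(tokens)) * (unigrams[w2] / len(tokens))
    return (observed - expected) / math.sqrt(observed / n_bigrams)

ranked = sorted(bigrams, key=lambda b: t_score(*b), reverse=True)
print(ranked[:3])   # ("gene", "expression") scores highest: a candidate phrase
```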

Theme Labeling: Label Selection
- Problem: ranking
- Source units: themes; target units: phrases
- Context: sentences/documents
- Representation: language models
- Context unit weight: PMI
- Context similarity: KL divergence
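A hedged sketch of the label-selection step: the theme is a unigram language model, each candidate phrase gets a language model estimated from its contexts, and candidates are ranked by how little the theme model diverges from them. The distributions below are made up for illustration.

```python
import math

def kl(p, q, eps=1e-9):
    """KL(p || q) for unigram language models given as word -> probability dicts."""
    return sum(pw * math.log(pw / q.get(w, eps)) for w, pw in p.items() if pw > 0)

theme = {"gene": 0.3, "expression": 0.3, "regulation": 0.2, "fly": 0.2}
label_models = {
    "gene expression": {"gene": 0.35, "expression": 0.35, "regulation": 0.1, "fly": 0.2},
    "egg laying":      {"egg": 0.5, "laying": 0.3, "queen": 0.2},
}

ranking = sorted(label_models, key=lambda label: kl(theme, label_models[label]))
print(ranking)   # "gene expression" diverges least from the theme model, so it ranks first
```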

A General Procedure for Semantic Processing Problems
The full pipeline: source units, target units, and the problem type (ranking? classification? clustering?) feed into context selection (local, natural, or conditional context), then context representation (sets, vectors, graphs, language models), then context weighting (co-occurrence, tf-idf, MI, PMI, chi-square, t-score, others), and finally context similarity (N/A, Jaccard, cosine, Pearson, KL, others).

What's Remaining?
Problem:
- There are too many possible paths from the source units to the target units; given a real problem, each path is a possible solution.
- Which path in the space is better for a specific problem? Few real problems can be satisfied by a single path.
A possible solution:
- Model each path as a feature.
- Use training/validation data and learning algorithms to choose good features or combinations of features.

Example: Phrase Generation
Many different choices along these dimensions (context selection, representation, weighting, similarity) fit the phrase-generation problem. How do we select among them, or combine them?

Example: Synonym Extraction
- Most features can be computed offline.
- Learning algorithms: e.g., a ranking SVM that learns feature weights (e.g., 0.5, 0.1, 0.3, 0.0).
- Training data: e.g., ranked pairs.
- For each specific type of problem, this training is done once.
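A simplified sketch of the combination step: each path through the space contributes one feature score per candidate, and a linear model combines them. The feature names and weights below are placeholders standing in for what a ranking learner (e.g., a ranking SVM trained on ranked pairs) would produce; this is not the actual trained model.

```python
def combined_score(features, weights):
    """Linear combination of per-path feature scores (dicts keyed by feature name)."""
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical offline-computed feature scores for two candidate synonyms of "spineless".
candidates = {
    "ss":    {"cooccur_cosine": 0.71, "tfidf_cosine": 0.55, "pmi": 0.30, "jaccard": 0.10},
    "queen": {"cooccur_cosine": 0.05, "tfidf_cosine": 0.02, "pmi": 0.01, "jaccard": 0.00},
}
weights = {"cooccur_cosine": 0.5, "tfidf_cosine": 0.1, "pmi": 0.3, "jaccard": 0.0}

ranking = sorted(candidates,
                 key=lambda c: combined_score(candidates[c], weights),
                 reverse=True)
print(ranking)   # ['ss', 'queen']
```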

Summary
- The user selects: the type of source units, the type of target units, and the context conditions.
- The system selects: the context type, the context representation, the context unit weighting, the context similarity, and the weighting/combination of the different features.
- Many features can be computed offline; for a specific user selection, the best combination of features is learned with learning algorithms and training data.