Query Expansion Presented By: Usha M.Tech(IT) MIT-876-2k11.

Personalized Web Search
- Personalized web search means that search engine results differ from user to user according to each user's interests.
- Two people may issue the same query with different information needs behind it.
- The focus is therefore on whether an individual user would be satisfied with a document, rather than on whether the document is topically relevant to the query.

What is Query Expansion?
- Query expansion helps the user formulate a better query by appending additional keywords to the initial search request.
- The added keywords are terms that are related in some way to the query words.
- The goal is to improve precision and/or recall.
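To make the stated goal concrete, here is a minimal sketch of how precision and recall are computed for one result list. The document ids and the helper name precision_recall are hypothetical and chosen only for illustration; they do not come from the presentation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved ids against the relevant ids."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # precision: fraction of retrieved documents that are relevant
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # recall: fraction of relevant documents that were retrieved
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical ids: what the user actually wants, and two result lists.
relevant = {"d1", "d2", "d3", "d4"}
results_original_query = ["d1", "d7", "d8"]
results_expanded_query = ["d1", "d2", "d3"]

print(precision_recall(results_original_query, relevant))
print(precision_recall(results_expanded_query, relevant))
```

In this toy case the expanded query retrieves more of the relevant documents, so both measures go up; in practice an expansion may improve one measure at the cost of the other.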

Example
Suppose you enter the query “Canon Book” on a search engine. Top 3 results:
- The Canon: A Whirligig Tour of the Beautiful Basics of Science (Hardcover) @ Amazon
- Western Canon @ Wikipedia
- Biblical Canon @ Wikipedia

Result after Query Expansion
Expanded query: “canon book bible”. Top 3 results:
- Biblical Canon @ Wikipedia
- Books of the Bible @ Wikipedia
- The Canon of the Bible @ catholicapologetics.org
The expanded query yields results that are much closer to the user’s preferences.

Social Bookmarking
- Social bookmarking allows users to save the URLs they have visited and to share them with others.
- Social tagging is the process by which many users add metadata, in the form of keywords, to shared content.
- Well-known social bookmarking services include del.icio.us and StumbleUpon, which guide users towards valuable content.

Modules of System
- Query Expansion
- Search
- Persistence
- User Model
  o Tag Finder
  o Parser

User Model Creation & Update
1 begin
2 // co-occurrence global matrix initialization, represented by a map of maps
3 M <- Map([])
4 // training documents analysis
5 for (doc, query) in D do {
6 // term occurrence map initialization (stemming and stopword removal)
7 doc = parse(doc)
8 // term frequency calculation
9 terms <- Map([])
10 // term frequency calculation for every term in the document
11 terms = frequency_occurrences(doc)
12 // co-occurrence matrix initialization
13 co_occ <- Map([])
14 // co-occurrence document matrix initialization
15 co_occ = co_occurrences(terms)
16 // get the list of social bookmarking sites for tag search
17 sites = get_social_bookmarking_sites()

User Model Creation & Update (cont)
18 // initialization of the URL tag list
19 tags <- Set([])
20 // retrieve tags by URL
21 for i = 0; i < sites.size() & tags.size() = 0; i++ do {
22 tags = retrieve_tags(url, sites[i]) }
23 // update intermediate matrix M
24 update(M, tags, terms) }
25 // initialization of all terms in documents
26 all_terms <- Set([])
27 // get unique terms set
28 all_terms = get_term_set(M)
29 // get the subset of user model
30 user_matrix <- get_user_matrix(all_terms)
31 // update user model by the intermediate matrix
32 update(user_matrix, M, all_terms)
33 // saving the updated user model
34 save(user_matrix)
35 end
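The per-document part of the pseudocode (parsing, term frequencies, co-occurrences) can be sketched in Python as follows. This is an illustration under assumptions, not the presented system: the stopword list is a tiny made-up one, stemming is omitted, and the co-occurrence weight (the product of the two term frequencies) is an assumed choice, since the slides do not say how the weight is computed. The function names mirror those in the pseudocode.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Assumed, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is"}


def parse(text):
    """Lowercase, tokenize, and drop stopwords (stemming omitted in this sketch)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def frequency_occurrences(tokens):
    """Map each term to the number of times it occurs in the document."""
    return Counter(tokens)


def co_occurrences(terms):
    """Symmetric co-occurrence map for one document.

    Assumption: the weight of a pair is the product of the two term frequencies.
    """
    co_occ = defaultdict(dict)
    for a, b in combinations(sorted(terms), 2):
        weight = terms[a] * terms[b]
        co_occ[a][b] = weight
        co_occ[b][a] = weight
    return co_occ


doc = "The canon of the Bible: books of the Bible"
terms = frequency_occurrences(parse(doc))
co_occ = co_occurrences(terms)
print(dict(terms))
print({term: row for term, row in co_occ.items()})
```

Retrieving tags from bookmarking sites is left out here because it depends on external services whose interfaces are not described in the slides.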

Algorithm Explanation
- A temporary map M is initialized to hold the extracted data. (lines 2-3)
- For each visited URL, the corresponding HTML page is fetched; a parser extracts its textual content, and the terms are then filtered and stemmed. The system also records the relation between stemmed terms and original terms. (lines 6-7)
- The co-occurrence matrix for the most relevant keywords is computed. (lines 8-15)
- Tags for the visited URL are obtained by querying different social bookmarking sites. (lines 16-22)

Explanation (cont)
- The temporary map M is updated with the tags and the co-occurrence matrix. (lines 23-24)
- The set of all terms encountered while updating the temporary map M is computed. (lines 25-28)
- The three-dimensional co-occurrence matrix for the terms in this set is then retrieved, and the user model is updated and saved. (lines 29-34)

Example of Three-Dimensional correlation Matrix used for User Model
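One way to picture the three-dimensional matrix is as a nested map indexed by tag, then term, then term. The sketch below assumes that layout and assumes that an update simply adds the document's co-occurrence weights into the slice of every tag attached to the URL; the names new_user_model and update, and the sample tags and weights, are illustrative.

```python
from collections import defaultdict


def new_user_model():
    """tag -> term -> term -> accumulated co-occurrence weight."""
    return defaultdict(lambda: defaultdict(lambda: defaultdict(float)))


def update(model, tags, co_occ):
    """Add one document's co-occurrence weights to the slice of each of its tags."""
    for tag in tags:
        for term_a, row in co_occ.items():
            for term_b, weight in row.items():
                model[tag][term_a][term_b] += weight


model = new_user_model()
# Two hypothetical visited pages with their tags and co-occurrence maps.
update(model, {"religion"}, {"canon": {"bible": 2}, "bible": {"canon": 2}})
update(model, {"religion", "books"}, {"canon": {"bible": 1}, "bible": {"canon": 1}})

print(model["religion"]["canon"]["bible"])
print(model["books"]["canon"]["bible"])
```

Keeping one term-by-term slice per tag lets the same pair of terms carry different weights in different contexts, which is what allows a later expansion to be chosen per tag.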

Algorithm for Query Expansion
- Extract the terms from the submitted query Q and stem each term to form the new query Q'.
- For each term in Q', retrieve the corresponding tags and co-occurrence matrix.
- Compute a relevance score for each tag, sort the tags by score, and select the top-ranked tags.
- From the co-occurrence matrices corresponding to these tags, extract the most relevant terms.
- Use these terms to expand the query.
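The steps above can be sketched as follows. The slides do not define the relevance score, so this sketch assumes a simple one: a tag's score is the total co-occurrence weight of the query terms inside that tag's slice of the user model. Stemming is omitted, and the function name expand_query, its parameters, and the toy user model are hypothetical.

```python
def expand_query(query, model, top_tags=1, top_terms=1):
    """Return {tag: expanded term list} for the best-scoring tags."""
    q_terms = [t.lower() for t in query.split()]  # stemming omitted in this sketch

    # Score each tag (assumed score: summed weights in the rows of the query terms).
    scores = {}
    for tag, matrix in model.items():
        score = sum(sum(matrix.get(t, {}).values()) for t in q_terms)
        if score > 0:
            scores[tag] = score
    ranked = sorted(scores, key=lambda tag: (-scores[tag], tag))[:top_tags]

    # For each selected tag, pick the terms that co-occur most with the query terms.
    expansions = {}
    for tag in ranked:
        candidates = {}
        for t in q_terms:
            for other, weight in model[tag].get(t, {}).items():
                if other not in q_terms:
                    candidates[other] = candidates.get(other, 0) + weight
        best = sorted(candidates, key=lambda term: (-candidates[term], term))[:top_terms]
        expansions[tag] = q_terms + best
    return expansions


# Toy user model: tag -> term -> term -> weight (made-up numbers).
model = {
    "religion": {
        "canon": {"bible": 3, "book": 1},
        "book": {"canon": 1, "bible": 2},
    },
    "photography": {"canon": {"camera": 2}},
}

print(expand_query("canon book", model))
```

Raising top_tags returns one expansion per selected tag, so an ambiguous query can be expanded in several directions at once.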

EXAMPLE OF MULTIPLE EXPANSIONS
Original query | Categorization tag | Expansion
amazon | nature | (rivers OR river) AND amazon
amazon | shopping | buy AND (books OR book) AND amazon
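The expansions in this example have a regular shape: each expansion term is written as an OR over its word forms, and the resulting clauses are ANDed with the original query. A minimal sketch that builds such strings is shown below; it assumes the system keeps a map from each stem to its original word forms, and the names boolean_expansion and surface_forms are hypothetical.

```python
def boolean_expansion(query_terms, expansion_stems, surface_forms):
    """Build a boolean query: one clause per expansion stem, ANDed with the query."""
    clauses = []
    for stem in expansion_stems:
        forms = surface_forms.get(stem, [stem])
        if len(forms) > 1:
            clauses.append("(" + " OR ".join(forms) + ")")
        else:
            clauses.append(forms[0])
    return " AND ".join(clauses + list(query_terms))


# Assumed stem -> original word forms map.
surface_forms = {
    "river": ["rivers", "river"],
    "book": ["books", "book"],
    "buy": ["buy"],
}

nature = boolean_expansion(["amazon"], ["river"], surface_forms)
shopping = boolean_expansion(["amazon"], ["buy", "book"], surface_forms)
print(nature)
print(shopping)
```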

CONCLUSION
We presented an information retrieval system, based on query expansion and personalization, that helps users search for information on the Web according to their information needs.

THANK YOU

QUERIES???
