Published by Merryl Byrd. Modified over 8 years ago.
1
Personalized Ontology for Web Search Personalization
S. Sendhilkumar, T.V. Geetha
Anna University, Chennai, India
1st ACM Bangalore Annual Compute Conference, 2008
2008. 05. 15.
Summarized and presented by Jaehui Park, IDS Lab., Seoul National University
2
Copyright 2008 by CEBT
Introduction: Personalization in Web Search
Personalization is the process of customizing the web environment according to the user’s interests
  – Context oriented
      The context can be derived from the terms used in a query
  – Individual oriented
      The way the user clicks, moves from page to page, saves, prints, and so on
User Conceptual Index (UCI) (2005)
  – UCI utilizes the relationship between the query and the pages visited by the user
      This provides both context-oriented and individual-oriented personalization
  – Client-side data collection by tracking the user’s interactions in the browser
  – Relevancy of a page with respect to the user’s context of search
      Page-Query relevancy
      Page-Interest relevancy
      Query-Interest relevancy
3
Introduction: Ontology
Ontologies help in selecting the context by user interests in some categories
  The system semantically annotates Web pages via the use of Yahoo! categories
SHOE: Simple HTML Ontology Extensions
  – Allows users to annotate their pages with semantic expressions
This paper focuses on the construction of a page ontology from the pages visited by the user
It also focuses on the construction of a personalized ontology based on an automatically identified user profile
  Pages viewed, page view time, actions performed on a page, etc.
4
System Architecture
The proposed knowledge-based personalized search system is made up of:
  Presentation
  Preprocessing
  Data
  Knowledge
  Analysis
5
Presentation Layer
The place where the users give their search queries
It keeps track of all the user data
  Search queries
  Pages visited
  Time spent on a page
  The scroll speed
  Save, copy, print, bookmark, and so on
This user data is collected implicitly (without any user intervention)
6
Preprocessing Layer
HTML conversion, parsing
POS tagging
Noun extraction
  Nouns represent the various concepts
TF computation
Index Word (IW) selection
  Feature words, concepts
Thus, pages are semantically(?) tagged with the concepts
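The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the regex-based tag stripping, the tiny `NOUNS` lexicon standing in for a real POS tagger, the relative-frequency definition of TF, and the `top_k` cutoff for index word selection are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stand-in for a POS tagger: a tiny noun lexicon.
# A real system would tag tokens and keep only those tagged as nouns.
NOUNS = {"ontology", "search", "user", "query", "page", "profile"}

def html_to_text(html):
    """Crudely strip HTML tags (assumption: no scripts or entities to handle)."""
    return re.sub(r"<[^>]+>", " ", html)

def term_frequencies(text, nouns=NOUNS):
    """Return the relative frequency of each noun among all nouns in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t in nouns]
    total = len(kept)
    if total == 0:
        return {}
    return {term: count / total for term, count in Counter(kept).items()}

def select_index_words(tf, top_k=3):
    """Pick the top_k nouns by TF as index words (ties broken alphabetically)."""
    ranked = sorted(tf.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]

page_html = (
    "<p>The user sends a search query.</p>"
    "<p>The search engine ranks each page for the user query.</p>"
)
tf = term_frequencies(html_to_text(page_html))
index_words = select_index_words(tf, top_k=2)
```

Here each page ends up tagged with a short list of index words, which is the form the data layer indexes on.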
7
Data Layer
The pages are stored and indexed by the index words (IW), which are extracted during preprocessing
Every user search is modeled as a transaction, and a session is identified by a set of transactions
Two matrices:
  Transaction-Feature words
  Session-Transaction
By comparing the two, SF can be obtained
A user session can be represented as a content feature vector, reflecting the user’s interest through the session
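One way to picture how the two matrices combine is shown below. It assumes that SF denotes a session-feature matrix and that "comparing" the two matrices amounts to a matrix product; neither is stated on the slide, and the matrix values are made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

feature_words = ["ontology", "search", "profile"]

# Transaction-feature matrix: one row per transaction (search),
# one column per feature word; entries are illustrative term counts.
tf_matrix = [
    [1, 2, 0],
    [0, 1, 0],
    [3, 0, 1],
]

# Session-transaction matrix: one row per session;
# 1 means the transaction belongs to that session.
st_matrix = [
    [1, 1, 0],
    [0, 0, 1],
]

# Assumed session-feature matrix: each row is a session's content feature vector.
sf_matrix = matmul(st_matrix, tf_matrix)
```

Under these assumptions, each row of `sf_matrix` is the content feature vector of one session, summing the feature counts of the transactions it contains.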
8
Data Layer
The computation of the User Conceptual Index (UCI)
Conceptual relation between the search query and the relevant pages
  – SKW: search query
  – IW: index word
  – W: sum of the weights of a search query (SQ) and its relevant IWs
Recommends the matching (SQ, IW) pair that has the highest UCI value
  – Considers time for each factor
      SQ factors: frequency of terms used in the query, the query usage
      IW factors: page hits, page views
  – UCI-based search systems take these various factors into account
9
Data Layer
T: relevant
N: non-relevant
TP: relevant pages in the top 10 ranking
FN: relevant pages not in the top 10 ranking
TN: non-relevant pages not in the top 10 ranking
FP: non-relevant pages in the top 10 ranking
P: precision
R: recall
S: sensitivity
Sp: specificity
E: overall efficiency
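The measures listed above can be computed from the four counts as sketched below, assuming the standard textbook definitions (under which sensitivity coincides with recall). The overall efficiency E is left out because its formula is not given here, and the counts in the example are invented.

```python
def ranking_metrics(tp, fn, tn, fp):
    """Compute P, R, S and Sp from top-10 ranking counts (standard definitions)."""
    def ratio(num, den):
        # Guard against empty denominators, e.g. no relevant pages at all.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)
    return {
        "precision": ratio(tp, tp + fp),    # relevant share of the top 10
        "recall": recall,                   # share of relevant pages retrieved
        "sensitivity": recall,              # same quantity as recall
        "specificity": ratio(tn, tn + fp),  # share of non-relevant pages kept out
    }

# Illustrative counts: 100 pages, 8 relevant, 6 of them ranked in the top 10.
metrics = ranking_metrics(tp=6, fn=2, tn=88, fp=4)
```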
10
Data Layer
User profiles are saved in a DB
The user’s interests are identified automatically from the collection of various search queries and the relevant pages visited by the user
  – Generated automatically
  – Evolve over time
  – Represented by an ontology: the personalized ontology
Toward personalization, result sets were saved and analyzed (page similarity with the query)
11
Knowledge Layer
Domain knowledge and personalized knowledge are generated and represented by ontologies
Page ontology: constructed from the set of pages visited by the user
  – Generated to exploit the semantic relations between the various pages visited by the user during a search session
  – Every concept in this ontology is assigned a link weight
      The similarity between pages
      How one page leads the user to another relevant page
      Scrolling time
Personalized ontology: constructed from user profile data
12
Knowledge Layer - Page Ontology Construction
ODP taxonomy
  Used for extracting the relations between the various concepts
  – Hyponyms -> is-a
  – Meronyms -> part-of
Protégé generates OWL
Modified UCI (with refined link weight)
  Link weight: cosine similarity between the concepts in pages
  Takes care of both current and previous interests of users
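The link weight described above can be illustrated with a plain cosine similarity between two pages' concept vectors. This is a sketch only: representing each page as a dictionary of concept counts is an assumption, and the refinements of the modified UCI are not modeled.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse concept vectors (dicts of weights)."""
    shared = set(a) & set(b)
    dot = sum(a[term] * b[term] for term in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty page has no measurable similarity
    return dot / (norm_a * norm_b)

# Illustrative concept counts for two visited pages.
page_a = {"ontology": 2, "search": 1}
page_b = {"ontology": 1, "profile": 1}

link_weight = cosine_similarity(page_a, page_b)
```

Pages that share no concepts get a weight of 0, and pages with proportional concept vectors get a weight of 1.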
13
Knowledge Layer - Personalized Ontology Construction
Matching concept-interest pairs and their relations are extracted with the help of ODP
User actions and their respective weights
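As a rough illustration of weighting user actions, the sketch below accumulates a per-concept interest score from logged actions. The action names follow the ones tracked by the presentation layer, but the numeric weights, the `ACTION_WEIGHTS` table, and the additive scoring rule are hypothetical and are not taken from the paper.

```python
# Hypothetical weights: stronger actions count more toward a concept's interest.
ACTION_WEIGHTS = {
    "view": 1.0,
    "copy": 2.0,
    "print": 2.0,
    "save": 3.0,
    "bookmark": 3.0,
}

def interest_scores(events, weights=ACTION_WEIGHTS):
    """Sum action weights per concept; unknown actions contribute nothing."""
    scores = {}
    for concept, action in events:
        scores[concept] = scores.get(concept, 0.0) + weights.get(action, 0.0)
    return scores

# Illustrative log of (concept, action) pairs from one user's sessions.
events = [
    ("ontology", "view"),
    ("ontology", "bookmark"),
    ("search", "view"),
    ("search", "print"),
    ("ontology", "save"),
]
scores = interest_scores(events)
```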
14
Conclusion
Personalized search by recording the user profile from the user’s browsing pattern
  Retrieves more documents that are semantically related to the given search query
An ontology is developed for understanding the semantics of the search query
  Automatic construction saves time