1
Personalized Ontologies for Web Search and Caching
Susan Gauch
Information and Telecommunications Technology Center
Electrical Engineering and Computer Science
The University of Kansas
2
Department of Electrical Engineering and Computer Science, ITTC
Professor Susan Gauch, December 1999

Outline
  Motivation
  User profiles
    creation and maintenance
    evaluation
  Applications
    re-ranking (and filtering) search results
    Web caching
  Conclusions
3
Motivation
  Decrease access time for Web pages
  Server approaches
    use access logs to decrease access times for popular pages
    not tailored to individuals
    do not decrease network traffic
  Network approaches
    cache popular pages in multiple places in the network
    not tailored to individuals
4
Personalization
  Different users have different information needs
  Can we learn a user’s interests?
    explicitly?
    implicitly?
  Can we use this information?
    improved search
    improved browsing
    faster Web page access
5
Intelligent Web Caching
  Improved (and faster) search results
    pre-caching all search results is expensive
    Internet search engines return 50% irrelevant pages
  Improved knowledge of user’s likely behavior
    intelligent pre-caching
    use past behaviors to predict future behaviors
    pre-cache “best” pages close to individuals
6
Context
  ProFusion: www.profusion.com
  OBIWAN: distributed content-based IR
    Web clustered into regions
    clustering criteria: content, location, company
    search: query brokered to “best” regions; within a region, brokered to the most promising sites
    browsing a region means browsing its sites simultaneously
    www.ittc.ukases.edu/obiwan
7
User Profiles
  Applications
    Usenet news filtering
    recommendation services: web browsing, books
    intelligent pre-caching
  Should
    accurately reflect actual interests
    require as little feedback as possible
    be dynamic
8
User profiles: Creation
  Obvious and often used: keywords
    not structured (ambiguous)
    static
    have to be explicitly mentioned
  Our approach
    watch over a user's shoulder while surfing
    automatically determine documents’ content
    central: large ontology (concept hierarchy)
9
Document Classification
  Documents as weighted keyword vectors
    n different words -> n dimensions
    weights based on word frequency and rarity
  Browsing hierarchy: 10 web pages per node
    concatenate them -> keyword vector
  Content of a page: most similar vector
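The classification step on this slide can be sketched as follows. This is a minimal illustration, not the system's actual classifier: it assumes a TF-IDF weighting (term frequency times a log inverse document frequency) for "word frequency and rarity" and cosine similarity for "most similar vector". The function names, the tokenized inputs, and the sample categories are all hypothetical.

```python
import math
from collections import Counter


def build_idf(category_tokens):
    """Rarity weight per word, computed over the category (node) documents."""
    n = len(category_tokens)
    df = Counter(w for tokens in category_tokens.values() for w in set(tokens))
    # Assumed smoothing: +1 so that words present in every category keep a weight.
    return {w: math.log(n / count) + 1.0 for w, count in df.items()}


def vectorize(tokens, idf):
    """Sparse keyword vector: word -> frequency * rarity."""
    tf = Counter(tokens)
    return {w: tf[w] * idf[w] for w in tf if w in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_categories(page_tokens, category_tokens):
    """Return (category, similarity) pairs, most similar first.

    category_tokens maps a node name to the concatenated tokens of its
    sample pages (the slide uses 10 pages per node).
    """
    idf = build_idf(category_tokens)
    page_vec = vectorize(page_tokens, idf)
    scores = {
        name: cosine(page_vec, vectorize(tokens, idf))
        for name, tokens in category_tokens.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical two-node hierarchy and one surfed page.
categories = {
    "sports": "football goal team match team".split(),
    "cooking": "recipe oven flour bake recipe".split(),
}
page = "team wins match with late goal".split()
ranking = rank_categories(page, categories)
```

The head of `ranking` is the node whose vector is most similar to the page; keeping the top few entries gives the "top nodes" that the profile update uses.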
10
Updating profiles
  Static: document related
    content: weights of top nodes for surfed document
    length of page
  Dynamic: time spent
  Combine them, for instance: weight * (time/length)
  Changes in interest in the five categories
  User profile: weighted ontology
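The example combination on this slide, weight * (time/length), can be sketched as a profile update. The slide gives only the formula; accumulating the result by addition, measuring time in seconds and length in characters, and the name `update_profile` are assumptions made for illustration.

```python
def update_profile(profile, top_categories, time_spent, page_length):
    """Add weight * (time / length) to each category of a surfed page.

    profile        : dict mapping category name -> accumulated interest
    top_categories : list of (category, weight) pairs from classification
    time_spent     : time on the page (assumed to be in seconds)
    page_length    : length of the page (assumed to be in characters)
    """
    if page_length <= 0:
        return profile  # nothing sensible to add for an empty page
    factor = time_spent / page_length
    for category, weight in top_categories:
        # Assumption: interest accumulates additively across pages.
        profile[category] = profile.get(category, 0.0) + weight * factor
    return profile


# Hypothetical page: 30 seconds spent on a 600-character page.
profile = {}
update_profile(profile, [("Sports", 0.8), ("News", 0.2)], 30.0, 600)
```

Here the factor is 30/600 = 0.05, so "Sports" gains about 0.04 and "News" about 0.01.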
11
Profile evaluation
  Accordance with actual user interests
    10/20 interest categories describe actual interests
    describe interests “pretty well”: 3.5/5
  Convergence
    stabilization of # of categories over time?
    profiles do converge after 320 surfed pages!
12
Profiles: Summary
  Stored as weighted ontologies
  Profiles represent actual interests quite well
  Up to 150 top categories
  Two adjustment functions make profiles converge after 320 pages
    length of page doesn't really matter, but time spent does
13
Personalizing Search Results
  50% of top 20 results irrelevant
  Same search mechanism for 200 million people?
  Goal: identify relevant documents and put them on top of the result list (pre-fetch relevant results)
  Difficult problem: 10% increase is very good
14
Re-Ranking
  Ranking is a function of
    search engine's original ranking
    extent to which the top 5 categories describe the document's content
    personal interest in each of these top categories
  “More relevant items on top of result list”
    system’s ability to present all relevant items (recall)
    system’s ability to present only relevant items (precision)
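The slide names the three inputs to the ranking function but not how they are combined. The sketch below assumes one possible combination, a linear blend of the engine's score with a profile-match term; the blend weight `alpha`, the normalization of scores to the range 0 to 1, and all names are hypothetical and not taken from the talk.

```python
def personalized_score(original_score, category_matches, profile, alpha=0.5):
    """Blend the engine's score with the user's interest in the page's categories.

    original_score   : engine's score for the result (assumed scaled to 0..1)
    category_matches : list of (category, match) pairs for the top categories,
                       where match is how well the category describes the page
    profile          : dict mapping category -> user's interest weight
    alpha            : assumed blend weight between the two signals
    """
    interest = sum(match * profile.get(cat, 0.0) for cat, match in category_matches)
    return alpha * original_score + (1.0 - alpha) * interest


def rerank(results, profile, alpha=0.5):
    """Sort results by personalized score, best first."""
    return sorted(
        results,
        key=lambda r: personalized_score(r["score"], r["categories"], profile, alpha),
        reverse=True,
    )


# Hypothetical profile and result list.
profile = {"Sports": 1.0}
results = [
    {"url": "a", "score": 0.9, "categories": [("Cooking", 0.8)]},
    {"url": "b", "score": 0.7, "categories": [("Sports", 0.9)]},
]
reranked = rerank(results, profile)
```

With these inputs, result "b" moves above "a" because its category matches the user's profile, even though the engine originally scored it lower.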
15
Recall and Precision
  Combination: Recall/Precision graphs
  Example
    ranked documents: 1, …, 20
    relevant: 2, 5, 10, 14, 19
    recall points: 1/5, 2/5, 3/5, 4/5, 5/5
    precisions: 1/2, 2/5, 3/10, 4/14, 5/19
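The example on this slide can be reproduced with a few lines of code. This sketch assumes the standard definitions: at the rank of the i-th relevant document, recall is i divided by the total number of relevant documents and precision is i divided by that rank. The function name is hypothetical.

```python
from fractions import Fraction


def recall_precision_points(relevant_ranks):
    """Return (recall, precision) at the rank of each relevant document.

    relevant_ranks : 1-based positions of the relevant documents in the
                     ranked result list, e.g. [2, 5, 10, 14, 19]
    """
    ranks = sorted(relevant_ranks)
    total = len(ranks)
    return [
        (Fraction(i, total), Fraction(i, rank))
        for i, rank in enumerate(ranks, start=1)
    ]


# The slide's example: relevant documents at ranks 2, 5, 10, 14 and 19.
points = recall_precision_points([2, 5, 10, 14, 19])
```

`Fraction` keeps the values exact; note that it reduces 4/14 to 2/7, which is the same precision value the slide lists.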
16
Re-Ranking: Evaluation
  Overall performance increase of up to 8%
  At each recall cutoff, up to 10% more relevant documents have been retrieved
17
Browsing Assistance
  Analyze current page
    locate links
  Identify which links are most likely to be followed by the user
    popularity of the link
    overall relevance of linked page to user’s interests
  Problem
    if you have to download the whole page to analyze it, you’ve increased the network utilization
18
Privacy
  Is the user aware that their behavior is being monitored?
  Can users turn it off?
  Where are profiles stored?
  With whom are profiles shared?
  How are profiles protected?
  How are profiles used?
19
Conclusions
  Automatic creation of structured user profiles is possible
  Profiles are reasonably accurate
  Applications in improving the search quality and Web page access efficiency
  Evaluation of re-ranking search results: performance increase of up to 8%
20
Future Work
  Incorporating profile generator into browser
  Connect system to ProFusion, OBIWAN
  Personalize structure of ontology
  Re-train classifier
  More applications: recommendation service, web caching, browsing, ...
  Explicit user feedback?