A novel Web usage mining approach for search engines


Authors: Dell Zhang, Yisheng Dong
Source: Computer Networks, Vol. 39, Issue 3, June 21 2002, pp. 303-310
Speaker: Pei-Yu Lin
Date: 8-May-03

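Users look at only an extremely small part of the search results
Search engines locate information based on the textual similarity of a query.
The search engine returns thousands of Web resource pointers to the user after a general query.
The user will look at only an extremely small part of the search results.

Example search results (each contains the string 張真誠):
Welcome to 張真誠's web page http://www.cs.ccu.edu.tw/~ccc/
Information security lecture slides http://filter.cs.ccu.edu.tw/courses/image_processing/slides/report.php
List of research projects over the years http://www.automation.ccu.edu.tw/9.htm
Patents http://ics.stic.gov.tw/Patent/index.php?action=show&year=2001
"National Museum of Natural Science National Digital Archives Program" ... http://www.ndap.org.tw/active/report/910625.shtml
Flag school book series catalog http://www.flag.com.tw/school/book.htm
張真誠, outstanding talent in systems and software engineering http://www.tcssh.tc.edu.tw/news/administration/20010815.htm
…
…I have never been able to forget Finland's natural atmosphere, and those sincere young ... (original: 那一張張真誠的年輕)
…after more than a year of settling, Tanya once again delivers a sincere and moving report card. (original: 一張真誠動人的)
…Xiaoman seemed to see Zhimo's face, so sincere that it could almost move everyone in the world ... (original: 那張真誠得)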

Exploit the relationships among users, queries and resources.
The output of a query is a list consisting of the resources with the highest quality weights (authority & freshness).

MASEL (matrix analysis on search engine log)
The set of all users who have issued the query q* is constructed.
The queries issued by these users are collected, and all resources relevant to these queries are found through traditional keyword-based IR.
The numerical quality estimates of the found resources are computed.
The Web resources with the highest quality weights are returned in order for the search topic.
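The set-construction steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `LogEntry`, `candidate_sets` and `keyword_index` are hypothetical, the dictionary stands in for a real keyword-based IR system, and the user "Ann" with the query "Boat" is invented to show that unrelated users are filtered out.

```python
from collections import namedtuple

# Hypothetical record type: one row of the search engine log.
LogEntry = namedtuple("LogEntry", ["user", "query"])

def candidate_sets(log, q_star, keyword_index):
    """Build the user, query and resource sets for the query q_star."""
    # Users who have issued q_star.
    users = {e.user for e in log if e.query == q_star}
    # Every query issued by those users.
    queries = {e.query for e in log if e.user in users}
    # Resources relevant to those queries (stand-in for keyword-based IR).
    resources = set()
    for q in queries:
        resources.update(keyword_index.get(q, []))
    return users, queries, resources

log = [
    LogEntry("Tom", "Car"), LogEntry("Tom", "Auto"), LogEntry("Tom", "Bus"),
    LogEntry("Jack", "Car"), LogEntry("Jack", "Bus"),
    LogEntry("Rose", "Car"), LogEntry("Rose", "Auto"),
    LogEntry("Ann", "Boat"),  # invented user who never issued 'Car'
]
# Invented keyword index: query term -> matching resources.
keyword_index = {
    "Car": ["img1", "img2"],
    "Auto": ["img3"],
    "Bus": ["img4"],
    "Boat": ["img99"],
}

users, queries, resources = candidate_sets(log, "Car", keyword_index)
print(sorted(users), sorted(queries), sorted(resources))
```

Starting from the users rather than from the query text is what lets related queries such as 'Auto' and 'Bus' enter the candidate set.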

Ex: 210.74.165.87 970813 ‘Car’ http://www.hello.com.tw/~w372/img1.jpg http://www.hello.com.tw/~w372/img2.jpg
Time-window width = one week

User  Timestamp  Query   Results
Tom   970813     ‘Car’   img1, img2, img7, img4, img5, …
Tom   970817     ‘Auto’  img9, img3, img10, …
Tom   970818     ‘Bus’   img3, img6, img17, img13, …
Jack  970814     ‘Car’   img7, img1, img2, img4, img9, img6, …
Jack  970814     ‘Bus’   img1, img5, img4, img9, img2, …
Rose  970813     ‘Car’   img3, img1, img10, img9, img1, img6, …
Rose  970814     ‘Car’   img10, img1, img12, img14, img9, img6, …
Rose  970815     ‘Auto’  img14, img5, img3, img4, img9, img6, …
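A log line of the form shown in the example could be parsed and checked against the time window roughly as follows. This is a sketch under stated assumptions: the field layout (user address, timestamp, quoted query, accessed URLs) is read off the example line, the timestamp is assumed to be in YYMMDD form, the query is assumed to use plain single quotes, and the helper names `parse_line` and `in_window` are hypothetical.

```python
import shlex
from datetime import datetime, timedelta

def parse_line(line):
    """Split one log line into user, date, query and accessed URLs."""
    # shlex.split keeps the quoted query together and drops the quotes.
    fields = shlex.split(line)
    user, stamp, query = fields[0], fields[1], fields[2]
    # Assumption: timestamps such as 970813 are YYMMDD.
    day = datetime.strptime(stamp, "%y%m%d").date()
    return {"user": user, "date": day, "query": query, "accessed": fields[3:]}

def in_window(entry, end_day, width=timedelta(weeks=1)):
    """True if the entry falls in the window of the given width ending on end_day."""
    return end_day - width < entry["date"] <= end_day

line = ("210.74.165.87 970813 'Car' "
        "http://www.hello.com.tw/~w372/img1.jpg "
        "http://www.hello.com.tw/~w372/img2.jpg")
entry = parse_line(line)
print(entry["user"], entry["date"], entry["query"], len(entry["accessed"]))
```

Only entries inside the window would be passed on to the matrix construction, which is one way the freshness of a resource can influence its weight.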

Accessed images recorded in the DB (columns: User, Timestamp, Query, Accessed images)
Tom 970813 Car img1, img1, img2 970817 Auto 970818 Bus
Jack 970814 img1, img2 img4
Rose img1 970815 img3

A = num(ui, qj), relating users u to queries q
B = sim(qj, rk), relating queries q to resources r
C = hitq(rk, ui), relating resources r to users u
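One way the three matrices could be filled from log records is sketched below. The readings of the entries are assumptions, since the slide gives only the names: num(ui, qj) is taken as the number of times user ui issued query qj, hit(rk, ui) as the number of times user ui accessed resource rk, and sim(qj, rk) as a keyword similarity. The function `build_matrices`, the `labels` dictionary, the 0/1 similarity and the sample records are all hypothetical and do not reproduce the slide's table.

```python
def build_matrices(entries, users, queries, resources, sim):
    """Return A (users x queries), B (queries x resources), C (resources x users)."""
    A = [[0] * len(queries) for _ in users]
    C = [[0] * len(users) for _ in resources]
    for user, query, accessed in entries:
        i = users.index(user)
        A[i][queries.index(query)] += 1        # one more issue of this query
        for r in accessed:
            C[resources.index(r)][i] += 1      # one more hit on this resource
    B = [[sim(q, r) for r in resources] for q in queries]
    return A, B, C

users = ["Tom", "Jack", "Rose"]
queries = ["Car", "Auto", "Bus"]
resources = ["img1", "img2", "img3", "img4"]

# Invented keyword labels; sim is 1.0 on an exact label match, else 0.0.
labels = {"img1": "Car", "img2": "Car", "img3": "Auto", "img4": "Bus"}
def sim(query, resource):
    return 1.0 if labels[resource] == query else 0.0

# Invented log records: (user, query, accessed images).
entries = [
    ("Tom", "Car", ["img1", "img1", "img2"]),
    ("Tom", "Auto", []),
    ("Jack", "Car", ["img1", "img2"]),
    ("Jack", "Bus", ["img4"]),
    ("Rose", "Car", ["img1"]),
    ("Rose", "Car", []),
    ("Rose", "Auto", ["img3"]),
]

A, B, C = build_matrices(entries, users, queries, resources, sim)
print(A, B, C, sep="\n")
```

With these shapes the products ABC, BCA and CAB are all square, which is what makes the three update rules on the following slides well defined.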

(1) u ← ABC
Users: Tom, Jack, Rose
u = (0.41, 0.4, 0.69)^T
Return the well-ordered list of users: Rose, Tom, Jack

(2) q ← BCA
Queries: Car, Auto, Bus
q = (0.87, 0.14, 0.09)^T
Return the well-ordered list of queries: Car, Auto, Bus

(3) r ← CAB
Images: img1, img2, img3, img4
r = (0.96, 0.28, 0.03, 0.01)^T
Return the well-ordered list of images: img1, img2, img3, img4
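The slides do not say how the weight vectors in (1) to (3) are obtained from the matrix products. A common reading, assumed here, is that each vector is the principal eigenvector of the corresponding product, found by power iteration (for example u ← ABC·u followed by normalization). The sketch below uses small invented matrices, so it does not reproduce the numbers on the slides; the helper names and the unit-length normalization are also assumptions.

```python
import math

def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def principal_vector(M, iterations=100):
    """Power iteration: repeatedly apply M and rescale to unit length."""
    v = [1.0] * len(M)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in M]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rank(names, weights):
    """Names sorted by descending weight."""
    pairs = sorted(zip(names, weights), key=lambda p: p[1], reverse=True)
    return [name for name, _ in pairs]

users = ["Tom", "Jack", "Rose"]
queries = ["Car", "Auto", "Bus"]
resources = ["img1", "img2", "img3", "img4"]

# Invented matrices: A is users x queries, B is queries x resources,
# C is resources x users.
A = [[1, 1, 0], [1, 0, 1], [2, 1, 0]]
B = [[1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
C = [[2, 1, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]

u = principal_vector(matmul(matmul(A, B), C))  # (1) user weights
q = principal_vector(matmul(matmul(B, C), A))  # (2) query weights
r = principal_vector(matmul(matmul(C, A), B))  # (3) resource weights

print(rank(users, u))
print(rank(queries, q))
print(rank(resources, r))
```

The three products are cyclic rotations of one another, so a single pass over the log yields rankings of users, queries and resources at the same time.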

Application of MASEL (eeFind)
Side effect of MASEL: for the query ‘Car’, it also returns some images labeled ‘BMW’, ‘Porsche’ or ‘Rolls Royce’…, because these are often queried by users with similar interests who recently queried ‘Car’.

Conclusions
The algorithm, MASEL, can exploit the relationships among users, queries and resources.
The proposed approach shows its ability to achieve better ranking and query expansion effects.