Fuzzy Set Approach for Improving Web Log Mining Sajitha Naduvil-Vadukootu Csc 8810 : Computational Intelligence Instructor: Dr. Yanqing Zhang Dec 4, 2006.

Agenda
- Introduction to Web Log Mining
- Episode Identification: Existing Techniques
- Improvement: Fuzzy Set Approach
- Simulations & Results
- Challenges & Future Work
- Questions

Web Log Mining: Introduction
[Diagram. Labels: Site Structure, Access Log, Web Crawler; Extracting & Filtering, User Identification, Session Identification, Path Completion, Episode Identification; Association Mining; Association Rules]

Episode Identification
Maximal Forward Reference
- Request sequence: {A,B,C,D,C,B,A,E,F}
- Episodes: {A,B,C,D} {A,E,F}
- Rules generated: {B->A, C->A, D->A, …}
Maximal Reference Length
- Request sequence with time intervals: {(A,1),(B,1),(C,20),(D,80),(C,1),(B,1),(A,1),(E,30),(F,60)}
- Episodes: {A,B,C} {D} {A,E} {F}
- Rules: {B->A, C->A, …}
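The two episode identification methods above can be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: the function names and the cut-off value of 10 are assumptions, and a request is treated as a content request when its time interval exceeds the cut-off. Note that this plain reference-length version keeps the backtracking requests C and B in the third episode, whereas the slide lists {A,E}; the slide does not say how those requests are dropped.

```python
def maximal_forward_references(pages):
    """Split a page sequence into maximal forward references.

    A forward path is emitted as an episode whenever the user first
    moves backward (revisits a page already on the current path).
    """
    episodes = []
    path = []
    moving_forward = False
    for page in pages:
        if page in path:
            # Backward reference: emit the path only if we had been moving forward.
            if moving_forward:
                episodes.append(list(path))
            # Truncate the path back to the revisited page.
            path = path[: path.index(page) + 1]
            moving_forward = False
        else:
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        episodes.append(list(path))
    return episodes


def reference_length_episodes(requests, cutoff):
    """Split (page, time_interval) pairs into episodes.

    Assumption: an interval above `cutoff` marks a content request,
    which closes the current episode. Trailing navigational requests
    are emitted as a final episode.
    """
    episodes = []
    current = []
    for page, interval in requests:
        current.append(page)
        if interval > cutoff:
            episodes.append(current)
            current = []
    if current:
        episodes.append(current)
    return episodes


sequence = ["A", "B", "C", "D", "C", "B", "A", "E", "F"]
timed = [("A", 1), ("B", 1), ("C", 20), ("D", 80), ("C", 1),
         ("B", 1), ("A", 1), ("E", 30), ("F", 60)]

print(maximal_forward_references(sequence))
# [['A', 'B', 'C', 'D'], ['A', 'E', 'F']]
print(reference_length_episodes(timed, cutoff=10))  # cutoff=10 is a hypothetical value
# [['A', 'B', 'C'], ['D'], ['C', 'B', 'A', 'E'], ['F']]
```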

Page Request Classification
- Navigational requests and content requests
- Request time interval as a classification aid
- Maximal Reference Length method for episode identification
- What should the cut-off time interval be?

Fuzzy Set Approach
- Consider the request time interval as a linguistic variable.
- We define two linguistic values for the request time interval: High and Low.
  - High => request is Content
  - Low => request is Navigational
- The "High" membership function is triangular. Slope=3.33e
[Plot labels: Navigational, Content]
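A triangular membership function of the kind mentioned for "High" might look like the sketch below. The breakpoints (0, 30, 60) and the time unit are purely hypothetical; the slide's own slope value is truncated, so its parameters are not reproduced here. The names `triangular` and `content_membership` are illustrative only.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b.

    Requires a < b < c. Returns a degree of membership in [0, 1].
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge, slope 1 / (b - a)
    return (c - x) / (c - b)      # falling edge


def content_membership(interval):
    """Degree to which a request counts as a content request.

    The breakpoints 0, 30, 60 are hypothetical, chosen only for illustration.
    """
    return triangular(interval, 0.0, 30.0, 60.0)


for interval in (1, 15, 30, 45):
    print(interval, content_membership(interval))
```

Unlike a hard cut-off, a short request still receives a small nonzero weight instead of being discarded outright.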

Fuzzy Set Approach
- Consider the "content" function value as the support weight for that request.
- To calculate page 7447's support:
  - Select avg(support) where targetid = 7447
- support({7447,7448}) = max(support(7447)+ support(7448))
[Table columns: ID, TIME INTERVAL, TARGETID, SUPPORT; rows not recovered]
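The per-page support calculation can be sketched as an average of the per-request "content" weights, mirroring the avg(support) query. The sample weights below are made up for illustration, since the slide's table rows were not recovered. The slide's formula for the pair {7447,7448} is ambiguous as written, so the sketch leaves the combining function pluggable and defaults to `max` as one possible reading; that default is an assumption, not the presenter's confirmed method.

```python
from collections import defaultdict


def page_support(requests):
    """Average the support weights of all requests per target page.

    `requests` is an iterable of (target_id, weight) pairs, where the
    weight is the request's "content" membership value.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for target_id, weight in requests:
        totals[target_id] += weight
        counts[target_id] += 1
    return {target_id: totals[target_id] / counts[target_id] for target_id in totals}


def itemset_support(items, support, combine=max):
    """Combine per-page supports into one itemset support.

    `combine=max` is an assumed reading of the slide's formula.
    """
    return combine(support[item] for item in items)


# Hypothetical weights; not data from the presentation.
sample_requests = [(7447, 0.25), (7447, 0.75), (7448, 0.875)]
support = page_support(sample_requests)
print(support[7447])
print(itemset_support([7447, 7448], support))
```

Passing `combine=min` or `combine=sum` lets other readings of the formula be tried without changing the rest of the code.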

Simulation & Results
Configuration:
- Support Count = 5
- Confidence =
[Results table; data values not recovered. Columns: Dataset size; Number of Rules Discovered; Running Time (seconds); Relevant Rules (limit = 10 sec). Each of the last three is reported for Maximal Forward Reference, Max Reference Length (cut off = 1 sec), and Fuzzy Hybrid.]

Challenges & Future Work
- Improved metrics for measuring "Relevance" / "Interestingness"
- Determining a more suitable membership function
- Performance on very large datasets
