Chapter 12: Web Usage Mining - An introduction

Slides:



Advertisements
Similar presentations
Experience Guided Shopping & Search Guiders ® Deliver Measurable ROI Through Reports Metric Reports Deliver Unique Customer Decision Insights Guiders offer.
Advertisements

Web Mining.
Web Usage Mining Web Usage Mining (Clickstream Analysis) Mark Levene (Follow the links to learn more!)
WEB DESIGN TABLES, PAGE LAYOUT AND FORMS. Page Layout Page Layout is an important part of web design Why do you think your page layout is important?
WEB USAGE MINING FRAMEWORK FOR MINING EVOLVING USER PROFILES IN DYNAMIC WEBSITE DONE BY: AYESHA NUSRATH 07L51A0517 FIRDOUSE AFREEN 07L51A0522.
Mining Frequent Patterns II: Mining Sequential & Navigational Patterns Bamshad Mobasher DePaul University Bamshad Mobasher DePaul University.
Interception of User’s Interests on the Web Michal Barla Supervisor: prof. Mária Bieliková.
Digital Marketing Analytics v10. Introduction  Name / job role  What company are you with  How much experience do you have using Webtrends  Create.
Back to Table of Contents
Web Mining Research: A Survey Authors: Raymond Kosala & Hendrik Blockeel Presenter: Ryan Patterson April 23rd 2014 CS332 Data Mining pg 01.
Copyright © 2004 Pearson Education, Inc. Slide 7-1 E-commerce Kenneth C. Laudon Carol Guercio Traver business. technology. society. Second Edition.
Web Usage Mining: Processes and Applications
Software Testing and Quality Assurance Testing Web Applications.
The Web is perhaps the single largest data source in the world. Due to the heterogeneity and lack of structure, mining and integration are challenging.
12/11/01 Matt Bridges Advisor: Ralph Morelli. What is Web Analytics? In traditional commerce, store owners can observe their customers habits: What time.
Discovery of Aggregate Usage Profiles for Web Personalization
Web Usage Mining - W hat, W hy, ho W Presented by:Roopa Datla Jinguang Liu.
SESSION 9 THE INTERNET AND THE NEW INFORMATION NEW INFORMATIONTECHNOLOGYINFRASTRUCTURE.
© Copyright , Blue Martini Software. San Mateo California, USA 1 1 Integrating E-Commerce and Data Mining: Architecture and Challenges Llew Mason.
Electronic Commerce Systems
_______________________________________________________________________________________________________________ E-Commerce: Fundamentals and Applications1.
Alexander Hartmann.  Free service offered by Google that generates detailed statistics about the visitors to a website. A premium version is also available.
WEB ANALYTICS Prof Sunil Wattal. Business questions How are people finding your website? What pages are the customers most interested in? Is your website.
Prof. Vishnuprasad Nagadevara Indian Institute of Management Bangalore
Section 13.1 Add a hit counter to a Web page Identify the limitations of hit counters Describe the information gathered by tracking systems Create a guest.
Overview of Web Data Mining and Applications Part II
1.Understand the decision-making process of consumer purchasing online. 2.Describe how companies are building one-to-one relationships with customers.
Web Mining ( 網路探勘 ) WM12 TLMXM1A Wed 8,9 (15:10-17:00) U705 Web Usage Mining ( 網路使用挖掘 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information.
Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web logs Data Engineering Lab 성 유 진.
FALL 2012 DSCI5240 Graduate Presentation By Xxxxxxx.
CS 401 Paper Presentation Praveen Inuganti
Consumer Behavior, Market Research
ITIS 1210 Introduction to Web-Based Information Systems Chapter 48 How Internet Sites Can Invade Your Privacy.
Fall 2006 Davison/LinCSE 197/BIS 197: Search Engine Strategies 6-1 Module II Overview PLANNING: Things to Know BEFORE You Start… Why SEM? Goal Analysis.
Web mining Web mining deals with mining of patterns from web and e-commerce data. Web data –Web pages –Web structures –Web logs –E-commerce sites – .
Web Design, 4 th Edition 7 Promoting and Maintaining a Web Site.
Aurora: A Conceptual Model for Web-content Adaptation to Support the Universal Accessibility of Web-based Services Anita W. Huang, Neel Sundaresan Presented.
Research paper: Web Mining Research: A survey SIGKDD Explorations, June Volume 2, Issue 1 Author: R. Kosala and H. Blockeel.
Copyright © 2009 Pearson Education, Inc. Slide 6-1 Chapter 6 E-commerce Marketing Concepts.
© 2008 Thomson, a part of the Thomson Corporation. Thomson, the Star logo, and Atomic Dog are trademarks used herein under license. All rights reserved.
© 2006 KDnuggets [16/Nov/2005:16:32: ] "GET /jobs/ HTTP/1.1" "
10 Adding Interactivity to a Web Site Section 10.1 Define scripting Summarize interactivity design guidelines Identify scripting languages Compare common.
Generating Intelligent Links to Web Pages by Mining Access Patterns of Individuals and the Community Benjamin Lambert Omid Fatemieh CS598CXZ Spring 2005.
Google Confidential and Proprietary 1 Google University Google Analytics and Website Optimiser Dyana Najdi, Customer Analytics Manager, EMEA Lee Hunter,
©2010 John Wiley and Sons Chapter 12 Research Methods in Human-Computer Interaction Chapter 12- Automated Data Collection.
WebMining Web Mining By- Pawan Singh Piyush Arora Pooja Mansharamani Pramod Singh Praveen Kumar 1.
Sustainability: Web Site Statistics Marieke Napier UKOLN University of Bath Bath, BA2 7AY UKOLN is supported by: URL
Data Mining By Dave Maung.
Discovery of Aggregate Usage Profiles for Web Personalization Bamshad Mobasher, Honghua Dai, Tao Luo, Miki Nakagawa, Yuqing Sun, Jim Wiltshire WebKDD 2000.
Log files presented to : Sir Adnan presented by: SHAH RUKH.
Chapter 12: Web Usage Mining - An introduction Chapter written by Bamshad Mobasher Many slides are from a tutorial given by B. Berendt, B. Mobasher, M.
1 Tools for Website Effectiveness. What is your site producing? Sales PR Expanding client base Brand awareness Feedback.
Srivastava J., Cooley R., Deshpande M, Tan P.N.
Web Site Statistics A Metric for Measuring Engagement.
Evaluating & Maintaining a Site Domain 6. Conduct Technical Tests Dreamweaver provides many tools to assist in finalizing and testing your website for.
Web Mining Issues Size Size –>350 million pages –Grows at about 1 million pages a day Diverse types of data Diverse types of data.
Technology for E-commerce Helena Ahonen-Myka. In this part... n search tools n metadata n personalization n collaborative filtering n data mining.
Search Engine using Web Mining COMS E Web Enhanced Information Mgmt Prof. Gail Kaiser Presented By: Rupal Shah (UNI: rrs2146)
8 Chapter Eight Server-side Scripts. 8 Chapter Objectives Create dynamic Web pages that retrieve and display database data using Active Server Pages Process.
Glossary of Terms Sessions - (old name: Visits) Users - (old name: Unique Visitors) Pageviews Pages/Session Avg. Session Duration Bounce Rate %New Sessions.
Information Design Trends Unit Five: Delivery Channels Lecture 2: Portals and Personalization Part 2.
Ecommerce Basics Standard 2 Objective 1. Ecommerce Business conducted on the internet.
Introduction Web analysis includes the study of users’ behavior on the web Traffic analysis – Usage analysis Behavior at particular website or across.
Web Analytics and Reporting Michal Neuwirth Product Manager – Kentico Software.
Science data sharing user behavior mining: an approach combining Web Usage Mining and GIS Mo Wang, Juanle Wang, Yongqing Bai Institute of Geographic Sciences.
Chapter 8: Web Analytics, Web Mining, and Social Analytics
Data mining in web applications
What is Google Analytics?
Chapter 12: Automated data collection methods
Presentation transcript:

Chapter 12: Web Usage Mining - An introduction Chapter written by Bamshad Mobasher Many slides are from a tutorial given by B. Berendt, B. Mobasher, M. Spiliopoulou

Introduction Web usage mining: automatic discovery of patterns in clickstreams and associated data collected or generated as a result of user interactions with one or more Web sites. Goal: analyze the behavioral patterns and profiles of users interacting with a Web site. The discovered patterns are usually represented as collections of pages, objects, or resources that are frequently accessed by groups of users with common interests.

Introduction Data in Web Usage Mining: Web server logs Site contents Data about the visitors, gathered from external channels Further application data Not all these data are always available. When they are, they must be integrated. A large part of Web usage mining is about processing usage/ clickstream data. After that various data mining algorithm can be applied. Bing Liu

Web server logs Bing Liu

Web usage mining process Bing Liu

Data preparation Bing Liu

Pre-processing of web usage data Bing Liu

Data cleaning Data cleaning remove irrelevant references and fields in server logs remove references due to spider navigation remove erroneous references add missing references due to caching (done after sessionization) Bing Liu

Identify sessions (sessionization) In Web usage analysis, these data are the sessions of the site visitors: the activities performed by a user from the moment she enters the site until the moment she leaves it. Difficult to obtain reliable usage data due to proxy servers and anonymizers, dynamic IP addresses, missing references due to caching, and the inability of servers to distinguish among different visits. Bing Liu

Sessionization strategies Bing Liu

Sessionization heuristics Bing Liu

Sessionization example Bing Liu

User identification Bing Liu

User identification: an example Bing Liu

Pageview A pageview is an aggregate representation of a collection of Web objects contributing to the display on a user’s browser resulting from a single user action (such as a click-through). Conceptually, each pageview can be viewed as a collection of Web objects or resources representing a specific “user event,” e.g., reading an article, viewing a product page, or adding a product to the shopping cart. Bing Liu

Path completion Client- or proxy-side caching can often result in missing access references to those pages or objects that have been cached. For instance, if a user returns to a page A during the same session, the second access to A will likely result in viewing the previously downloaded version of A that was cached on the client-side, and therefore, no request is made to the server. This results in the second reference to A not being recorded on the server logs. Bing Liu

Missing references due to caching Bing Liu

Path completion The problem of inferring missing user references due to caching. Effective path completion requires extensive knowledge of the link structure within the site Referrer information in server logs can also be used in disambiguating the inferred paths. Problem gets much more complicated in frame-based sites. Bing Liu

Integrating with e-commerce events Either product oriented or visit oriented Used to track and analyze conversion of browsers to buyers. Major difficulty for E-commerce events is defining and implementing the events for a site, however, in contrast to clickstream data, getting reliable preprocessed data is not a problem. Another major challenge is the successful integration with clickstream data Bing Liu

Product-Oriented Events Product View Occurs every time a product is displayed on a page view Typical Types: Image, Link, Text Product Click-through Occurs every time a user “clicks” on a product to get more information Bing Liu

Product-Oriented Events Shopping Cart Changes Shopping Cart Add or Remove Shopping Cart Change - quantity or other feature (e.g. size) is changed Product Buy or Bid Separate buy event occurs for each product in the shopping cart Auction sites can track bid events in addition to the product purchases Bing Liu

Web usage mining process Bing Liu

Integration with page content Bing Liu

Integration with link structure Bing Liu

E-commerce data analysis Bing Liu

Session analysis Simplest form of analysis: examine individual or groups of server sessions and e-commerce data. Advantages: Gain insight into typical customer behaviors. Trace specific problems with the site. Drawbacks: LOTS of data. Difficult to generalize. Bing Liu

Session analysis: aggregate reports Bing Liu

OLAP Bing Liu

Data mining Bing Liu

Data mining (cont.) Bing Liu

Some usage mining applications Bing Liu

Personalization application Bing Liu

Standard approaches Bing Liu

Summary Web usage mining has emerged as the essential tool for realizing more personalized, user-friendly and business-optimal Web services. The key is to use the user-clickstream data for many mining purposes. Traditionally, Web usage mining is used by e-commerce sites to organize their sites and to increase profits. It is now also used by search engines to improve search quality and to evaluate search results, etc, and by many other applications. Bing Liu