Guide to the Clickstream Data

Petr Berka, University of Economics, Prague (berka@vse.cz)

Web Usage Mining Domain
- click-stream: a sequential series of page view requests, where a page view is what is displayed in the user's browser at one time
- server session: the click-stream of page views for a single user on a particular web site
- user session: the click-stream of page views for a single user across the entire web
Clickstream Data, Discovery Challenge 2005
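As a minimal sketch of the server session definition, the snippet below groups page views into one click-stream per session, ordered by request time. It assumes each page view is available as a (timestamp, session ID, page) tuple; the function name `group_into_sessions` and the short session IDs `s1` and `s2` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_into_sessions(page_views):
    """Group (timestamp, session_id, page) tuples into per-session click-streams."""
    sessions = defaultdict(list)
    for timestamp, session_id, page in page_views:
        sessions[session_id].append((timestamp, page))
    # Sort each session by timestamp so the click-stream keeps request order.
    return {
        session_id: [page for _, page in sorted(views)]
        for session_id, views in sessions.items()
    }


# Hypothetical page views; "s1" and "s2" stand in for real session IDs.
page_views = [
    (1074589200, "s1", "/dp/?id=124"),
    (1074589202, "s2", "/"),
    (1074589233, "s1", "/dp/?id=148"),
    (1074589245, "s1", "/sb/"),
    (1074589248, "s2", "/contacts/"),
]
streams = group_into_sessions(page_views)
```

A user session in the sense above would need data from more than one site, so only server sessions can be rebuilt from a single server log.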

The Clickstream Data
- about 3 million records (24 days) from the web server log of an internet shop
- each record contains the time, IP address, session ID, page request and referrer
- there are hundreds of thousands of sessions; most of them are very short, on average 16 pages
- each page request in this shop has the same structure: page type / content ID (product ID)
- page types are, for example, dp (detail of product), sb (shopping basket), ct (contact)
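The page type / content ID structure can be split mechanically. The sketch below assumes that requests look like `/dp/?id=124`, as in the example rows of the data, with the content ID passed in an `id` query parameter; the function name `split_page_request` is hypothetical, and the `PAGE_TYPES` mapping lists only the three types named on the slide.

```python
from urllib.parse import parse_qs, urlsplit

# Page types named on the slide; the real data contains more types than these.
PAGE_TYPES = {
    "dp": "detail of product",
    "sb": "shopping basket",
    "ct": "contact",
}


def split_page_request(request):
    """Return (page_type, content_id) for a request such as '/dp/?id=124'."""
    parts = urlsplit(request)
    page_type = parts.path.strip("/")      # '' for the shop root '/'
    ids = parse_qs(parts.query).get("id")  # None when there is no id parameter
    content_id = ids[0] if ids else None   # kept as a string, not converted
    return page_type, content_id


page_type, content_id = split_page_request("/dp/?id=124")
label = PAGE_TYPES.get(page_type, "unknown page type")
```

The content ID is left as a string because the slide does not state whether every ID is numeric.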

Example of the Data
unix time ; IP address ; session ID ; page request ; referrer
1074589200;193.179.144.2 ;1993441e8a0a4d7a4407ed9554b64ed1;/dp/?id=124 ;www.google.cz;
1074589201;194.213.35.234;3995b2c0599f1782e2b40582823b1c94;/dp/?id=182 ;
1074589202;194.138.39.56 ;2fd3213f2edaf82b27562d28a2a747aa;/ ;www.seznam.cz;
1074589233;193.179.144.2 ;1993441e8a0a4d7a4407ed9554b64ed1;/dp/?id=148 ;/dp/?id=124;
1074589245;193.179.144.2 ;1993441e8a0a4d7a4407ed9554b64ed1;/sb/ ;/dp/?id=148;
1074589248;194.138.39.56 ;2fd3213f2edaf82b27562d28a2a747aa;/contacts/ ; /;
1074589290;193.179.144.2 ;1993441e8a0a4d7a4407ed9554b64ed1;/sb/ ;/sb/;
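A record in this layout can be read by splitting on semicolons and trimming the padding around each field. The following is a sketch under that assumption only; the `PageView` type and `parse_line` function are hypothetical names, and an empty referrer field is mapped to `None`.

```python
from typing import NamedTuple, Optional


class PageView(NamedTuple):
    unix_time: int
    ip: str
    session_id: str
    request: str
    referrer: Optional[str]


def parse_line(line):
    """Parse one semicolon-separated log record into a PageView."""
    fields = [field.strip() for field in line.strip().split(";")]
    unix_time, ip, session_id, request = fields[:4]
    # The referrer may be empty, and rows may end with a trailing ';'.
    referrer = fields[4] if len(fields) > 4 and fields[4] else None
    return PageView(int(unix_time), ip, session_id, request, referrer)


with_referrer = parse_line(
    "1074589200;193.179.144.2 ;1993441e8a0a4d7a4407ed9554b64ed1;/dp/?id=124 ;www.google.cz;"
)
without_referrer = parse_line(
    "1074589201;194.213.35.234;3995b2c0599f1782e2b40582823b1c94;/dp/?id=182 ;"
)
```

The time is kept as the raw unix timestamp because the slides do not state a time zone for the log.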

Data Description
- table “obchod” (shop): name of the internet shop (7 entries)
- table “kategorie” (category): info about category of products (64 entries)
- table “list” (sheet): info about a specific product of a more detailed type (157 entries)
- table “znacka” (brand): name of the producer or brand of a product (197 entries)
- table “tema” (theme): info about themes discussed in the on-line advice (36 entries)

Data Summary (1/3)
3 617 171 page requests
522 410 sessions
    318 523 single page
    203 887 length > 1
session length in pages: avg. 16, median 8, mode 2, longest 15 454
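The slide does not say which sessions the length statistics refer to. A quick arithmetic check on the reported counts, sketched below, suggests that the average of 16 describes only the sessions longer than one page; this is an inference from the numbers, not something the slides state. The variable names are illustrative.

```python
# Counts as reported on the slide.
page_requests = 3_617_171
sessions = 522_410
single_page_sessions = 318_523
multi_page_sessions = 203_887

# The two session groups add up to the reported total.
counts_consistent = single_page_sessions + multi_page_sessions == sessions

# Mean length over all sessions: about 6.92 pages.
mean_all = page_requests / sessions

# Each single-page session accounts for exactly one request, so the rest
# belong to multi-page sessions: about 16.18 pages per session.
mean_multi = (page_requests - single_page_sessions) / multi_page_sessions
```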

Data Summary (2/3)
time spent during a session
avg. time 00:24:46
median 00:03:08
mode 00:00:09
longest 433:27:53
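The slides do not describe how the time spent was measured. One plausible reading, sketched here as an assumption, takes the difference between the last and the first request of a session and skips single-page sessions, for which no difference exists. The names `format_hms` and `session_durations` and the short session IDs are hypothetical; the timestamps are taken from the example rows.

```python
from statistics import mean, median


def format_hms(seconds):
    """Format a number of seconds as HH:MM:SS, letting hours exceed 24."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def session_durations(timestamps_by_session):
    """Duration per session as last request minus first request (assumption)."""
    return {
        session_id: max(timestamps) - min(timestamps)
        for session_id, timestamps in timestamps_by_session.items()
        if len(timestamps) > 1  # single-page sessions have no measurable span
    }


# Unix times from the example rows, keyed by abbreviated session labels.
timestamps_by_session = {
    "a": [1074589200, 1074589233, 1074589245, 1074589290],
    "b": [1074589201],
    "c": [1074589202, 1074589248],
}
durations = session_durations(timestamps_by_session)
average = format_hms(mean(durations.values()))
middle = format_hms(median(durations.values()))
```

Under this definition the time spent on the last page of a session is not counted, since the log holds no request that marks when the visitor left.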

Data Summary (3/3)
[Figure: distribution of sessions with length > 1]