1
Guide to the Clickstream Data
Petr Berka, University of Economics, Prague
2
Web Usage Mining Domain
click-stream - a sequential series of page view requests (a page view is what is displayed in the user’s browser at one time)
server session - the click-stream of page views for a single user on a particular web site
user session - the click-stream of page views for a single user across the entire web
Clickstream Data, Discovery Challenge 2005
3
The Clickstream Data
About 3 million records (24 days) from the web server log of an online shop
Each record contains the time, IP address, session ID, page request, and referer
There are hundreds of thousands of sessions; most of them are very short, and the average length is 16 pages
Each page request in this shop has the same structure: page type / content ID (product ID)
Page types are, for example, dp (detail of product), sb (shopping basket), ct (contact)
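The "page type / content ID" structure can be illustrated with a small parser. This is a minimal sketch, not the challenge's own tooling: the names `PageRequest` and `parse_page_request` are hypothetical, and it assumes the content ID is carried in an `id` query parameter, as in requests of the form `/dp/?id=124`.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qs, urlsplit


class PageRequest(NamedTuple):
    page_type: str             # e.g. "dp" or "sb"; "" for the home page "/"
    content_id: Optional[str]  # value of the "id" parameter, None if absent


def parse_page_request(request: str) -> PageRequest:
    """Split a request such as '/dp/?id=124' into page type and content ID."""
    parts = urlsplit(request.strip())
    # The first non-empty path segment is taken as the page type (assumption).
    segments = [s for s in parts.path.split("/") if s]
    page_type = segments[0] if segments else ""
    ids = parse_qs(parts.query).get("id")
    return PageRequest(page_type, ids[0] if ids else None)
```

For example, `parse_page_request("/sb/")` yields the page type `sb` with no content ID, which is enough to count how often each page type occurs in a session.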
4
Example of the Data
unix time ; IP address ; session ID ; page request ; referer
 ; ; e8a0a4d7a4407ed9554b64ed1 ; /dp/?id=124 ;
 ; ; 3995b2c0599f1782e2b b1c94 ; /dp/?id=182 ;
 ; ; 2fd3213f2edaf82b27562d28a2a747aa ; / ;
 ; ; e8a0a4d7a4407ed9554b64ed1 ; /dp/?id=148 ; /dp/?id=124
 ; ; e8a0a4d7a4407ed9554b64ed1 ; /sb/ ; /dp/?id=148
 ; ; 2fd3213f2edaf82b27562d28a2a747aa ; /contacts/ ; /
 ; ; e8a0a4d7a4407ed9554b64ed1 ; /sb/ ; /sb/
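Reading such records and grouping them into sessions might look like the sketch below. It assumes one record per line with five semicolon-separated fields in the order given in the header (unix time, IP address, session ID, page request, referer) and an optional trailing semicolon. `Record`, `parse_line`, and `group_sessions` are hypothetical names, and a referer that itself contains a semicolon would need extra handling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Record(NamedTuple):
    unix_time: int
    ip: str
    session_id: str
    page: str
    referer: str


def parse_line(line: str) -> Record:
    """Parse one 'time;ip;session;page;referer' line (assumed layout)."""
    fields = [f.strip() for f in line.rstrip("\n").split(";")]
    # Pad missing fields and drop the empty field left by a trailing ';'.
    fields = (fields + [""] * 5)[:5]
    time_s, ip, session_id, page, referer = fields
    return Record(int(time_s), ip, session_id, page, referer)


def group_sessions(lines: Iterable[str]) -> Dict[str, List[Record]]:
    """Group records by session ID, each session ordered by time."""
    sessions: Dict[str, List[Record]] = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines
            rec = parse_line(line)
            sessions[rec.session_id].append(rec)
    for recs in sessions.values():
        recs.sort(key=lambda r: r.unix_time)
    return dict(sessions)
```

Each value of the returned dictionary is then one server session in the sense defined earlier: the ordered page views of one session ID on this site.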
5
Data Description
table “obchod” (shop) - name of the internet shop (7 entries)
table “kategorie” (category) - information about the category of products (64 entries)
table “list” (sheet) - information about a specific product of a more detailed type (157 entries)
table “znacka” (brand) - name of the producer or brand of a product (197 entries)
table “tema” (theme) - information about themes discussed in the on-line advice (36 entries)
6
Data Summary (1/3)
page requests
sessions
single page
length > 1
avg. length 16
median 8
mode 2
longest 15454
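Statistics of this kind can be derived from the session ID column alone. The following is a minimal sketch under stated assumptions: `session_length_summary` is a hypothetical helper, session length is counted as the number of page requests per session ID, and the average, median, and mode are taken over all sessions (the slide does not say whether single-page sessions are included).

```python
from collections import Counter
from statistics import mean, median
from typing import Dict, Iterable


def session_length_summary(session_ids: Iterable[str]) -> Dict[str, float]:
    """Summarize session lengths from one session ID per page request."""
    ids = list(session_ids)
    lengths = list(Counter(ids).values())  # page requests per session
    return {
        "page requests": len(ids),
        "sessions": len(lengths),
        "single page": sum(1 for n in lengths if n == 1),
        "length > 1": sum(1 for n in lengths if n > 1),
        "avg. length": mean(lengths),
        "median": median(lengths),
        # Most frequent length; ties resolve to the first one encountered.
        "mode": Counter(lengths).most_common(1)[0][0],
        "longest": max(lengths),
    }
```

A mode well below the median, and a median well below the mean, is the typical signature of a distribution dominated by short sessions with a few very long ones.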
7
Data Summary (2/3)
time spent during a session
avg. time 00:24:46
median 00:03:08
mode 00:00:09
longest 433:27:53
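One plausible way to obtain such durations is to take the difference between the last and first timestamp of each session. That definition is an assumption, not something the slides state; it cannot account for the time spent on the last page, and a single-page session gets a duration of zero. The helper names `session_durations` and `format_hms` are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def session_durations(records: Iterable[Tuple[int, str]]) -> Dict[str, int]:
    """Map session ID to (last - first) unix time, in seconds.

    `records` yields (unix_time, session_id) pairs in any order.
    """
    first: Dict[str, int] = {}
    last: Dict[str, int] = {}
    for t, sid in records:
        first[sid] = min(t, first.get(sid, t))
        last[sid] = max(t, last.get(sid, t))
    return {sid: last[sid] - first[sid] for sid in first}


def format_hms(seconds: int) -> str:
    """Format seconds as HH:MM:SS; hours may exceed 24, as on the slide."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

With this formatting, 1486 seconds is rendered as `00:24:46`, and a session spanning 1560473 seconds is rendered as `433:27:53`.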
8
Data Summary (3/3)
distribution of sessions with length > 1