Published by Dwain Stewart. Modified over 9 years ago.
1
Clustering web transactions using rough approximation
Source: Fuzzy Sets and Systems 148 (2004) 131–138
Authors: Supriya Kumar De, P. Radha Krishna
Adviser: RC. Chen
Presenter: Yu-Hsiang Fu (傅昱翔)
Date: 2006/12/14
Chaoyang University of Technology
2
Outline
– Abstract
– Introduction
– Rough Set
– Rough Set Approximation
– Experimental Results
– Conclusions
– References
3
Abstract
– Web usage mining is the application of data mining techniques to discover user access patterns from web access logs.
– Rough sets can be used to mine web log records effectively and discover web page access patterns.
4
Introduction (1/2)
The WWW includes a huge number of hyperlinks, together with access and usage information.
Web mining
– Web content mining
– Web structure mining
– Web usage mining
5
Introduction (2/2)
User behavior
– A click stream is the sequence of clicks or pages requested as a visitor explores a web site.
Web transaction
– A user session is the click stream of page views for a single user across the entire web.
Usage patterns differ from user to user: different users navigate the same site in different ways.
6
Rough Set (1/5)
– Rough set theory was introduced by Zdzisław Pawlak in the early 1980s.
– Rough sets deal with the classification analysis of data tables.
– Rough set methods support efficient searching for relevant tolerance relations and extract interesting patterns from data.
7
Rough Set (2/5)
Universe and Relation
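The slide's formulas did not survive extraction. As a minimal sketch of the standard idea, an indiscernibility relation groups objects of the universe that agree on every chosen attribute. The data table, attribute names, and the function name `equivalence_classes` below are hypothetical and are not taken from the presentation.

```python
from collections import defaultdict

def equivalence_classes(table, attributes):
    """Partition the universe into classes of objects that have
    identical values on all of the given attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)
    return list(classes.values())

# Hypothetical toy data table: four visitors described by two attributes.
table = {
    "x1": {"visited_home": "yes", "visited_search": "yes"},
    "x2": {"visited_home": "yes", "visited_search": "yes"},
    "x3": {"visited_home": "no", "visited_search": "yes"},
    "x4": {"visited_home": "no", "visited_search": "no"},
}

classes = equivalence_classes(table, ["visited_home", "visited_search"])
coarser = equivalence_classes(table, ["visited_search"])
```

Using fewer attributes gives a coarser partition: here x1, x2, and x3 become indiscernible when only `visited_search` is considered.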
8
Rough Set (3/5)
Lower and Upper Approximation
– Lower approximation (surely)
– Upper approximation (possibly)
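A minimal sketch of the two approximations in their standard Pawlak form, assuming the universe has already been partitioned into equivalence classes. The lower approximation collects classes that lie entirely inside the target set (objects that surely belong), and the upper approximation collects classes that overlap it (objects that possibly belong). The class contents and the target set are hypothetical.

```python
def lower_approximation(classes, target):
    """Union of the equivalence classes fully contained in target."""
    result = set()
    for c in classes:
        if c <= target:
            result |= c
    return result

def upper_approximation(classes, target):
    """Union of the equivalence classes that intersect target."""
    result = set()
    for c in classes:
        if c & target:
            result |= c
    return result

# Hypothetical partition of a five-object universe and a target set.
classes = [{"x1", "x2"}, {"x3"}, {"x4", "x5"}]
target = {"x1", "x2", "x4"}

lower = lower_approximation(classes, target)  # {"x1", "x2"}
upper = upper_approximation(classes, target)  # {"x1", "x2", "x4", "x5"}
```

The object x4 is in the target, but it cannot be told apart from x5, so its class appears only in the upper approximation.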
9
Rough Set (4/5)
Boundary and Negative region
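The slide's formulas were not preserved. For reference, the standard Pawlak definitions can be written as follows, where the notation is an assumption and may differ from the original slide: U is the universe, R an equivalence relation on U, [x]_R the equivalence class of x, and X a subset of U.

```latex
\begin{align*}
\underline{R}X &= \{\, x \in U : [x]_R \subseteq X \,\}
  && \text{(lower approximation)} \\
\overline{R}X &= \{\, x \in U : [x]_R \cap X \neq \emptyset \,\}
  && \text{(upper approximation)} \\
BN_R(X) &= \overline{R}X \setminus \underline{R}X
  && \text{(boundary region)} \\
NEG_R(X) &= U \setminus \overline{R}X
  && \text{(negative region)}
\end{align*}
```

A set X is rough with respect to R exactly when its boundary region is non-empty.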
10
Rough Set (5/5)
11
Rough Set Approximation (1/7)
– A user transaction is a sequence of items.
– Let there be m users, giving m user transactions.
– Let U be the set of the n distinct clicks (hyperlinks/URLs) clicked by users.
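A minimal sketch of this setup, assuming each transaction is stored as a list of URLs. The transaction identifiers, the URLs, and the function name `distinct_clicks` are hypothetical.

```python
def distinct_clicks(transactions):
    """Return the distinct URLs across all transactions,
    in order of first appearance."""
    seen = []
    for clicks in transactions.values():
        for url in clicks:
            if url not in seen:
                seen.append(url)
    return seen

# Hypothetical transactions for m = 3 users.
transactions = {
    "t1": ["/home", "/courses", "/contact"],
    "t2": ["/home", "/research"],
    "t3": ["/courses", "/research", "/home"],
}

U = distinct_clicks(transactions)
m = len(transactions)  # number of users / transactions
n = len(U)             # number of distinct clicks
```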
12
Rough Set Approximation (2/7)
13
Rough Set Approximation (3/7)
14
Rough Set Approximation (4/7)
15
Rough Set Approximation (5/7)
16
Rough Set Approximation (6/7)
17
Rough Set Approximation (7/7)
18
Experimental Results (1/2)
Log files from www.idrbt.ac.in
– The web site consists of 62 web pages and 283 links.
– The log files record every click that users make.
– The session time is 30 min.
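The slide does not say how the 30 min session time is applied. A minimal sketch under one common reading, assumed here, treats it as an inactivity timeout: a new session starts when the gap between two consecutive clicks of the same user exceeds 30 minutes. The timestamps, URLs, and the function name `split_sessions` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the 30 min figure is an inactivity timeout between clicks.
SESSION_TIMEOUT = timedelta(minutes=30)

def split_sessions(clicks, timeout=SESSION_TIMEOUT):
    """Split one user's time-ordered (timestamp, url) clicks into
    sessions whenever the gap between clicks exceeds the timeout."""
    sessions = []
    current = []
    last_time = None
    for ts, url in clicks:
        if last_time is not None and ts - last_time > timeout:
            sessions.append(current)
            current = []
        current.append(url)
        last_time = ts
    if current:
        sessions.append(current)
    return sessions

# Hypothetical click log for a single user.
clicks = [
    (datetime(2006, 12, 14, 10, 0), "/home"),
    (datetime(2006, 12, 14, 10, 10), "/courses"),
    (datetime(2006, 12, 14, 10, 45), "/research"),  # 35 min gap: new session
    (datetime(2006, 12, 14, 10, 50), "/contact"),
]

sessions = split_sessions(clicks)
```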
19
Experimental Results (2/2)
Steps:
– First, the data is preprocessed and transformed.
– Second, the similarity upper approximation is computed for each transaction.
– Finally, the transactions are clustered using rough approximation (threshold = 0.5).
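The steps above can be sketched as follows. The slides' formulas were not preserved, so this is an illustration under stated assumptions, not the authors' exact procedure: similarity is assumed to be the ratio of shared to total distinct pages (a Jaccard-style measure), and the similarity upper approximation of a transaction is assumed to be grown by repeatedly adding the similarity classes of its members until nothing changes. All function names and the toy transactions are hypothetical.

```python
def similarity(a, b):
    """Assumed measure: shared pages divided by all distinct pages."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_classes(transactions, threshold):
    """For each transaction, the transactions at least threshold-similar to it."""
    return {
        t: {s for s in transactions
            if similarity(transactions[t], transactions[s]) >= threshold}
        for t in transactions
    }

def similarity_upper_approximation(transactions, threshold=0.5):
    """Expand each similarity class with the classes of its members
    until two consecutive expansions are identical."""
    base = similarity_classes(transactions, threshold)
    current = base
    while True:
        expanded = {
            t: set().union(*(base[s] for s in members))
            for t, members in current.items()
        }
        if expanded == current:
            return current
        current = expanded

# Hypothetical transactions; t1 and t3 are linked only through t2.
transactions = {
    "t1": ["a", "b", "c"],
    "t2": ["a", "b", "c", "d"],
    "t3": ["b", "c", "d", "e"],
    "t4": ["x", "y"],
    "t5": ["x", "y", "z"],
}

approx = similarity_upper_approximation(transactions, threshold=0.5)
# Transactions with the same upper approximation form one cluster.
clusters = {frozenset(members) for members in approx.values()}
```

In this toy data t1 and t3 fall below the threshold when compared directly (2 shared pages out of 5), yet they end up in the same cluster because both are similar to t2.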
20
Conclusion
– This paper presented a novel algorithm that uses rough approximation to cluster the web transactions of user accesses.
– The approach is useful for finding interesting user access patterns in web logs.
– The results can help in building adaptive web sites that respond to users' behavior.
21
References
– Zdzisław Pawlak, Jerzy Grzymala-Busse, Roman Slowinski, and Wojciech Ziarko, Rough Sets, Communications of the ACM, Vol. 38, No. 11, November 1995, 88–95.
– Zdzisław Pawlak, Rough Sets (Abstract), 262–264.
– Zdzisław Pawlak, Andrzej Skowron, Rudiments of rough sets, Information Sciences 177 (2007) 3–27.
– Nils Kammenhuber, Julia Luxenburger, Anja Feldmann, Gerhard Weikum, Web Search Clickstreams, IMC'06, October 25–27, 2006.
– A. Jain, Data Clustering: A Review, ACM Computing Surveys, Vol. 31, No. 3, September 1999, 274–275, 281–285.