On Improving Website Connectivity by Using Web-Log Data Streams Edmond HaoCun Wu, Michael KwokPo Ng, Joshua ZheXue Huang From DASFAA 2004, LNCS 2973 Advisor: Jia-Ling Koh Speaker: Chun-Wei Hsieh 07/14/2004
Problem Statement Excessive page requests make Web servers inefficient. Users experience slow connections to the Website and sometimes cannot access it at all. The Website's linkage design affects its access efficiency. Users spend a lot of time searching for content.
Problem Solving Decreasing the number of redundant page requests at peak hours will greatly help improve system performance.
Website Topology Website topology is the structure of a Website.
Website Topology If a user wants to visit {A,E,F} in sequence, the shortest traversal path is {A,C,H,E,H,C,A,B,F}. The access sequence is S = {AC,CH,HE,EH,HC,CA,AB,BF}, where each element is one access from a page to the next page on the path.
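As a small illustration, the access sequence can be derived from the traversal path by pairing each page with its successor. This is a minimal sketch; the function name `access_sequence` is an assumption made for illustration and does not come from the paper.

```python
def access_sequence(path):
    """Return the accesses along a path, each written as source page + target page."""
    # zip(path, path[1:]) pairs every page with the page visited right after it.
    return [src + dst for src, dst in zip(path, path[1:])]


# Traversal path from the slide: visit A, E, F in sequence.
path = ["A", "C", "H", "E", "H", "C", "A", "B", "F"]
S = access_sequence(path)
print(S)  # ['AC', 'CH', 'HE', 'EH', 'HC', 'CA', 'AB', 'BF']
```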
Non-target accesses There are two types of access redundancy:
1. Jump-track access
2. Backtrack access
S = {AC,CH,HE,EH,HC,CA,AB,BF}
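The slide does not give formal definitions of the two types, so the sketch below covers backtrack accesses only. It assumes that a backtrack is an access that returns to the page from which the current page was reached (as with the browser Back button). The function name `backtrack_accesses` and the stack-based detection are illustrative assumptions, not the authors' method.

```python
def backtrack_accesses(path):
    """Return the accesses that step back to the page the user came from.

    Assumption: a backtrack is an access whose target is the page
    immediately below the current page on the forward trail.
    """
    stack = []       # pages on the current forward trail
    backtracks = []  # accesses classified as backtracks
    for page in path:
        if len(stack) >= 2 and stack[-2] == page:
            # Going back: record the access and drop the page being left.
            backtracks.append(stack[-1] + page)
            stack.pop()
        else:
            # Moving forward to a new page on the trail.
            stack.append(page)
    return backtracks


path = ["A", "C", "H", "E", "H", "C", "A", "B", "F"]
print(backtrack_accesses(path))  # ['EH', 'HC', 'CA']
```

Under this assumption, three of the eight accesses in S are backtracks; the remaining accesses are forward moves.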
UAE & SAE UAE: User Access Efficiency. SAE: Server Access Efficiency. S: a user access session; A: the jump-track access session; B: the backtrack access session.
UAE & SAE (2) Given an access session database D, the UAE and SAE of D are calculated as follows:
Mining Access Patterns Three factors:
1. page transitive probability
2. page access frequency
3. page staying time
The new index: Access Interest (AI)
Topology Probability Model h: the number of hyperlinks; w: the weighting parameter; m: the number of Web pages that the shortest path is assumed to pass through.
Counting Page Sequential Accesses Given a user access session dataset D = {S1, S2, …, Sn}, for each S ∈ D, S = {P1, P2, …, Pr}. For a pair of pages, the count is the number of accesses from the first page to the second.
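A minimal sketch of this counting step follows. It assumes that an "access" is a move between two consecutive pages of a session; the function name `count_sequential_accesses` and the three sample sessions are made up for illustration and are not data from the paper.

```python
from collections import Counter


def count_sequential_accesses(sessions):
    """Count, over all sessions, how often each page is followed by another page."""
    counts = Counter()
    for session in sessions:
        # Each consecutive pair (P_i, P_{i+1}) is one access from P_i to P_{i+1}.
        counts.update(zip(session, session[1:]))
    return counts


# Hypothetical session dataset D (illustrative only).
D = [
    ["A", "C", "H", "E"],
    ["A", "C", "H"],
    ["A", "B", "F"],
]
counts = count_sequential_accesses(D)
print(counts[("A", "C")])  # 2
print(counts[("A", "B")])  # 1
```

A `Counter` returns 0 for page pairs that were never observed, so no special handling is needed for unseen transitions.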
The Temporal Factor Two quantities are used: the staying time of visiting a pair of pages, and the average staying time on a page.
Access Interest (AI) can identify frequent but inefficient access patterns. An access pattern with a higher AI value is in greater need of improved access efficiency.
Dynamic Website Connectivity Enhancement
1. Convert Web-log data streams into access sessions.
2. Calculate the access efficiency (UAE and SAE) of current user access patterns. If the access efficiency is lower than the threshold, the monitoring system gives a warning signal. Otherwise, it keeps monitoring.
3. Calculate the AI value of current access patterns, then select the access patterns with high AI values for connectivity enhancement.
4. Recompute the access efficiency of the access patterns. If the access efficiency has increased, the Website connectivity has been improved.
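Step 1 can be sketched as follows. The slides do not say how sessions are delimited, so this example assumes a per-user inactivity timeout of 30 minutes, which is a common convention rather than the paper's rule. The record layout `(user, timestamp_in_seconds, page)`, the function name `sessionize`, and the sample records are likewise hypothetical, and the sketch processes a batch of records rather than a live stream.

```python
from collections import defaultdict


def sessionize(records, timeout=1800):
    """Group (user, timestamp, page) records into per-user access sessions.

    Assumption: a new session starts when a user has been inactive
    for more than `timeout` seconds.
    """
    last_seen = {}                 # user -> timestamp of the previous request
    sessions = defaultdict(list)   # user -> list of sessions (lists of pages)
    for user, ts, page in sorted(records, key=lambda r: r[1]):
        if user not in last_seen or ts - last_seen[user] > timeout:
            sessions[user].append([])   # start a new session for this user
        sessions[user][-1].append(page)
        last_seen[user] = ts
    return dict(sessions)


# Hypothetical log records (illustrative only).
records = [
    ("u1", 0, "A"),
    ("u2", 5, "A"),
    ("u1", 10, "C"),
    ("u2", 20, "B"),
    ("u1", 4000, "A"),
]
result = sessionize(records)
print(result)  # {'u1': [['A', 'C'], ['A']], 'u2': [['A', 'B']]}
```

The resulting sessions are the input to steps 2 to 4, where efficiency and AI values would be computed per access pattern.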
Experiments
Conclusion Proposed new measures for evaluating Website access efficiency. Proposed an efficient method for aggregating access sessions and mining access patterns. Future work: extend the Website optimization model to CRM.