Published by Mustafa Hull. Modified over 9 years ago.
1
Shohreh Kazemi (kazemi@ce.aut.ac.ir), Intelligent Systems Laboratory (http://ce.aut.ac.ir/islab)
Project progress report: Markov model
2
Modeling and Predicting a User's Browsing Behavior
Modeling and predicting a user's browsing behavior on a Web site can be used to:
- improve Web cache performance [1; 2; 3]
- recommend related pages [4; 5]
- improve search engines [6]
- understand and influence buying patterns [7]
- personalize the browsing experience [8]
3
Markov models
Markov models [9] have been used for studying and understanding stochastic processes. They have been shown to be well suited for modeling and predicting a user's browsing behavior on a Web site.
4
Markov models
In general, the input for these problems is the sequence of Web pages accessed by a user. The goal is to build Markov models that can be used to predict the Web page that the user will most likely access next.
5
Markov Models for Predicting Next-Accessed Page
The act of a user browsing a Web site is commonly modeled by observing the set of pages that he or she visits [10]. This set of pages is referred to as a Web session, W = (P1, P2, ..., Pl).
6
Markov Models for Predicting Next-Accessed Page
The next-page prediction problem can be solved using a probabilistic framework as follows. Let W be a user's Web session of length l, and let P(pi | W) be the probability that the user visits page pi next. Then the page p_{l+1} that the user will visit next is given by
$$ p_{l+1} = \arg\max_{p_i} P(p_i \mid W) $$
7
Markov Models for Predicting Next-Accessed Page
Under the Markov assumption, the probability of visiting a page pi does not depend on all the pages in the Web session, but only on a small set of k preceding pages, where k ≪ l. Then we have:
$$ p_{l+1} = \arg\max_{p_i} P(p_i \mid P_l, P_{l-1}, \ldots, P_{l-(k-1)}) $$
8
Markov Models for Predicting Next-Accessed Page
The number of preceding pages k that the next page depends on is called the order of the Markov model, and the resulting model M is called the kth-order Markov model.
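A kth-order model of this kind can be sketched as a table of counts: each state is a tuple of k consecutive pages, and each state stores how often every page followed it. The code below is a minimal illustration, not the presenter's implementation; the function names `train_markov` and `predict_next` and the sessions in `toy_sessions` are hypothetical and are not the training set from the slides.

```python
from collections import Counter, defaultdict

def train_markov(sessions, k):
    """Count, for every state of k consecutive pages, which page came next."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - k):
            state = tuple(session[i:i + k])
            model[state][session[i + k]] += 1
    return dict(model)

def predict_next(model, session, k):
    """Return the most frequent successor of the last k pages, or None
    when the session is too short or the state was never seen in training."""
    if len(session) < k:
        return None
    state = tuple(session[-k:])
    if state not in model:
        return None
    return model[state].most_common(1)[0][0]

# Hypothetical toy sessions (assumed for illustration only).
toy_sessions = [
    ["P1", "P2", "P3"],
    ["P1", "P2", "P4"],
    ["P1", "P2", "P3", "P5"],
    ["P2", "P3", "P5"],
]
first_order = train_markov(toy_sessions, 1)
second_order = train_markov(toy_sessions, 2)

print(predict_next(first_order, ["P4", "P2"], 1))   # state (P2,) was seen
print(predict_next(second_order, ["P4", "P2"], 2))  # state (P4, P2) was not
```

On this toy data the second-order model has no state for (P4, P2), so it makes no prediction where the first-order model still can.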
9
Markov Models for Predicting Next-Accessed Page
Figure: the site map for a sample Web site as a directed graph over the pages P1, P2, P3, P4, and P5.
10
Markov Models for Predicting Next-Accessed Page
A set of Web sessions that were generated on this Web site.
Training set: W1, W2, W3, W4, W5, W6
Test set: Wt1
11
Markov Models for Predicting Next-Accessed Page
Table: the frequencies of different states for first-order Markov models. Rows are the first-order states S(1,1), S(1,2), S(1,3), S(1,4), S(1,5); columns are the state frequency (Fr.) and the pages P1, P2, P3, P4, P5.
Values as extracted (the row and column alignment is unclear): 93711 01200 50000 42011 03000 02000
12
Markov Models for Predicting Next-Accessed Page
Table: the frequencies of different states for second-order Markov models.
13
Markov Models for Predicting Next-Accessed Page
How these models are used to predict the most probable page for Web session Wt1.
14
Performance Measures for Markov Models
- The first is the accuracy of the model: the ratio of the number of Web sessions for which the model correctly predicts the hidden page to the total number of Web sessions in the test set.
- The second is the number of states of the model: the total number of states for which the Markov model has estimated next-page probabilities.
- The third is the coverage of the model: the ratio of the number of Web sessions whose state required for making a prediction was found in the model to the total number of Web sessions in the test set.
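The three measures can be computed together in one pass over a test set. The sketch below assumes that the hidden page is the last page of each test session and that a model is a dictionary from state tuples to next-page counts; `evaluate`, `toy_model`, and `toy_test` are hypothetical names and data, and the test set is assumed to be non-empty.

```python
from collections import Counter

def evaluate(model, k, test_sessions):
    """Return accuracy, coverage and number of states for a kth-order model.

    Both ratios are taken over ALL test sessions, as in the definitions:
    a session that the model cannot cover counts against accuracy too.
    """
    covered = 0
    correct = 0
    for session in test_sessions:
        history, hidden = session[:-1], session[-1]
        state = tuple(history[-k:])
        if len(history) < k or state not in model:
            continue  # no state available, so no prediction is made
        covered += 1
        predicted = model[state].most_common(1)[0][0]
        if predicted == hidden:
            correct += 1
    total = len(test_sessions)
    return {
        "accuracy": correct / total,
        "coverage": covered / total,
        "states": len(model),
    }

# Hypothetical first-order model and test sessions (illustration only).
toy_model = {
    ("P1",): Counter({"P2": 3}),
    ("P2",): Counter({"P3": 3, "P4": 1}),
    ("P3",): Counter({"P5": 2}),
}
toy_test = [
    ["P1", "P2"],        # covered, predicted correctly
    ["P2", "P4"],        # covered, predicted P3, wrong
    ["P5", "P1"],        # state (P5,) is unknown, not covered
    ["P4", "P3", "P5"],  # covered, predicted correctly
]
result = evaluate(toy_model, 1, toy_test)
print(result)
```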
15
Lower-order Markov models
Lower-order Markov models (first or second order) are not successful in accurately predicting the next page to be accessed by the user, because these models do not look far enough into the past.
16
Higher-order Markov models
In order to obtain better predictions, higher-order models must be used. These higher-order models have a number of limitations:
(i) high state-space complexity
(ii) reduced coverage
(iii) sometimes even worse accuracy due to the lower coverage
17
Figure: comparing accuracy, coverage, and model size with the order of the Markov model.
18
All-Kth-Order Markov model
One method to overcome the coverage problem is to train Markov models of varying order and then combine them for prediction [8]. For each test instance, the highest-order Markov model that covers the instance is used for prediction. This scheme is called the All-Kth-Order Markov model. However, it makes the model-size problem worse.
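The fallback rule described here, use the highest-order model that covers the instance, can be sketched as a loop from the highest order down to the first. The names `predict_all_kth` and `toy_models` are hypothetical, and the counts are illustrative rather than taken from the slides.

```python
from collections import Counter

def predict_all_kth(models, session):
    """Try each order from highest to lowest and return (order, page)
    for the first model that contains the required state, else None."""
    for k in sorted(models, reverse=True):
        if len(session) < k:
            continue
        state = tuple(session[-k:])
        if state in models[k]:
            return k, models[k][state].most_common(1)[0][0]
    return None

# Hypothetical models keyed by order (illustration only).
toy_models = {
    1: {
        ("P1",): Counter({"P2": 3}),
        ("P2",): Counter({"P3": 3, "P4": 1}),
        ("P3",): Counter({"P5": 2}),
    },
    2: {
        ("P1", "P2"): Counter({"P3": 2, "P4": 1}),
        ("P2", "P3"): Counter({"P5": 2}),
    },
}

print(predict_all_kth(toy_models, ["P1", "P2"]))  # second-order state exists
print(predict_all_kth(toy_models, ["P4", "P2"]))  # falls back to first order

# Every order's states must be stored, which is the model-size cost.
total_states = sum(len(m) for m in toy_models.values())
print(total_states)
```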
19
Some techniques have been developed to intelligently combine Markov models of different orders. The resulting model:
- has low state complexity
- retains the coverage of the All-Kth-Order Markov model
- achieves comparable accuracies
20
Frequency based
These techniques are based on the observation that states that occur with low frequency in the training set also tend to have low prediction accuracies. These low-frequency states can be eliminated without affecting the accuracy of the resulting model.
21
Frequency based
The amount of pruning is controlled by the parameter Φ, referred to as the frequency threshold. Note that states of the first-order Markov model are never pruned, so pruning does not reduce the coverage of the original model.
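Frequency-based pruning can be sketched as a filter over the states of each model. The slides do not say exactly how a state's frequency is compared with Φ, so the code assumes that a state is kept when its frequency is at least the threshold, and that a state's frequency is the total of its next-page counts; `prune_by_frequency` and `toy_models` are hypothetical.

```python
from collections import Counter

def prune_by_frequency(models, threshold):
    """Drop higher-order states whose frequency is below the threshold.

    First-order states are always kept, so coverage is not reduced.
    Assumption: frequency of a state = sum of its next-page counts.
    """
    pruned = {}
    for k, model in models.items():
        if k == 1:
            pruned[k] = dict(model)
        else:
            pruned[k] = {
                state: counts
                for state, counts in model.items()
                if sum(counts.values()) >= threshold
            }
    return pruned

# Hypothetical models keyed by order (illustration only).
toy_models = {
    1: {("P1",): Counter({"P2": 1})},
    2: {
        ("P1", "P2"): Counter({"P3": 2, "P4": 1}),  # frequency 3
        ("P2", "P3"): Counter({"P5": 1}),           # frequency 1
    },
}

pruned = prune_by_frequency(toy_models, threshold=2)
print(sorted(pruned[2]))  # only the frequent second-order state remains
print(sorted(pruned[1]))  # first-order state kept despite frequency 1
```

With a threshold of 0 nothing is pruned under this assumption, which matches the threshold-0 row of a pruning table being the full model.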
22
Frequency based
Frequency threshold    # states
 0                     126464
 2                     44528
 4                     20914
 6                     14164
 8                     10899
10                     8952
12                     7661
14                     6716
16                     5969
18                     5389
20                     4965
22                     4609
24                     4296
Accuracy, as extracted (only 10 of the 13 values are present, so they cannot be matched to thresholds reliably): 30.24 30.68 31.32 31.56 31.65 31.71 31.74 31.73 31.72 31.67
23
Error based
The final predictions are computed using only the states of the model that have the smallest estimated error rate. The error associated with each state is estimated by a validation step. A higher-order state is pruned by comparing its error rate with the error rates of its lower-order states.
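One plausible form of the validation step is to replay held-out sessions through the model and record, per state, how often its prediction was wrong. The slides do not spell out the procedure, so everything below is an assumption: `estimate_error_rates`, `toy_model`, and `toy_validation` are hypothetical, and every occurrence of a state inside a validation session is counted as one trial.

```python
from collections import Counter, defaultdict

def estimate_error_rates(model, k, validation_sessions):
    """Return {state: fraction of wrong predictions} for states of a
    kth-order model that occur in the validation sessions."""
    trials = defaultdict(int)
    errors = defaultdict(int)
    for session in validation_sessions:
        for i in range(len(session) - k):
            state = tuple(session[i:i + k])
            if state not in model:
                continue
            predicted = model[state].most_common(1)[0][0]
            trials[state] += 1
            if predicted != session[i + k]:
                errors[state] += 1
    return {state: errors[state] / trials[state] for state in trials}

# Hypothetical first-order model and validation sessions (illustration only).
toy_model = {
    ("P2",): Counter({"P3": 3, "P4": 1}),
    ("P3",): Counter({"P5": 2}),
}
toy_validation = [
    ["P2", "P3", "P5"],
    ["P2", "P4"],
    ["P1", "P2", "P3", "P1"],
]
rates = estimate_error_rates(toy_model, 1, toy_validation)
print(rates)
```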
24
Error based
For example, to prune the state S(3,q) (Pi, Pj, Pk), its error rate is compared with the error rates of the states S(2,r) (Pj, Pk) and S(1,s) (Pk); the state S(3,q) is pruned if its error rate is higher than that of any of them.
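The comparison in this example can be sketched as a check of each state against its shorter suffix states. The code reads "higher than any of them" as "higher than at least one lower-order state", which is an interpretation, and it compares against the original error rates rather than cascading; `prune_by_error` and the error rates in `toy_errors` are hypothetical values, not results from the slides.

```python
def prune_by_error(error_rates):
    """Keep a state unless its error rate exceeds that of at least one
    of its lower-order (suffix) states present in error_rates."""
    kept = {}
    for state, err in error_rates.items():
        suffix_errors = [
            error_rates[state[i:]]
            for i in range(1, len(state))
            if state[i:] in error_rates
        ]
        # First-order states have no suffixes, so they are always kept.
        if not any(err > lower for lower in suffix_errors):
            kept[state] = err
    return kept

# Hypothetical error rates from a validation step (illustration only).
toy_errors = {
    ("Pk",): 0.40,
    ("Pj", "Pk"): 0.30,
    ("Pi", "Pj", "Pk"): 0.35,  # worse than (Pj, Pk), so pruned
    ("Pm", "Pj", "Pk"): 0.25,  # better than both suffix states, so kept
}
kept = prune_by_error(toy_errors)
print(sorted(kept))
```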
25
Figure: training and validation Web sessions.
26
Figure: Markov states of various orders with their maximum-frequency page.
27
Figure: error rates for Markov states.
29
References
[1] SCHECHTER, S., KRISHNAN, M., AND SMITH, M. D. 1998. Using path profiles to predict HTTP requests. In 7th International World Wide Web Conference.
[2] BESTAVROS, A. 1995. Using speculation to reduce server load and service time on the WWW. In Proceedings of the 4th ACM International Conference on Information and Knowledge Management. ACM Press.
[3] PADMANABHAN, V. AND MOGUL, J. 1996. Using predictive prefetching to improve World Wide Web latency. Comput. Commun. Rev.
[4] DEAN, J. AND HENZINGER, M. R. 1999. Finding related pages in the World Wide Web. In Proceedings of the 8th International World Wide Web Conference.
[5] PIROLLI, P., PITKOW, J., AND RAO, R. 1996. Silk from a sow's ear: Extracting usable structures from the web. In Proceedings of ACM Conference on Human Factors in Computing Systems (CHI-96).
30
[6] BRIN, S. AND PAGE, L. 1998. The anatomy of a large-scale hypertextual web search engine. In Proceedings of the 7th International World Wide Web Conference.
[7] CHI, E., PITKOW, J., MACKINLAY, J., PIROLLI, P., GOSSWEILER, R., AND CARD, S. 1998. Visualizing the evolution of web ecologies. In Proceedings of ACM Conference on Human Factors in Computing Systems (CHI 98).
[8] PITKOW, J. AND PIROLLI, P. 1999. Mining longest repeating subsequences to predict World Wide Web surfing. In 2nd USENIX Symposium on Internet Technologies and Systems. Boulder, CO.
[9] PAPOULIS, A. 1991. Probability, Random Variables, and Stochastic Processes. McGraw Hill.
[10] SRIVASTAVA, J., COOLEY, R., DESHPANDE, M., AND TAN, P.-N. 2000. Web usage mining: Discovery and applications of usage patterns from web data. SIGKDD Explor. 1, 2.