Published by Lilian Norris. Modified over 9 years ago.
1
Web Markov Skeleton Processes and their Applications Zhi-Ming Ma 18 April, 2011, BNU. Email: mazm@amt.ac.cn http://www.amt.ac.cn/member/mazhiming/index.html
2
Y. Liu, Z. M. Ma, C. Zhou: Web Markov Skeleton Processes and Their Applications, to appear in Tohoku Math J. Y. Liu, Z. M. Ma, C. Zhou: Further Study on Web Markov Skeleton Processes
3
Web Markov Skeleton Process Markov Chain conditionally independent given
4
Define by : WMSP
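The formulas on this slide did not survive extraction. As a rough illustration only, the sketch below simulates a jump process built from a Markov chain skeleton and holding times that are drawn independently once the whole chain is fixed. The finite state space, the function names `simulate_wmsp` and `value_at`, and the exponential holding-time law in the usage lines are all assumptions made for the example, not taken from the slides.

```python
import bisect
import random


def simulate_wmsp(P, sample_holding, x0, n_jumps, rng):
    """Simulate a skeleton chain and its jump times.

    P              -- row-stochastic transition matrix of the skeleton chain
    sample_holding -- hypothetical callback (states, n, rng) -> holding time Y_n > 0;
                      it may look at the whole chain, not only at states[n]
    Returns (times, states) with Z_t = states[n] for times[n] <= t < times[n + 1].
    """
    # Step 1: generate the whole skeleton chain X_0, ..., X_{n_jumps}.
    states = [x0]
    for _ in range(n_jumps):
        row = P[states[-1]]
        states.append(rng.choices(range(len(row)), weights=row)[0])
    # Step 2: given the chain, draw the holding times one by one,
    # each from its own law, independently of the other holding times.
    times = [0.0]
    for n in range(n_jumps):
        times.append(times[-1] + sample_holding(states, n, rng))
    return times, states


def value_at(times, states, t):
    """Evaluate the right-continuous, piecewise constant path at time t >= 0.

    For t beyond the last simulated jump, the last simulated state is returned.
    """
    return states[bisect.bisect_right(times, t) - 1]


# Usage with an assumed holding law: exponential, rate depending on the current state.
rates = [1.0, 2.0]
chain = [[0.0, 1.0], [0.5, 0.5]]
times, states = simulate_wmsp(
    chain, lambda s, n, r: r.expovariate(rates[s[n]]), 0, 10, random.Random(0)
)
```

Letting `sample_holding` depend on neighbouring states (for example `states[n - 1]` or `states[n + 1]`) gives holding laws that look backward or forward along the chain; which dependence the slides intend is not recoverable from the extracted text.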
5
Simple WMSP: Many simple WMSPs are Non-Markov Processes
6
[LMZ2011a,b]
7
Mirror Semi-Markov Process A mirror semi-Markov process is not a Markov skeleton process in the sense of Hou, i.e., it does not satisfy
8
Time Homogeneous WMSP
9
right continuous, piecewise constant functions
10
Stability of Time homogeneous WMSP Theorem [LMZ 2011a,b] for all
11
WMSP Multivariate Point Process associated with WMSP
14
Let
16
Consequently where Define We can prove that
17
where
18
Why is it called a Web Markov Skeleton Process?
21
How can Google rank 1,950,000 pages in 0.19 seconds?
22
Web page Ranking Importance Ranking Relevance Ranking
23
HITS (1998): Jon Kleinberg, Cornell University
PageRank (1998): Sergey Brin and Larry Page, Stanford University
The first major improvement in the history of Web search engines
科学时报.pdf
24
Ranking Web pages by the mean frequency of visiting pages. From a probabilistic point of view, PageRank is the stationary distribution of a Markov chain. PageRank: a ranking algorithm used by the Google search engine (1998, Sergey Brin and Larry Page, Stanford University).
25
Markov chain describing surfing behavior
26
Markov chain describing surfing behavior
27
Web surfers usually have two basic ways to access web pages:
1. With probability α, they visit a web page by clicking a hyperlink.
2. With probability 1-α, they visit a web page by typing its URL.
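The two-way surfing model can be sketched as a transition matrix. The slides state only the split between α and 1-α. The uniform choice among out-links, the uniform choice among all pages when typing a URL, the treatment of pages without out-links, and the function name `surfer_transition_matrix` are assumptions made for this example.

```python
def surfer_transition_matrix(links, alpha):
    """Build the row-stochastic matrix of the two-way surfing model.

    links[i] -- list of page indices that page i links to
    alpha    -- probability of following a hyperlink
    """
    n = len(links)
    matrix = []
    for out in links:
        if out:
            # With probability 1 - alpha: type a URL (assumed uniform over all pages).
            row = [(1.0 - alpha) / n] * n
            # With probability alpha: click one of the out-links (assumed uniform).
            for j in out:
                row[j] += alpha / len(out)
        else:
            # Assumption: a page without out-links always leads to a typed URL.
            row = [1.0 / n] * n
        matrix.append(row)
    return matrix


# Three pages: 0 links to 1 and 2, 1 links to 2, 2 links to 0.
P = surfer_transition_matrix([[1, 2], [2], [0]], alpha=0.8)
```

Every row sums to 1 and every entry is positive whenever α < 1, so the resulting chain is irreducible and aperiodic.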
28
where
29
More generally, we may consider a personalized d:
PageRank is defined as the stationary distribution:
By the strong ergodic theorem: mean frequency of visiting pages
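The slide's formulas are missing, so the following is only a sketch of the idea under stated assumptions: `d` is read as a probability vector over pages that replaces the uniform URL choice, the stationary distribution is computed by power iteration, and a simulated surfer's visit frequencies are compared with it. The names `personalized_matrix`, `stationary_distribution` and `visit_frequencies`, and the example numbers, are hypothetical.

```python
import random


def personalized_matrix(H, d, alpha):
    """Mix a row-stochastic hyperlink matrix H with a personalized vector d."""
    n = len(d)
    return [[alpha * H[i][j] + (1.0 - alpha) * d[j] for j in range(n)]
            for i in range(n)]


def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Power iteration: repeat pi <- pi P until the change is below tol."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def visit_frequencies(P, steps, rng, start=0):
    """Fraction of time steps a simulated surfer spends on each page."""
    n = len(P)
    counts = [0] * n
    state = start
    for _ in range(steps):
        counts[state] += 1
        state = rng.choices(range(n), weights=P[state])[0]
    return [c / steps for c in counts]


H = [[0.0, 0.5, 0.5], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
d = [0.5, 0.25, 0.25]
P = personalized_matrix(H, d, alpha=0.85)
pi = stationary_distribution(P)
freq = visit_frequencies(P, 20_000, random.Random(1))
```

For a long enough run, `freq` should be close to `pi`, which is the sense in which PageRank measures the mean frequency of visits.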
30
Weak points of PageRank
– Uses only the static web graph structure.
– Reflects only the will of web managers and ignores the will of users, e.g. the staying time of users on a web page.
– Cannot effectively combat spam and junk pages.
BrowseRankSIGIR.ppt
31
Data Mining
32
Browsing Process Markov property Time-homogeneity
36
Computation of the Stationary Distribution
–Stationary distribution:
– is the mean of the staying time on page i. The more important a page is, the longer the staying time on it.
– is the mean of the first re-visit time at page i. The more important a page is, the shorter the re-visit time and the higher the visit frequency.
37
Computation of the Stationary Distribution
Properties of Q process:
–Jumping probability is conditionally independent of jumping time:
–Embedded Markov chain: is a Markov chain with the transition probability matrix
38
Computation of the Stationary Distribution
– is the stationary distribution of
–The stationary distribution of the discrete model is easy to compute
Power method for
Log data for
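The symbols on these slides were lost, so the link between the discrete and the continuous-time model is sketched here from a standard fact about continuous-time Markov chains rather than from the slide itself: each page's stationary weight is proportional to the embedded chain's stationary probability times the mean staying time on that page. The function name `continuous_time_stationary` and the toy numbers are assumptions for the example.

```python
def continuous_time_stationary(embedded_pi, mean_stay):
    """Combine the embedded chain's stationary distribution with mean staying times.

    embedded_pi -- stationary distribution of the embedded (jump) chain,
                   e.g. obtained with the power method
    mean_stay   -- mean staying time on each page, e.g. estimated from log data
    Returns the stationary distribution of the continuous-time process.
    """
    weights = [p * m for p, m in zip(embedded_pi, mean_stay)]
    total = sum(weights)
    return [w / total for w in weights]


# Two pages visited equally often, but users stay three times longer on page 1.
pi = continuous_time_stationary([0.5, 0.5], [1.0, 3.0])
```

In this toy case the longer staying time moves page 1 from half of the visits to three quarters of the total time, which matches the remark that a more important page has a longer staying time.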
40
BrowseRank: Letting Web Users Vote for Page Importance. Yuting Liu, Bin Gao, Tie-Yan Liu, Ying Zhang, Zhiming Ma, Shuyuan He, and Hang Li. July 23, 2008, Singapore, the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Best student paper!
46
BrowseRank, the next PageRank, says Microsoft. jerbrowser.wmv
47
Browsing Processes will be a Basic Mathematical Tool in Internet Information Retrieval
Beyond:
--General framework of Browsing Processes?
--How about inhomogeneous processes?
--Marked point process
--Mobile Web: not really Markovian
48
ExtBrowseRank and semi-Markov processes
52
Web Markov Skeleton Process
[10] B. Gao, T. Liu, Z. M. Ma, T. Wang, and H. Li: A general Markov framework for page importance computation, in Proceedings of CIKM 2009.
[11] B. Gao, T. Liu, Y. Liu, T. Wang, Z. M. Ma, and H. Li: Page Importance Computation based on Markov Processes, to appear in Information Retrieval, online first: http://www.springerlink.com/content/7mr7526x21671131
55
Thank you !
56
The statistical properties of a time-homogeneous mirror semi-Markov process are completely determined by:
57
Reconstruction of Mirror Semi-Markov Processes Theorem [LMZ 2011b] Given: We can construct such that
58
uniformly
59
Write is expressed as [LMZ2011b]