Web Markov Skeleton Processes and their Applications Zhi-Ming Ma 18 April, 2011, BNU.


1 Web Markov Skeleton Processes and their Applications Zhi-Ming Ma 18 April, 2011, BNU. Email: mazm@amt.ac.cn http://www.amt.ac.cn/member/mazhiming/index.html

2 Y. Liu, Z. M. Ma, C. Zhou: Web Markov Skeleton Processes and Their Applications, to appear in Tohoku Math J. Y. Liu, Z. M. Ma, C. Zhou: Further Study on Web Markov Skeleton Processes

3 Web Markov Skeleton Process Markov Chain conditionally independent given

4 Define by : WMSP

5 Simple WMSP: Many simple WMSPs are Non-Markov Processes

6 [LMZ2011a,b]

7 Mirror Semi-Markov Process: A mirror semi-Markov process is not a Markov skeleton process in Hou’s sense, i.e. it does not satisfy

8 Time Homogeneous WMSP

9 right continuous, piecewise constant functions

10 Stability of Time homogeneous WMSP Theorem [LMZ 2011a,b] for all

11 WMSP Multivariate Point Process associated with WMSP

12

13

14 Let

15

16 Consequently where Define We can prove that

17 where

18 Why is it called a Web Markov Skeleton Process?

19

20

21 How can Google rank 1,950,000 pages in 0.19 seconds?

22 Web page Ranking Importance Ranking Relevance Ranking

23 HITS (1998): Jon Kleinberg, Cornell University. PageRank (1998): Sergey Brin and Larry Page, Stanford University. The first major improvement in the history of Web search engines. 科学时报.pdf

24 Ranking Web pages by the mean frequency of visiting pages. PageRank is a ranking algorithm used by the Google search engine (1998, Sergey Brin and Larry Page, Stanford University). From a probabilistic point of view, PageRank is the stationary distribution of a Markov chain.

25 Markov chain describing surfing behavior

26 Markov chain describing surfing behavior

27 Web surfers usually have two basic ways to access web pages: 1. With probability α, they visit a web page by clicking a hyperlink. 2. With probability 1-α, they visit a web page by entering its URL.
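The two-way surfing model above can be sketched in code. This is a minimal illustration, not the presenter's implementation: the adjacency-list input format, the function names, the default α = 0.85, the uniform choice among out-links, the uniform URL jump, and the handling of pages without out-links are all assumptions made here.

```python
def google_matrix(links, alpha=0.85):
    """Row-stochastic transition matrix of the random surfer.

    links[i] lists the pages that page i links to (hypothetical input format).
    """
    n = len(links)
    matrix = []
    for out in links:
        if out:
            # With probability 1 - alpha the surfer types a URL (assumed uniform over pages).
            row = [(1.0 - alpha) / n] * n
            # With probability alpha the surfer clicks one of the out-links (assumed uniform).
            for j in out:
                row[j] += alpha / len(out)
        else:
            # Assumption: a page without out-links always jumps to a uniformly chosen page.
            row = [1.0 / n] * n
        matrix.append(row)
    return matrix


def stationary_distribution(matrix, tol=1e-12, max_iter=10_000):
    """Power method: iterate pi <- pi P until the L1 change drops below tol."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


links = [[1, 2], [2], [0], [2]]  # toy 4-page web graph, made up for illustration
P = google_matrix(links, alpha=0.85)
pagerank = stationary_distribution(P)
```

In this toy graph no page links to page 3, so it is reached only through the URL jump, while page 2 collects links from three pages and ends up with the largest score.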

28 where

29 More generally, we may consider a personalized distribution. PageRank is defined as the stationary distribution. By the strong ergodic theorem, it equals the mean frequency of visiting pages.
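The ergodic-theorem statement can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the 3-state transition matrix is made up, the function names are hypothetical, and the comparison uses a finite simulation, so the two vectors agree only approximately.

```python
import random


def visit_frequencies(matrix, steps, seed=0):
    """Simulate the chain and return the fraction of steps spent in each state."""
    rng = random.Random(seed)
    n = len(matrix)
    counts = [0] * n
    state = 0
    for _ in range(steps):
        # Draw the next state according to the current row of the transition matrix.
        state = rng.choices(range(n), weights=matrix[state])[0]
        counts[state] += 1
    return [c / steps for c in counts]


def stationary_distribution(matrix, iterations=500):
    """Power method: repeatedly apply pi <- pi P starting from the uniform vector."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi


# Toy 3-state row-stochastic matrix (made-up numbers, for illustration only).
P = [
    [0.1, 0.6, 0.3],
    [0.4, 0.2, 0.4],
    [0.5, 0.3, 0.2],
]
freq = visit_frequencies(P, steps=100_000)
pi = stationary_distribution(P)
```

With a long enough run, each entry of `freq` should be close to the corresponding entry of `pi`, which is the sense in which the stationary distribution is a mean visiting frequency.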

30 Weak points of PageRank: It uses only the static web graph structure. It reflects only the will of web managers and ignores the will of users, e.g. the staying time of users on a web page. It cannot effectively combat spam and junk pages. BrowseRankSIGIR.ppt

31 Data Mining

32 Browsing Process Markov property Time-homogeneity

33

34

35

36 Computation of the Stationary Distribution –Stationary distribution: – is the mean of the staying time on page i. The more important a page is, the longer the staying time on it. – is the mean of the first re-visit time at page i. The more important a page is, the smaller the re-visit time and the larger the visit frequency.
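The slide's formulas were not preserved in this transcript. One standard way to write the relation it describes is given below; the symbols are assumptions introduced here, with T_i the staying time on page i and τ_i the re-visit time measured from one arrival at page i to the next arrival at page i.

```latex
\pi_i \;=\; \frac{\mathbb{E}[T_i]}{\mathbb{E}[\tau_i]}
```

Under this convention, a longer mean staying time or a shorter mean re-visit time both increase the stationary probability of page i, which matches the two remarks on the slide.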

37 Computation of the Stationary Distribution. Properties of Q process: –Jumping probability is conditionally independent of jumping time: –Embedded Markov chain: is a Markov chain with the transition probability matrix

38 Computation of the Stationary Distribution – is the stationary distribution of –The stationary distribution of the discrete model is easy to compute: Power method for Log data for
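The computation outlined on these slides can be sketched as follows. This is a hedged illustration rather than the BrowseRank implementation: it assumes the embedded chain's transition matrix and the mean staying time on each page are already available (for example, estimated from browsing logs), uses made-up numbers, and relies on the standard fact that the stationary distribution of a continuous-time Markov process is proportional to the embedded chain's stationary distribution weighted by mean staying times. All names are hypothetical.

```python
def embedded_stationary(P, iterations=1000):
    """Power method for the stationary distribution of the embedded (jump) chain."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def continuous_time_stationary(P, mean_stay):
    """Weight the embedded chain's distribution by mean staying times, then normalize."""
    pi_tilde = embedded_stationary(P)
    weights = [p * t for p, t in zip(pi_tilde, mean_stay)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy embedded chain over 3 pages (made-up jump probabilities, no self-loops).
P = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]
mean_stay = [10.0, 2.0, 4.0]  # hypothetical mean staying times per page

pi_tilde = embedded_stationary(P)
pi = continuous_time_stationary(P, mean_stay)
```

In this example page 0 is both visited most often by the embedded chain and held longest, so it dominates the final distribution; page 1 is visited more often than page 2 but ranks below it because its staying time is shorter.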

39

40 BrowseRank: Letting Web Users Vote for Page Importance. Yuting Liu, Bin Gao, Tie-Yan Liu, Ying Zhang, Zhiming Ma, Shuyuan He, and Hang Li. July 23, 2008, Singapore, the 31st Annual International ACM SIGIR Conference on Research & Development in Information Retrieval. Best student paper!

41

42

43

44

45

46 BrowseRank the next PageRank, says Microsoft. jerbrows er.wmv

47 Browsing Processes will be a Basic Mathematical Tool in Internet Information Retrieval. Beyond: --General framework of Browsing Processes? --How about inhomogeneous processes? --Marked point process --Mobile Web: not really Markovian

48 ExtBrowseRank and semi-Markov processes

49

50

51

52 [10] B. Gao, T. Liu, Z. M. Ma, T. Wang, and H. Li, A general Markov framework for page importance computation, in Proceedings of CIKM 2009. [11] B. Gao, T. Liu, Y. Liu, T. Wang, Z. M. Ma, and H. Li, Page Importance Computation Based on Markov Processes, to appear in Information Retrieval, online first: http://www.springerlink.com/content/7mr7526x21671131 Web Markov Skeleton Process

53

54

55 Thank you !

56 The statistical properties of a time homogeneous mirror semi-Markov process are completely determined by:

57 Reconstruction of Mirror Semi-Markov Processes. Theorem [LMZ 2011b] Given: , , We can construct such that

58 uniformly

59 Write is expressed as [LMZ2011b]

60

61

