DTMC Applications: Ranking Web Pages & Slotted ALOHA
TELE4642: Week 11
Outline
Apply the theory of discrete-time Markov chains (DTMCs):
- Google’s ranking of web-pages
  - What page is the user most likely searching for?
  - Formulate web-graph as a Markov chain
  - Does steady-state exist?
  - Does a user randomly walk the web-graph?
  - Can search results be improved further?
- Slotted ALOHA medium access control protocol
  - Is the protocol stable for a large number of nodes?
  - How should the retransmission probability be chosen?
Network Performance
Ranking of Web-pages
- Problem: how should a search engine rank web-pages?
- Idea: rank pages based on number of in-links (citations)
  - Weakness: not all in-links are equal
- Google’s idea: a page has high rank if the sum of the ranks of its in-link pages is high
  - Formulate moves between web-pages as a Markov chain
  - Solve to obtain the steady-state probability of each state
  - State probability is proportional to importance of page
- Example with three web-pages: N, M, A [figure]
Markov Model of the Web
[Figure: example web-graph with nodes 1 to 5]
- Issue 1: how to choose transition probabilities?
  - Assumption: each link is equally likely to be clicked
  - Can accommodate non-uniform probabilities if such information is available
- Issue 2: some rows are zero (dead ends)
  - Assumption: on reaching a dead end, restart at any state: $P' = P + r v^T$
  - r is an Nx1 column vector whose i-th row is non-zero (equal to 1) for dead-end nodes
  - v is an Nx1 column vector whose entries add to 1
    - could all be 1/N (uniform)
    - could be different from uniform (i.e. personalized)
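Markov Model of the Web (contd.)
[Figure: example web-graph with nodes 1 to 5]
- Issue 3: the transition probability matrix may still lack a unique steady state (the chain can be reducible or periodic)
- Solution: inter-connect all nodes: $P'' = \alpha P' + (1-\alpha)\, u v^T$, with $P' = P + r v^T$ the dead-end-corrected matrix, where
  - u is an Nx1 column vector with all entries 1
  - α is a number between 0 and 1 (“tax” on “importance”)
- For a given α and v, the very sparse initial matrix P becomes the dense matrix P''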
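To make the construction concrete, here is a minimal sketch that builds the modified matrix for a tiny graph, assuming it has the form α(P + r vᵀ) + (1 − α) u vᵀ. The 5-page link structure, the function name `google_matrix`, and the choice of uniform v are illustrative assumptions, not the graph drawn on the slide; α = 0.85 is the value the slides attribute to Brin and Page.

```python
# Hypothetical 5-page web-graph (an assumption for illustration).
# links[i] lists the pages that page i links to; page 4 is a dead end.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2, 4], 4: []}
N = len(links)
alpha = 0.85           # "tax" parameter
v = [1.0 / N] * N      # uniform restart vector (could be personalized)


def google_matrix(links, alpha, v):
    """Return alpha * (P + r v^T) + (1 - alpha) * u v^T as a list of rows."""
    n = len(links)
    matrix = []
    for i in range(n):
        out = links[i]
        if out:
            # Issue 1: every out-link is equally likely to be clicked.
            row = [0.0] * n
            for j in out:
                row[j] += 1.0 / len(out)
        else:
            # Issue 2: a dead end restarts according to v (the r v^T term).
            row = list(v)
        # Issue 3: mix in the restart vector so every entry is positive.
        matrix.append([alpha * row[j] + (1 - alpha) * v[j] for j in range(n)])
    return matrix


G = google_matrix(links, alpha, v)
for row in G:
    print(" ".join("%.3f" % x for x in row))
```

Every row of the result sums to 1 and has no zero entries, which is what makes the chain irreducible and aperiodic. The price is that all N² entries are now non-zero, so forming this matrix explicitly is only feasible for toy graphs.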
Computing the page rank
- Issue 4: computing the steady-state vector π involves solving a billion+ equations!
- Instead take powers of the modified (dense) transition matrix
- Iterative procedure: $\pi_{k+1}^T = \alpha\, \pi_k^T P + (\alpha\, \pi_k^T r + 1 - \alpha)\, v^T$
  - No matrix-matrix multiplication, work with only one vector
  - Multiplication with the sparse matrix P only; the dense matrix is never formed
- Convergence depends on the parameter α
- What should α be set to?
  - Small α allows faster convergence (why?)
  - Large α better preserves the true nature of the web-graph (why?)
- Brin and Page [Google] claim that α=0.85 works well
  - only 50 to 100 iterations are required for convergence
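The iterative procedure can be sketched as follows, assuming the update π ← α πᵀP + (α πᵀr + 1 − α) vᵀ, which only ever touches the sparse link lists. The graph, the function name `pagerank`, the L1 stopping rule, and the tolerance are illustrative assumptions.

```python
# Hypothetical 5-page web-graph (an assumption); page 4 is a dead end.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2, 4], 4: []}


def pagerank(links, alpha=0.85, v=None, tol=1e-10, max_iter=1000):
    """Power iteration that never forms the dense matrix.

    Returns (rank vector, number of iterations used).
    Page indices are assumed to be 0..n-1.
    """
    n = len(links)
    if v is None:
        v = [1.0 / n] * n
    pi = list(v)                          # start from the restart vector
    for iteration in range(1, max_iter + 1):
        new = [0.0] * n
        dead_mass = 0.0
        for i, out in links.items():
            if out:
                share = pi[i] / len(out)
                for j in out:
                    new[j] += share       # sparse product pi^T P
            else:
                dead_mass += pi[i]        # accumulates pi^T r
        scale = alpha * dead_mass + (1 - alpha)
        new = [alpha * new[j] + scale * v[j] for j in range(n)]
        change = sum(abs(a - b) for a, b in zip(new, pi))
        pi = new
        if change < tol:
            return pi, iteration
    return pi, max_iter


ranks, iterations = pagerank(links)
print("ranks:", ["%.4f" % x for x in ranks], "iterations:", iterations)
for a in (0.5, 0.85, 0.99):
    print("alpha =", a, "->", pagerank(links, alpha=a)[1], "iterations")
```

Running the loop for several values of α gives a feel for the trade-off on the slide: the iteration count grows as α approaches 1, while a small α pushes every rank toward the restart vector v.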
Discussion
[Figure: example web-graph with nodes 1 to 5]
- Basic idea:
  - Random walk on the web-graph
  - The more often you visit a node, the more “popular” the page
- Does your model of the walk path match real user behavior?
  - Instead of connecting every node to every other node (“tax”), create a dummy node to which all other nodes are connected and that connects to all nodes; this alters the true web-graph less.
  - At a dead end, the user often hits the “back” button; so bias the transition probability towards predecessor pages.
- How to increase the ranking of your web-page?
  - Create replicas of your page?
  - Create many “dummy” web-pages that point to your page?
  - Make your web-pages link to each other?
- Further reading:
  - “The PageRank Citation Ranking: Bringing Order to the Web”, 1999
  - “Random Walks with Back Buttons”, 2000
  - “Deeper inside PageRank”, 2004
Slotted Aloha
- N nodes, time-slotted system, equal-size packets
- The probability of a new packet arrival in a slot at any given node is $p_a$, and the new packet is transmitted immediately
- A collision happens if more than one node transmits in the same slot; it is detected by all nodes at the end of the slot
- After a collision, each backlogged node retries in every slot with probability $p_r$ until its transmission succeeds
- No queueing: new arrivals to a backlogged node are dropped
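Slotted Aloha: Markov chain
- State: number of backlogged nodes m = 0,…,N
- Probability that i backlogged nodes transmit in a slot is $Q_r(i,m) = \binom{m}{i} p_r^i (1-p_r)^{m-i}$
- Probability that j non-backlogged nodes transmit in a slot is $Q_a(j,m) = \binom{N-m}{j} p_a^j (1-p_a)^{N-m-j}$
- Markov chain (transition probabilities out of state m):
  - $P_{m,m+j} = Q_a(j,m)$ for $2 \le j \le N-m$
  - $P_{m,m+1} = Q_a(1,m)\,[1 - Q_r(0,m)]$
  - $P_{m,m} = Q_a(1,m)\,Q_r(0,m) + Q_a(0,m)\,[1 - Q_r(1,m)]$
  - $P_{m,m-1} = Q_a(0,m)\,Q_r(1,m)$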
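As a sanity check on the chain, the sketch below builds the (N+1)×(N+1) transition matrix from the model on the previous slide, assuming binomial transmission counts for backlogged and non-backlogged nodes. The names `binom_pmf` and `aloha_transition_matrix` and the parameter values are illustrative assumptions.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability that exactly k of n independent nodes transmit."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def aloha_transition_matrix(N, pa, pr):
    """Transition matrix over the backlog m = 0..N for slotted Aloha."""
    P = [[0.0] * (N + 1) for _ in range(N + 1)]
    for m in range(N + 1):
        qr0 = binom_pmf(0, m, pr)        # no backlogged node retries
        qr1 = binom_pmf(1, m, pr)        # exactly one backlogged node retries
        qa0 = binom_pmf(0, N - m, pa)    # no new arrival
        qa1 = binom_pmf(1, N - m, pa)    # exactly one new arrival
        if m >= 1:
            P[m][m - 1] = qa0 * qr1      # a lone retransmission succeeds
        # Backlog unchanged: a lone new packet succeeds, or there is no
        # arrival and the retransmissions are idle or collide.
        P[m][m] = qa1 * qr0 + qa0 * (1 - qr1)
        if m + 1 <= N:
            P[m][m + 1] = qa1 * (1 - qr0)   # new packet collides with a retry
        for j in range(2, N - m + 1):
            P[m][m + j] = binom_pmf(j, N - m, pa)   # j new packets collide
    return P


P = aloha_transition_matrix(10, 0.05, 0.2)
print("row sums:", ["%.6f" % sum(row) for row in P])
```

Two structural features are worth noticing in the output: the backlog can fall by at most one per slot, and state 0 can never move to state 1, because a single new packet with no competitors always succeeds.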
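Slotted Aloha: Efficiency
- Probability of successful transmission in state m (exactly one node transmits):
  $P_{succ}(m) = (N-m)\,p_a (1-p_a)^{N-m-1} (1-p_r)^m + (1-p_a)^{N-m}\, m\, p_r (1-p_r)^{m-1}$
- For small $p_a$ and $p_r$, and using $(1-x)^y \approx e^{-xy}$ for small x:
  $P_{succ}(m) \approx [(N-m)p_a + m p_r]\, e^{-(N-m)p_a - m p_r}$
- Let $G(m) = (N-m)p_a + m p_r$ be the transmission attempt rate in state m; the throughput (successful transmissions per slot) is $P_{succ}(m) \approx G(m)\, e^{-G(m)}$
- Throughput is maximized at G(m)=1, since $\frac{d}{dG}\, G e^{-G} = (1-G)\,e^{-G}$ vanishes there
- Max. throughput = 1/e ≈ 36.8%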
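The quality of the G e^{−G} approximation can be checked numerically. The sketch below compares the exact success probability (exactly one transmitter in the slot) with the approximation, taking G(m) = (N − m) p_a + m p_r as the attempt rate. The function names and the parameter values N = 100, p_a = 0.002, p_r = 0.02 are assumptions chosen for illustration.

```python
from math import exp


def p_success(m, N, pa, pr):
    """Exact probability that exactly one node transmits in state m."""
    fresh = N - m                       # number of non-backlogged nodes
    one_new = 0.0
    if fresh > 0:
        # one new arrival, no retransmission
        one_new = fresh * pa * (1 - pa) ** (fresh - 1) * (1 - pr) ** m
    one_old = 0.0
    if m > 0:
        # no new arrival, exactly one retransmission
        one_old = (1 - pa) ** fresh * m * pr * (1 - pr) ** (m - 1)
    return one_new + one_old


def attempt_rate(m, N, pa, pr):
    """Expected number of transmission attempts per slot in state m."""
    return (N - m) * pa + m * pr


def approx_throughput(G):
    """Small-probability approximation of the success probability."""
    return G * exp(-G)


N, pa, pr = 100, 0.002, 0.02            # illustrative values (assumption)
for m in (0, 20, 40, 60, 80):
    G = attempt_rate(m, N, pa, pr)
    print("m=%3d  G=%.2f  exact=%.4f  approx=%.4f"
          % (m, G, p_success(m, N, pa, pr), approx_throughput(G)))
```

With these values the attempt rate crosses 1 between m = 40 and m = 60, and the throughput on both sides of that point is below the 1/e peak.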
Slotted Aloha: Instability
- Does slotted Aloha work when N is large?
- Given you are in state m, what is the probability of moving backwards (i.e. to a state < m)?
  - Only $m \to m-1$ is possible, with probability $(1-p_a)^{N-m}\, m\, p_r (1-p_r)^{m-1}$, which tends to 0 as m grows for fixed $p_r$
- Stated another way, when the number of backlogged nodes is large enough, the average attempt rate G(m) becomes > 1, i.e. there are excessive collisions and the state keeps growing
- Potential solution: ensure the attempt rate G(m) < 1
  - How? Make the retransmission probability dependent on the state
  - E.g. exponential backoff
- Price for making the retransmission probability too small: large delay
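The instability argument can be illustrated with the one-slot drift, i.e. the expected change in backlog, which equals the expected number of new arrivals minus the success probability. The sketch compares a fixed retransmission probability with an idealized state-dependent rule p_r = 1/m. That rule is an assumption made for illustration: it is not the exponential backoff named on the slide, and it presumes every node knows the backlog m, which real nodes do not. All names and parameter values are likewise illustrative.

```python
def prob_backlog_decreases(m, N, pa, pr):
    """P(m -> m-1): no new arrival and exactly one retransmission."""
    if m == 0:
        return 0.0
    return (1 - pa) ** (N - m) * m * pr * (1 - pr) ** (m - 1)


def p_success(m, N, pa, pr):
    """Probability that exactly one node transmits in state m."""
    fresh = N - m
    one_new = 0.0
    if fresh > 0:
        one_new = fresh * pa * (1 - pa) ** (fresh - 1) * (1 - pr) ** m
    return one_new + prob_backlog_decreases(m, N, pa, pr)


def drift(m, N, pa, pr):
    """Expected change in backlog over one slot in state m."""
    return (N - m) * pa - p_success(m, N, pa, pr)


N, pa = 200, 0.001                      # illustrative values (assumption)
for m in (5, 10, 50, 100):
    fixed = 0.1                         # fixed retransmission probability
    adaptive = 1.0 / m                  # idealized state-dependent rule
    print("m=%3d  fixed: back=%.5f drift=%+.4f   adaptive: back=%.4f drift=%+.4f"
          % (m,
             prob_backlog_decreases(m, N, pa, fixed), drift(m, N, pa, fixed),
             prob_backlog_decreases(m, N, pa, adaptive), drift(m, N, pa, adaptive)))
```

With the fixed probability, the chance of moving backwards collapses and the drift turns positive once the backlog is large, so the backlog tends to keep growing. With the state-dependent rule the drift stays negative for these values, at the cost of each backlogged node waiting on the order of m slots between attempts.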