Algorithmic Problems in the Internet
Christos H. Papadimitriou
Goals of TCS ( ):
– Develop a productive mathematical understanding of the capabilities and limitations of the von Neumann computer and its software (the dominant and most novel computational artifacts of that time)
– Mathematical tools: combinatorics, logic
What should the goals of TCS be today? (And what math tools will be handy?)
Iowa State, April 2003
The Internet
– huge, growing, open, emergent, mysterious
– built, operated and used by a multitude of diverse economic interests
– as information repository: open, huge, available, unstructured, critical
– foundational understanding urgently needed
Today…
– Games and mechanism design
– Getting lost in the web
– The Internet’s heavy tail
Games, games…
[Figure: payoff matrix labeled with strategies and payoffs, e.g. an entry 3,-2]
(NB: also, many players)
e.g.
matching pennies:      1,-1   -1,1
                       -1,1   1,-1
prisoner’s dilemma:    3,3    0,4
                       4,0    1,1
chicken:               0,0    0,1
                       1,0    -1,-1
Nash equilibrium
– Definition: double best response (problem: may not exist)
– randomized Nash equilibrium
– Theorem [Nash 1952]: Always exists.
– Problem: there are usually many...
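A minimal sketch of the "double best response" definition, assuming the usual textbook payoffs for the three example games (matching pennies, prisoner's dilemma, chicken). The function name `pure_nash_equilibria` and the table layout are illustrative choices, not part of the talk. It only searches pure strategies, so it shows directly that an equilibrium "may not exist" without randomization.

```python
from itertools import product

# GAME[row][col] = (row player's payoff, column player's payoff).
# Payoff values are assumed textbook values for the named games.
MATCHING_PENNIES = [[(1, -1), (-1, 1)],
                    [(-1, 1), (1, -1)]]
PRISONERS_DILEMMA = [[(3, 3), (0, 4)],
                     [(4, 0), (1, 1)]]
CHICKEN = [[(0, 0), (0, 1)],
           [(1, 0), (-1, -1)]]


def pure_nash_equilibria(game):
    """Return all (row, col) cells where each strategy is a best response to the other."""
    rows, cols = len(game), len(game[0])
    equilibria = []
    for r, c in product(range(rows), range(cols)):
        # Row player cannot gain by switching rows while the column stays fixed.
        row_best = all(game[r][c][0] >= game[r2][c][0] for r2 in range(rows))
        # Column player cannot gain by switching columns while the row stays fixed.
        col_best = all(game[r][c][1] >= game[r][c2][1] for c2 in range(cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria


if __name__ == "__main__":
    for name, game in [("matching pennies", MATCHING_PENNIES),
                       ("prisoner's dilemma", PRISONERS_DILEMMA),
                       ("chicken", CHICKEN)]:
        print(name, pure_nash_equilibria(game))
```

Under these payoffs, matching pennies has no pure equilibrium, the prisoner's dilemma has exactly one, and chicken has two, which illustrates both problems named on the slide: non-existence without randomization, and multiplicity.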
The price of anarchy
– price of anarchy = $\frac{\text{cost of worst Nash equilibrium}}{\text{“socially optimum” cost}}$ [Koutsoupias and P, 1998]
– in network routing = 2 [Roughgarden and Tardos, 2000; Roughgarden 2002]
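To make the ratio concrete, here is a small sketch on a toy load-balancing game: each job picks a machine, a job's cost is the load of its machine, and the social cost is the maximum load. The instance (jobs of weight 2, 2, 1, 1 on two identical machines) and all function names are hypothetical. Only pure equilibria are enumerated, so the resulting number is specific to this toy instance and should not be compared with the network routing figure.

```python
from fractions import Fraction
from itertools import product


def machine_loads(assignment, weights, machines):
    """Total weight placed on each machine under the given job -> machine assignment."""
    loads = [0] * machines
    for job, machine in enumerate(assignment):
        loads[machine] += weights[job]
    return loads


def is_pure_nash(assignment, weights, machines):
    """No job can strictly lower the load it experiences by moving to another machine."""
    loads = machine_loads(assignment, weights, machines)
    for job, machine in enumerate(assignment):
        for other in range(machines):
            if other != machine and loads[other] + weights[job] < loads[machine]:
                return False
    return True


def price_of_anarchy(weights, machines):
    """Worst pure-equilibrium makespan divided by the optimal makespan (brute force)."""
    assignments = list(product(range(machines), repeat=len(weights)))
    optimum = min(max(machine_loads(a, weights, machines)) for a in assignments)
    worst_equilibrium = max(max(machine_loads(a, weights, machines))
                            for a in assignments
                            if is_pure_nash(a, weights, machines))
    return Fraction(worst_equilibrium, optimum)


if __name__ == "__main__":
    print(price_of_anarchy([2, 2, 1, 1], 2))
```

Here the optimum pairs each heavy job with a light one (makespan 3), while putting both heavy jobs on the same machine is still an equilibrium (makespan 4), so selfish behavior costs a factor of 4/3 on this instance.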
mechanism design (or inverse game theory)
– agents have utilities, but these utilities are known only to them
– game designer prefers certain outcomes depending on players’ utilities
– designed game (mechanism) has designer’s goals as dominant strategies
e.g., Vickrey auction
– sealed-highest-bid auction encourages gaming and speculation
– Vickrey auction: highest bidder wins, pays second-highest bid
– Theorem: Vickrey auction is a truthful mechanism.
– (Theorem: It maximizes social benefit and auctioneer expected revenue.)
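The truthfulness claim can be checked by brute force on a small grid of bids. The sketch below is illustrative only: the function names, the tie-breaking rule (lowest index wins), and the integer bid grid are assumptions, and a finite check is of course not a proof of the theorem.

```python
from itertools import product


def vickrey_auction(bids):
    """Highest bidder wins and pays the second-highest bid. Ties go to the lowest index."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner = order[0]
    price = bids[order[1]]
    return winner, price


def utility(value, my_bid, other_bids):
    """Utility of bidder 0, whose true value is `value`, when bidding `my_bid`."""
    winner, price = vickrey_auction([my_bid] + list(other_bids))
    return value - price if winner == 0 else 0


def truthful_is_weakly_dominant(value, grid, rivals=2):
    """Check that bidding the true value is never beaten, for every rival bid profile on the grid."""
    for others in product(grid, repeat=rivals):
        truthful = utility(value, value, others)
        if any(utility(value, bid, others) > truthful for bid in grid):
            return False
    return True


if __name__ == "__main__":
    print(vickrey_auction([3, 9, 7]))
    print(truthful_is_weakly_dominant(5, range(0, 11)))
```

The key design point is that the winner's price does not depend on the winner's own bid, so shading or inflating a bid can only change whether one wins, never how much one pays.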
Vickrey shortest paths
[Figure: network with source s and destination t]
– pay edge e its declared cost c(e), plus a bonus equal to $\mathrm{dist}(s,t)\big|_{c(e)=\infty} - \mathrm{dist}(s,t)$
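A sketch of the payment rule on a tiny directed graph, assuming the bonus for an edge is the shortest s-t distance with that edge removed (equivalently, with its cost set to infinity) minus the ordinary shortest distance. The graph, the node names, and the helper names `shortest_path` and `vickrey_payments` are hypothetical.

```python
import heapq
import math


def shortest_path(graph, s, t, banned=None):
    """Dijkstra on a directed graph {u: {v: cost}}. Returns (distance, list of edges).

    `banned` is an optional edge (u, v) to ignore, which models c(e) = infinity.
    """
    dist = {s: 0}
    prev = {}
    heap = [(0, s)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            break
        for v, cost in graph.get(u, {}).items():
            if (u, v) == banned:
                continue
            candidate = d + cost
            if candidate < dist.get(v, math.inf):
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))
    if t not in dist:
        return math.inf, []
    path, node = [], t
    while node != s:
        path.append((prev[node], node))
        node = prev[node]
    return dist[t], path[::-1]


def vickrey_payments(graph, s, t):
    """Payment to each edge on the shortest path: declared cost plus detour bonus."""
    base, path = shortest_path(graph, s, t)
    payments = {}
    for u, v in path:
        detour, _ = shortest_path(graph, s, t, banned=(u, v))
        payments[(u, v)] = graph[u][v] + (detour - base)
    return base, payments


if __name__ == "__main__":
    # Hypothetical instance: a cheap two-hop route and an expensive direct edge.
    graph = {"s": {"a": 1, "t": 5}, "a": {"t": 1}}
    print(vickrey_payments(graph, "s", "t"))
```

On this instance the shortest path costs 2, but each of its two edges is paid 4, so the total payment is 8. The gap between payment and true cost grows with the number of edges on the path.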
Problem:
[Figure: network with source s and destination t]
But…
– …in the Internet Vickrey overcharge would be only about 30% on the average [FPSS 2002]
– Could this be the manifestation of rational behavior at network creation?
– [FPSS 2002]: Vickrey charges
  – depend on origin and destination
  – can be computed on top of BGP
But… (cont)
[with Mihail and Saberi, 2003]
– Vickrey charges are small in expectation in random graphs.
– (Also: Why traffic grows moderately as the Internet grows…)
The web as a graph
– cf: [Google 98], [Kleinberg 98]
– how do you sample the web? [Bar-Yossef, Berg, Chien, Fakcharoenphol, Weitz, VLDB 2000]
– e.g.: 42% of web documents are in html. How do you find that?
– What is a “random” web document?
[Figure: graph of documents connected by hyperlinks]
Idea: random walk
Problems:
1. asymmetric
2. uneven degree
3. 2nd eigenvalue? =
The web walker: results
– mixing time is ~ $\log N / (1 - \lambda_2)$, where $\lambda_2$ is the 2nd eigenvalue
– WW mixing time: 3,000,000
– actual WW mixing time: 100
– .com 49%, .jp 9%, .edu 7%, .cn 0.8%
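One way to address the "asymmetric" and "uneven degree" problems, sketched here as an assumption rather than as the web walker's actual implementation, is to treat links as undirected and pad every node with self-loops up to a common degree. The walk's transition matrix is then symmetric, so its stationary distribution is uniform and a long walk yields a nearly uniform sample. The toy graph and the function names are hypothetical.

```python
def regularized_walk_matrix(adjacency):
    """Transition matrix of a walk on an undirected graph padded with self-loops.

    `adjacency[u]` lists the neighbours of u (links are assumed symmetric).
    Every node gets total degree d = max degree; missing degree becomes self-loop weight.
    """
    n = len(adjacency)
    d = max(len(neighbours) for neighbours in adjacency)
    matrix = [[0.0] * n for _ in range(n)]
    for u, neighbours in enumerate(adjacency):
        for v in neighbours:
            matrix[u][v] += 1.0 / d
        matrix[u][u] += (d - len(neighbours)) / d
    return matrix


def step(distribution, matrix):
    """One step of the walk: new[v] = sum over u of distribution[u] * matrix[u][v]."""
    n = len(matrix)
    return [sum(distribution[u] * matrix[u][v] for u in range(n)) for v in range(n)]


def distribution_after(adjacency, start, steps):
    """Distribution of the walker's position after `steps` steps from node `start`."""
    matrix = regularized_walk_matrix(adjacency)
    distribution = [0.0] * len(adjacency)
    distribution[start] = 1.0
    for _ in range(steps):
        distribution = step(distribution, matrix)
    return distribution


if __name__ == "__main__":
    # Hypothetical 4-page graph with edges 0-1, 0-2, 0-3, 1-2 (uneven degrees 3, 2, 2, 1).
    pages = [[1, 2, 3], [0, 2], [0, 1], [0]]
    print(distribution_after(pages, start=3, steps=200))
```

Without the self-loop padding, the same walk would visit page 0 three times as often as page 3. How fast the padded walk converges is governed by the second eigenvalue of the matrix, which is the third problem listed.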
Q: Is the web a random graph?
– Many $K_{3,3}$’s (“communities”)
– Indegrees/outdegrees obey “power laws”
– Model [Kumar et al, FOCS 2000]: copying
Also the Internet [Faloutsos ]
– the degrees of the Internet are power law distributed
– Both autonomous systems graph and router graph
– Eigenvalues: ditto!??!
– Model?
The world according to Zipf
– Power laws, Zipf’s law, heavy tails,…
– i-th largest is ~ $i^{-a}$ (cities, words: a = 1, “Zipf’s Law”)
– Equivalently: prob[greater than x] ~ $x^{-b}$
– (compare with law of large numbers)
– “the signature of human activity”
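The "equivalently" step can be checked numerically. If the i-th largest item has size exactly i^(-a), then the fraction of items at least that large is i/n, which gives a tail exponent b = 1/a. The sketch below fits the slope of the tail on a log-log scale for idealized data; the function name and the choice of n are assumptions.

```python
import math


def tail_exponent_from_ranks(a, n=1000):
    """Fit b in prob[size >= x] ~ x^(-b) for sizes that follow the rank law i^(-a) exactly."""
    ranks = range(1, n + 1)
    log_size = [math.log(i ** (-a)) for i in ranks]   # log of the i-th largest size
    log_tail = [math.log(i / n) for i in ranks]       # log of the fraction of items >= that size
    mean_x = sum(log_size) / n
    mean_y = sum(log_tail) / n
    # Ordinary least-squares slope of log_tail against log_size.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_size, log_tail))
    variance = sum((x - mean_x) ** 2 for x in log_size)
    return -covariance / variance


if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0):
        print(a, tail_exponent_from_ranks(a))
```

For Zipf's law proper (a = 1) the fitted tail exponent is 1, and in general the two exponents are reciprocals of each other.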
Models
– Size-independent growth (“the rich get richer,” or random walk in log paper)
– Growing number of growing cities
– In the web: copying links [Kumar et al, 2000]
– Carlson and Doyle 1999: Highly optimized tolerance (HOT)
Our model [with Fabrikant and Koutsoupias, 2002]:
node i connects to the earlier node j that achieves $\min_{j<i} \left[ \alpha \cdot d_{ij} + \mathrm{hop}_j \right]$
Theorem:
– if $\alpha$ < const, then graph is a star: degree = n - 1
– if $\alpha > \sqrt{n}$, then there is exponential concentration of degrees: prob(degree > x) < exp(-ax)
– otherwise, if const < $\alpha$ < $\sqrt{n}$, heavy tail: prob(degree > x) > $x^{-b}$
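A small simulation of the tradeoff model, under stated assumptions: nodes arrive one at a time at uniformly random points of the unit square, node 0 is the root, hop_j is the number of tree edges from j to the root, and each new node attaches to the earlier node minimizing (weight × Euclidean distance + hop_j). The weight is called `alpha` here, and the function name `tradeoff_tree` is hypothetical. The sketch only demonstrates the first regime of the theorem.

```python
import math
import random


def tradeoff_tree(n, alpha, seed=0):
    """Grow a tree where node i attaches to argmin over j < i of alpha * d_ij + hop_j."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random())]   # node 0 is the root
    hops = [0]                                # hop distance of each node to the root
    parent = [None]
    degree = [0]
    for i in range(1, n):
        p = (rng.random(), rng.random())
        best = min(range(i),
                   key=lambda j: alpha * math.dist(p, points[j]) + hops[j])
        points.append(p)
        hops.append(hops[best] + 1)
        parent.append(best)
        degree.append(1)
        degree[best] += 1
    return parent, degree


if __name__ == "__main__":
    n = 200
    for alpha in (0.5, 4.0, 1000.0):
        _, degree = tradeoff_tree(n, alpha)
        print(alpha, "max degree:", max(degree))
```

When alpha is below 1/sqrt(2), the distance term is always less than 1 in the unit square, while attaching to any non-root node costs at least one hop, so every node attaches to the root and the tree is a star. For larger alpha the distance term dominates more and more, and the maximum degree printed by the usage loop can be compared across the regimes.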
Heuristically optimized tradeoffs
– Also: file sizes (trade-off between communication costs and file overhead)
– Power law distributions seem to come from tradeoffs between conflicting objectives (a signature of human activity?)
– cf HOT, [Mandelbrot 1954]
– Other examples? General theorem?
PS: eigenvalues
– Model: Edge [i,j] has prob. ~ $d_i d_j$
– Theorem [with Mihail, 2002]: If the $d_i$’s obey a power law, then the $n^b$ largest eigenvalues are almost surely very close to $\sqrt{d_1}, \sqrt{d_2}, \sqrt{d_3}, \ldots$
– (NB: The eigenvalue exponent observed in Faloutsos³ is about ½ of the degree exponent)
– Corollary: Spectral methods are of dubious value in the presence of large features
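As intuition only (this is not the proof of the theorem), a single high-degree node together with its neighbours looks like a star, and a star with d leaves has largest adjacency eigenvalue sqrt(d). That is consistent with the NB that the eigenvalue exponent is about half the degree exponent. The sketch below verifies the eigenvector equation directly; the function names are hypothetical.

```python
import math


def star_adjacency(d):
    """Adjacency matrix of a star: node 0 is the centre, nodes 1..d are leaves."""
    n = d + 1
    matrix = [[0] * n for _ in range(n)]
    for leaf in range(1, n):
        matrix[0][leaf] = 1
        matrix[leaf][0] = 1
    return matrix


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def sqrt_degree_is_eigenvalue(d):
    """Check A v = sqrt(d) v for v = (sqrt(d), 1, ..., 1).

    v has all positive entries, so by Perron-Frobenius sqrt(d) is the largest eigenvalue.
    """
    matrix = star_adjacency(d)
    vector = [math.sqrt(d)] + [1.0] * d
    image = matvec(matrix, vector)
    eigenvalue = math.sqrt(d)
    return all(math.isclose(x, eigenvalue * y) for x, y in zip(image, vector))


if __name__ == "__main__":
    for d in (4, 9, 100):
        print(d, sqrt_degree_is_eigenvalue(d))
```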