COMS Network Theory Week 4: September 29, 2010 Dragomir R. Radev Wednesdays, 6:10-8 PM 325 Pupin Terrace Fall 2010
(27) Self-similarity
Similarity and self-similarity
Sierpinski Gasket See also Koch’s snowflake:
The Cantor set
Measuring a fractal’s dimension
In the Sierpinski gasket example, we need 1 triangle of side 1 at the first step, 3 triangles of side ½ at the second step, and 9 triangles of side ¼ at the third step.
Let $N(\varepsilon)$ be the number of triangles of side $\varepsilon$ needed to cover the gasket. Then the fractal dimension is:
$D = \lim_{\varepsilon \to 0} \frac{\ln N(\varepsilon)}{\ln (1/\varepsilon)}$
Box counting
$N(1) = 1$
$N(1/2) = 3$
$N(1/4) = N((1/2)^2) = 9 = 3^2$
$N(1/8) = N((1/2)^3) = 27 = 3^3$
…
$N((1/2)^n) = 3^n$
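As a worked step, the counts above can be turned into a dimension. This assumes the usual box-counting definition, with the box side written as $\varepsilon = (1/2)^n$ (the symbol $\varepsilon$ is introduced here for illustration); the ratio turns out to be the same at every $n$:

```latex
D = \lim_{n \to \infty}
    \frac{\ln N\left((1/2)^n\right)}{\ln\left(1 / (1/2)^n\right)}
  = \lim_{n \to \infty} \frac{\ln 3^n}{\ln 2^n}
  = \lim_{n \to \infty} \frac{n \ln 3}{n \ln 2}
  = \frac{\ln 3}{\ln 2} \approx 1.585
```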
Effective fractal dimension
For a compact triangle:
– At the beginning, D = ln 4 / ln 2 = 2
– After one iteration, D = ln 16 / ln 4 = 2
For the Sierpinski gasket: D = ln 3 / ln 2 ≈ 1.585
For the Koch curve: D = ln 4 / ln 3 ≈ 1.262
For the Cantor set: D = ln 2 / ln 3 ≈ 0.631
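The ratios on this slide can be checked numerically. The sketch below assumes each fractal is described by two numbers: how many copies one step produces and the factor by which each copy is shrunk. The function name and the dictionary layout are illustrative choices, not part of the lecture.

```python
import math

def similarity_dimension(copies, scale):
    """D = ln(copies) / ln(scale) for a shape made of `copies`
    pieces, each shrunk by a factor of `scale`."""
    return math.log(copies) / math.log(scale)

# (number of copies, shrink factor) for the shapes named on the slide
fractals = {
    "compact triangle": (4, 2),
    "Sierpinski gasket": (3, 2),
    "Koch curve": (4, 3),
    "Cantor set": (2, 3),
}

dims = {name: similarity_dimension(*spec) for name, spec in fractals.items()}
for name, d in dims.items():
    print(f"{name}: D = {d:.3f}")
```

The compact triangle comes out at 2, as expected for an ordinary two-dimensional shape, while the three fractals give non-integer values.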
A self-similar fern
(7) Small world networks
The idea of a small world
Milgram’s experiment (1960s)
Send a package to a stockbroker in Boston
296 senders
20% reached the target
Chain length (avg) = 6.5
Recent reenactment by Dodds et al. (2003) with 18 targets, 13 countries, and 60K participants: only 384 reached the target, with a path length of 4.
The Watts-Strogatz model
How can the diameter of a growing random graph be kept small?
Simple model: start with a regular lattice.
Two parameters:
– Coordination number z: how many neighbors each node has
– Shortcut probability p: for each existing edge, the probability of drawing a shortcut between two random nodes
– The total number of shortcuts is mp = nzp/2, where m = nz/2 is the number of lattice edges
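A minimal sketch of the construction described above, assuming a one-dimensional ring lattice with even z and a fixed random seed (both are choices made here, not stated on the slide). It builds the lattice, runs one shortcut trial per lattice edge, and compares the shortcut count with nzp/2.

```python
import random

def ring_lattice(n, z):
    """Edges of a ring of n nodes, each joined to its z nearest
    neighbors (z/2 on each side; z is assumed even and n > z)."""
    edges = set()
    for i in range(n):
        for k in range(1, z // 2 + 1):
            j = (i + k) % n
            edges.add((min(i, j), max(i, j)))
    return edges

def draw_shortcuts(n, lattice_edges, p, rng):
    """One trial per lattice edge: with probability p, add a
    shortcut between two distinct nodes chosen at random."""
    shortcuts = []
    for _ in lattice_edges:
        if rng.random() < p:
            u, v = rng.sample(range(n), 2)
            shortcuts.append((u, v))
    return shortcuts

n, z, p = 1000, 10, 0.25
rng = random.Random(0)  # fixed seed so the run is repeatable
lattice = ring_lattice(n, z)
shortcuts = draw_shortcuts(n, lattice, p, rng)
print("lattice edges m:", len(lattice))
print("shortcuts drawn:", len(shortcuts), "expected nzp/2:", n * z * p / 2)
```

The number of shortcuts drawn fluctuates around nzp/2 from run to run; only the expectation is fixed by the parameters.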
Diameter
Example (Amaral and Barthelemy, 1999): lattice dimension d=1, N=1000, z=10.
If p=0.25, the diameter is 3.6.
If p=0.016 (≈1/64), the diameter is 7.6.
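The effect of p on distances can be explored with a small simulation. The sketch below assumes that the non-integer values quoted above are average shortest-path lengths (an interpretation, not something the slide states), builds a ring lattice with random shortcuts, and estimates that average by breadth-first search from a random sample of source nodes. Exact numbers depend on the seed and on the model variant, so none are quoted here.

```python
import random
from collections import deque

def small_world(n, z, p, rng):
    """Adjacency sets for a ring lattice (z nearest neighbors, z even)
    with one shortcut trial of probability p per lattice edge."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for k in range(1, z // 2 + 1):
            j = (i + k) % n
            adj[i].add(j)
            adj[j].add(i)
            if rng.random() < p:
                u, v = rng.sample(range(n), 2)
                adj[u].add(v)
                adj[v].add(u)
    return adj

def mean_distance_from(adj, source):
    """Average BFS distance from `source` to every other reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sum(dist.values()) / (len(dist) - 1)

def average_path_length(adj, rng, samples=100):
    """Estimate the average shortest-path length from sampled sources."""
    sources = rng.sample(range(len(adj)), samples)
    return sum(mean_distance_from(adj, s) for s in sources) / samples

rng = random.Random(1)  # fixed seed so the run is repeatable
length_p025 = average_path_length(small_world(1000, 10, 0.25, rng), rng)
length_p016 = average_path_length(small_world(1000, 10, 1 / 64, rng), rng)
print(f"p = 0.25: {length_p025:.2f}")
print(f"p = 1/64: {length_p016:.2f}")
```

Sampling sources keeps the run short; using every node as a source would give the exact average for the generated graph.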
Clustering coefficient
It mirrors the underlying lattice structure.
According to (Barrat and Weigt, 2000):
In the limit, C = 3/4
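To see where 3/4 can come from, the sketch below measures the clustering coefficient of a ring lattice without shortcuts and compares it with the closed form 3(z-2)/(4(z-1)). That formula is the standard result for a ring lattice at p = 0 and is brought in here as an assumption; it is not taken from the slide. It tends to 3/4 as z grows.

```python
from itertools import combinations

def ring_adjacency(n, z):
    """Adjacency sets of a ring lattice: z nearest neighbors per node."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for k in range(1, z // 2 + 1):
            j = (i + k) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    nbrs = sorted(adj[node])
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    return linked / possible

def lattice_formula(z):
    """Assumed closed form for the ring lattice without shortcuts."""
    return 3 * (z - 2) / (4 * (z - 1))

measured = {}
for z in (4, 10, 50):
    adj = ring_adjacency(400, z)
    # every node of the ring is equivalent, so node 0 is representative
    measured[z] = local_clustering(adj, 0)
    print(z, measured[z], lattice_formula(z))
```

For z = 4 the measured value is 0.5, for z = 10 it is 2/3, and the values keep rising toward 3/4 as z increases.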
Properties For random graphs For lattices
Degree distribution From (Barrat and Weigt, 2000)
Kleinberg model Use geographical distance (e.g., $p \sim 1/d^2$)
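A minimal sketch of distance-dependent linking, assuming a square grid, Manhattan distance, and one long-range contact drawn per call (all three are illustrative choices made here). Each candidate node is weighted by 1/d^2, so nearby nodes are picked far more often than distant ones.

```python
import random
from collections import Counter

def lattice_distance(a, b):
    """Manhattan distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def long_range_contact(source, side, rng, exponent=2.0):
    """Pick one node of a side x side grid with probability
    proportional to 1 / d**exponent, d = distance from `source`."""
    candidates = [(x, y) for x in range(side) for y in range(side)
                  if (x, y) != source]
    weights = [lattice_distance(source, c) ** -exponent for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(2)  # fixed seed so the run is repeatable
side, source = 21, (10, 10)
contacts = [long_range_contact(source, side, rng) for _ in range(2000)]
by_distance = Counter(lattice_distance(source, c) for c in contacts)
for d in sorted(by_distance):
    print(d, by_distance[d])
```

Changing `exponent` shows how the mix of short and long links shifts: a value of 0 gives uniformly random contacts, while large values keep almost all contacts local.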
HW 1
Analyze a network data set
Submit a PR-style 6-page paper
Check class home page for examples and instructions
Model papers
– How to become a superhero, P. M. Gleiser, J. Stat. Mech. (2007) P
– The Political Blogosphere and the 2004 U.S. Election: Divided They Blog (2005)
– Patterns in syntactic dependency networks, Ramon Ferrer Cancho, Ricard V. Solé, and Reinhard Köhler, Phys. Rev. E 69, (2004)
– Network properties of written human language, A. P. Masucci and G. J. Rodgers, Phys. Rev. E 74, (2006)
– An evaluation of human protein-protein interaction data in the public domain, BMC Bioinformatics 2006, 7(Suppl 5):S19
Database: This database is hand-curated. There are around 25,000 proteins and 35,000 interactions
Examples
program committees of conferences in NLP/CL or IR or ML
Skitter (
syntactic dependencies
mentions of named entities in text
wikipedia
social networking sites such as myspace, facebook, linkedin, etc.
product recommendations for sites such as amazon, ebay, clothing sites, etc.
youtube related videos
adjective/noun network
dictionary network: two words are connected if one appears in the dictionary definition of the other
analyze the AAN author network, collaboration network, or title network (two paper titles are connected if they share a non-stop word)
people or locations that are mentioned in the same news story
collocation networks (Dorogovtsev and Mendes)
co-occurrence or other sentence graphs
concept, thesaurus, and association graphs
citation
Web
Related
similarity-based (e.g., cosine)
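As an illustration of the title network idea above (two titles are connected if they share a non-stop word), here is a minimal sketch. The stop-word list is a tiny stand-in chosen for this example, words are split on whitespace with no stemming, and the titles are the model papers listed for HW 1; a real analysis would need a fuller stop list and better tokenization.

```python
from itertools import combinations

# Toy stop-word list (an assumption made for this sketch only)
STOP_WORDS = {"a", "an", "the", "of", "in", "for", "and", "on", "to"}

def content_words(title):
    """Lowercased whitespace-separated words, minus stop words."""
    return {w for w in title.lower().split() if w not in STOP_WORDS}

def title_network(titles):
    """Edges (i, j, shared words) for every pair of titles that
    have at least one non-stop word in common."""
    edges = []
    for (i, a), (j, b) in combinations(enumerate(titles), 2):
        shared = content_words(a) & content_words(b)
        if shared:
            edges.append((i, j, sorted(shared)))
    return edges

titles = [
    "How to become a superhero",
    "The Political Blogosphere and the 2004 U.S. Election: Divided They Blog",
    "Patterns in syntactic dependency networks",
    "Network properties of written human language",
    "An evaluation of human protein-protein interaction data in the public domain",
]
edges = title_network(titles)
print(edges)
```

With these five titles only one edge appears, between the last two, through the word "human". "networks" and "network" do not match because the sketch does no stemming, which is the kind of design decision the homework analysis would have to make explicitly.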