The Center for Naval Analyses Another View of the Small World Brian McCue (Original paper published in Social Networks 24 (2002), pages ) This work is not a product of the CNA Corporation, a non-profit research and analysis organization.
Capture-Recapture and the Hypergeometric Distribution “It’s a small world!” A common expression
Usual “Small World” Topics Lengths of typical acquaintance chains (“degrees of separation” joining individuals). Sizes of typical acquaintance volumes (numbers of people known to an individual). Network structures of individuals’ acquaintanceships.
What do we mean? When we say, “It’s a small world,” do we mean: “It’s a short acquaintance chain”? “It’s a small acquaintance volume”? Or something about structure … ?
It’s a short acquaintance chain! Acquaintance Chains?
Duh! Chains
It’s a small acquaintance volume! Acquaintance volumes?
When do we vote next? Volume
Structure? It’s a small world; I sample it at no great rate, and I keep getting all these repeats!
Structure! That’s right!
“It’s a small world” “It must be a small world, because I sample the population at no great rate and keep getting all these repeats.” The “small world” is the world from which we would be sampling, if we were sampling randomly from a structureless world and experiencing the observed level of coincidental meetings.
Operational world size The evocation of this imaginary, small, structureless world is a statement about the structure of the real, large, structured world. We will estimate the size of this “operational world,” and thereby learn about the real world.
Definitions
$W$ = size of an individual’s “world” (does not include the individual herself)
$I$ = number of meetings she has had
$I_k$ = number of meetings with person $k$ (defined for $k = 1, 2, \dots, W$)
$W_j$ = number of individuals met $j$ times (defined for $j = 0, 1, 2, \dots, I$)
Distributing I balls over W boxes There are $W^I$ ways to do it. We don’t care about the order of introductions. We don’t care which person is which. A box with two balls is a coincidental re-introduction.
Probability of a configuration There are $W^I$ ways to assign $I$ balls to $W$ boxes. We don’t care about the order of introductions, so $I!/(I_1!\, I_2!\, I_3! \cdots I_W!)$ configurations can’t be told apart. We don’t care which person is which, so $W!/(W_0!\, W_1!\, W_2! \cdots W_I!)$ configurations can’t be told apart. So the probability of any configuration is:
$$P = \frac{1}{W^I}\cdot\frac{I!}{I_1!\,I_2!\cdots I_W!}\cdot\frac{W!}{W_0!\,W_1!\cdots W_I!}$$
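As a check on the counting argument above, here is a minimal Python sketch that multiplies the two "can't be told apart" counts and divides by $W^I$. The function name `configuration_probability` and its input layout (a list holding one meeting count per person) are illustrative assumptions, not something taken from the slides.

```python
from collections import Counter
from fractions import Fraction
from math import factorial, prod


def configuration_probability(meetings_per_person):
    """Exact probability of an occupancy pattern (hypothetical helper).

    meetings_per_person[k] is I_k, the number of meetings with person k.
    Its length is W and its sum is I; zeros are people never met.
    """
    W = len(meetings_per_person)
    I = sum(meetings_per_person)
    # I! / (I_1! I_2! ... I_W!): orders of introduction we cannot tell apart.
    orderings = factorial(I) // prod(factorial(i) for i in meetings_per_person)
    # W! / (W_0! W_1! ... W_I!): labelings of people we cannot tell apart.
    w_counts = Counter(meetings_per_person)  # w_counts[j] is W_j
    labelings = factorial(W) // prod(factorial(w) for w in w_counts.values())
    return Fraction(orderings * labelings, W ** I)
```

For two meetings in a world of three people, `configuration_probability([1, 1, 0])` is 2/3 and `configuration_probability([2, 0, 0])` is 1/3. These are the only two patterns, and they sum to 1.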
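Small village example A visitor randomly meets 9 people, two of them twice, so $I = 11$, $W_1 = 7$, $W_2 = 2$ and $W_0 = W - 9$. Given a total population of $W$, the probability of this happening is
$$P(W) = \frac{1}{W^{11}}\cdot\frac{11!}{2!\,2!}\cdot\frac{W!}{(W-9)!\,7!\,2!}$$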
Likelihood, a function of W The likelihood is the probability, given $W$, that what happened would happen. It can be used to estimate $W$. For the village example, it suggests that there are about 24 people.
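The village estimate can be reproduced with a short scan over candidate values of W. This is a sketch under one assumption: that "9 people, two of them twice" means 11 meetings in total, with 7 people met once and 2 met twice. The name `village_likelihood` is made up for illustration.

```python
from math import factorial


def village_likelihood(W, distinct=9, met_twice=2):
    """Probability of meeting `distinct` people, `met_twice` of them twice,
    when every meeting is a uniform random draw from W villagers."""
    if W < distinct:
        return 0.0
    met_once = distinct - met_twice
    I = met_once + 2 * met_twice  # total meetings: 11 in the example
    # I! / (I_1! ... I_W!): only the people met twice contribute a 2!.
    orderings = factorial(I) // (2 ** met_twice)
    # W! / (W_0! W_1! W_2!) with W_0 = W - distinct.
    labelings = factorial(W) // (
        factorial(W - distinct) * factorial(met_once) * factorial(met_twice)
    )
    return orderings * labelings / W ** I


# Scan candidate village sizes and keep the most likely one.
best_W = max(range(9, 200), key=village_likelihood)
print(best_W, village_likelihood(best_W))
```

Under that reading the scan peaks at W = 24, in line with the "about 24 people" on the slide.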
Realistic numbers $W_1$ and $I_1$ are nearly equal to $I$. These equal a few thousand for most people, but can only be estimated approximately. For $j, k > 1$, $W_j$ and $I_k$ are small and people might recall them.
More definitions $S = I_c - W_c$. $S$ is the number of surprising reintroductions.
Likelihood of W, re-written Whatever the person doesn’t remember has factorial = 1, so it doesn’t matter. What remains are things a person might remember.
Maximizing L(W) L(W) still contains factorials of some big numbers. But we can find the W that maximizes L(W) by finding the W at which the ratio L(W+1)/L(W) falls to 1; the big factorials cancel in that ratio.
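One standard way to locate the maximum without evaluating large factorials is to work with the ratio of successive likelihoods. The sketch below assumes that the only W-dependent part of L(W) is W!/((W − D)! · W^I), where D = I − S is the number of distinct people met. That factorization, and the function names, are assumptions of this sketch rather than the slide's own formula.

```python
from math import log1p


def log_ratio(W, I, S):
    """log of L(W+1)/L(W) under the assumed form W! / ((W - D)! * W**I).

    The factorials cancel, leaving (W+1)/(W+1-D) * (W/(W+1))**I.
    """
    D = I - S
    return -log1p(-D / (W + 1)) - I * log1p(1.0 / W)


def most_likely_world_size(I, S):
    """Smallest W at which L(W+1) <= L(W), found by bisection."""
    if S < 1:
        raise ValueError("with no repeats the likelihood never peaks")
    D = I - S
    lo = D  # the world must hold at least the D people actually met
    if log_ratio(lo, I, S) <= 0:
        return lo
    hi = 2 * lo
    while log_ratio(hi, I, S) > 0:  # grow the bracket until L is falling
        lo, hi = hi, 2 * hi
    while hi - lo > 1:  # invariant: L still rising at lo, not rising at hi
        mid = (lo + hi) // 2
        if log_ratio(mid, I, S) > 0:
            lo = mid
        else:
            hi = mid
    return hi
```

For example, `most_likely_world_size(11, 2)` covers the case of 11 meetings with 2 repeats. Using `log1p` keeps the tiny differences accurate when W runs into the millions.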
Estimating I The phonebook test of Freeman and Thompson presents 301 surnames and asks the subject how many are names of people she knows. I = score × (total names in book)/301. The book contains about 100,000 names. A typical result is 1,000 – 6,000.
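The scaling step is plain arithmetic. Here is a small sketch of it, where the function and parameter names are hypothetical and the defaults are the figures quoted on the slide.

```python
def estimate_meetings(score, names_in_book=100_000, names_shown=301):
    """Scale a phonebook-test score up to an estimate of I.

    score: how many of the names shown the subject recognises as
    surnames of people she knows.
    """
    return score * names_in_book / names_shown


for score in (3, 6, 12, 18):
    print(score, round(estimate_meetings(score)))
```

With these defaults a score of 3 scales to roughly 1,000 and a score of 18 to roughly 6,000, which brackets the typical range quoted above.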
Estimating W A person with I = 2,000 and S = 1: this leads to a W of about 2,000,000. If I = 4,320 and S = 12, W = 775,000.
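A rough cross-check, not taken from the slides: when W is much larger than I, the expected number of coincidental repeats among I random draws is about I(I − 1)/(2W), the usual birthday-problem approximation. Setting that equal to S gives W ≈ I(I − 1)/(2S). The function name is illustrative, and this is a shortcut, not the likelihood calculation itself.

```python
def approximate_world_size(I, S):
    """Birthday-problem shortcut: solve I*(I-1)/(2*W) = S for W.

    Only sensible when the resulting W is much larger than I.
    """
    if S < 1:
        raise ValueError("need at least one repeat to estimate W")
    return I * (I - 1) / (2 * S)


for I, S in ((2000, 1), (4320, 12)):
    print(I, S, round(approximate_world_size(I, S)))
```

For the two cases above, the shortcut lands within about one percent of the quoted world sizes.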
But it’s worth computing L(W)
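A sketch of how such a curve might be computed without overflowing on big factorials, using log-gamma. It assumes that the W-dependent part of L(W) is W!/((W − D)! · W^I) with D = I − S, and it drops constant factors, so it gives only the shape of the curve relative to its peak, not absolute likelihood values. The grid and names are illustrative choices.

```python
from math import exp, lgamma, log


def log_likelihood(W, I, S):
    """log of the assumed W-dependent part of L(W), constants dropped."""
    D = I - S  # distinct people met
    return lgamma(W + 1) - lgamma(W - D + 1) - I * log(W)


I, S = 4320, 12
grid = range(100_000, 3_000_001, 25_000)
curve = {W: log_likelihood(W, I, S) for W in grid}
peak = max(curve, key=curve.get)

for W in (200_000, 400_000, peak, 1_500_000, 3_000_000):
    # Likelihood relative to the highest value found on this grid.
    print(W, round(exp(curve[W] - curve[peak]), 3))
```

Changing `I` and `S` shows how quickly the peak sharpens as the number of surprising reintroductions grows.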
Observations on likelihoods Maxima are surprisingly high. Even S = 3 is enough to make a distinct peak. Resulting world sizes are:
– Much less than the real world’s size.
– Comparable to (mostly less than or equal to) city sizes.
Conclusions We each might as well be drawing a lifetime’s introductions from a small city. For people who really do draw introductions from limited populations, coincidental re-introductions could be used to estimate I.
Discussion But what about those acquaintance chains, and the six degrees of separation?
– In light of US population size and estimates of I, six degrees is surprisingly many, not surprisingly few. For a random structure, four degrees would be plenty.
– Small world-size suggests that extra degrees are needed to make jumps from world to world.
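The "four degrees would be plenty" remark can be illustrated with back-of-the-envelope arithmetic. The sketch below makes loud assumptions that are not in the slides: a US population of about 280 million, every person knowing the same number of people, and no overlap between acquaintance circles. It counts links in the chain, which may differ by one from other conventions for "degrees".

```python
def degrees_needed(population, acquaintances):
    """Links needed for acquaintances**links to reach `population`,
    assuming a structureless network with no overlapping circles."""
    reach, degrees = 1, 0
    while reach < population:
        reach *= acquaintances
        degrees += 1
    return degrees


US_POPULATION = 280_000_000  # assumed round figure, not from the slides
for I in (200, 1000, 6000):
    print(I, degrees_needed(US_POPULATION, I))
```

Under these assumptions three links suffice for I between 1,000 and 6,000, and four suffice even for I as low as 200.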
Suggestions for future work Get solid data on coincidental re-introductions. Do math to find:
– Why maxima of L(W) are equal for equal S’s.
– A faster way of computing L’s for successive W’s.
Think about how small worlds might connect and how we could, perhaps through coincidental reintroductions, discover how they really do connect.
Connected small worlds or Or what?
Partial Bibliography
Manfred Kochen (editor), 1989, The Small World, Ablex Publishing Corporation, Norwood, MA. Includes the following chapters:
– H. Russell Bernard, Eugene C. Johnsen, Peter D. Killworth, Scott Robinson, “Estimating the Size of an Average Personal Network and of an Event Subpopulation.”
– Linton C. Freeman and Claire R. Thompson, “Estimating Acquaintanceship Volume.”
– Alden S. Klovdahl, “Urban Social Networks, Some Methodological Problems and Possibilities.”
– Ithiel de Sola Pool and Manfred Kochen, “Contacts and Influence,” originally published in Social Networks 1 (1978), pages
Brian McCue, 2000, “Estimating the Number of Unheard U-boats: A Problem in Traffic Analysis,” Military Operations Research, Volume 5, Number 4, pp
Stanley Milgram, 1967, “The Small World Problem,” Psychology Today 1, pp
Ray Solomonoff and Anatol Rapoport, 1951, “Connectivity of Random Nets,” Bulletin of Mathematical Biophysics 13, pp
Ray Solomonoff, 1952, “An Exact Method for the Computation of the Connectivity of Random Nets,” Bulletin of Mathematical Biophysics 14, pp
Jeffrey Travers and Stanley Milgram, 1969, “An Experimental Study of the Small World Problem,” Sociometry 32, pp
Duncan Watts, 1999, Small Worlds: The Dynamics of Networks between Order and Randomness, Princeton University Press, Princeton.