Pip Pattison, University of Melbourne. UKSNA, University of Greenwich, June 2013. A hierarchy of exponential random graph models for the analysis of social networks


Acknowledgments
Joint work with Garry Robins, Peng Wang and Tom Snijders
- University of Melbourne: Garry Robins, Peng Wang, Galina Daraganova, David Rolls
- University of Oxford: Tom Snijders
- University of Manchester: Johan Koskinen
- Swinburne University: Dean Lusher

Outline
1. Structure in networks
2. The ERGM framework for network modelling
3. Hierarchy of dependence structures for ERGMs
4. Five networks
5. Applications

1. Structure in networks

Cartwright and Harary: Psychological Review, 1956
We expect:
- negative ties to be bipartite in form (or k-partite in generalisations)
- positive ties to be potentially clustered

Granovetter: American Journal of Sociology, 1973
We expect:
- closed triangles in strong ties
- local bridges to be weak

Jackson & Wolinsky: Journal of Economic Theory, 1996
We expect:
- disconnected cliques
- stars

Watts & Strogatz: Nature, 1998
We expect:
- High concentration of triangles
- Short paths
- Low density
- Absence of hubs

Degree effects: degree assortativity and disassortativity (e.g. Newman, 2003)
We expect:
- relatively high (or low) rates of connection among high-degree nodes

Burt: American Journal of Sociology, 2004; Robins (2009)
We expect to see brokers who are:
- embedded in groups
- bridging to other groups

Bearman, Moody & Stovel: American Journal of Sociology, 2004
We expect:
- An absence of 4-cycles (and 3-cycles)

Jackson, Rodriguez-Barraquer & Tan: American Economic Review, 2012
We expect:
- m-cliques but not (m+1)-cycles

An aside
Paper                                         Citations (WoS, June 26, 2013)
Cartwright & Harary (1956)                    534
Granovetter (1973)                            5833
Jackson & Wolinsky (1996)                     416
Watts & Strogatz (1998)                       7572
Newman (2003)                                 507
Burt (2004)                                   491
Bearman, Moody & Stovel (2004)                133
Jackson, Rodriguez-Barraquer & Tan (2012)     0
Our fascination with network structure runs deep!

Other regularities in network structure
Other hypothesised sources of regularity in network structure include:
- Homophily and heterophily effects (e.g. McPherson, Smith-Lovin & Cook, 2001)
- Consequences of social foci and other settings (Feld, 1985; Pattison & Robins, 2002)
- Embedding in geographical, organisational and sociocultural contexts (e.g. Daraganova et al, 2012; Lomi et al, in press; White, 1992)
- Interdependence or embeddedness with other networks (e.g. Granovetter, 1985; Padgett & McLean, 2006)

Harrison White on network ties Notably, almost all of these hypotheses about structural regularity are based on arguments about local interaction in networks: “A social tie exists in, and only in, a relation between actors which catenates, that is entails (some) compound relation through other such ties of those actors. … Thus it is subject to, and known to be subject to, the hegemonic pressures of others engaged in the social construction of that network” (White, 1998)

2. General modelling framework

Network models
Network models should:
- reflect known and hypothesised processes for network tie formation (such as those just mentioned)
- be dynamic, where possible, and consistent with known or hypothesised dynamics
- allow us to test propositions about network structure and process
- allow us to understand the consequences of network structure and process
For cross-sectional data, the exponential random graph modelling (ERGM) framework is convenient.

Exponential random graph models (ERGMs)
- We regard the nodes of a network as fixed, and treat potential ties among nodes as variables that are dependent on exogenous attributes of the nodes and potential ties and, potentially, on one another.
- The form of assumed dependence among tie variables leads to a general form of a probability model (an exponential random graph model) for the ensemble of tie variables.
- Additional simplifying assumptions …
- The model can be estimated using MCMCMLE from an observation on the network (and relevant node- or dyad-level covariates); see Snijders (2002).

Exponential random graph model (ERGM)
$Y(i,j)$ is a tie variable: $Y(i,j) = 1$ if node $i$ is tied to node $j$, 0 otherwise.
Ensemble of tie variables: $Y = [Y(i,j)]$ (tie variables), $y = [y(i,j)]$ (realisations).
$$P(Y = y) = \frac{1}{\kappa(\theta)} \exp\Big\{ \sum_p \theta_p z_p(y) \Big\}$$ (Frank & Strauss, 1986)
- $z_p(y)$ are network statistics (network effects)
- $\theta_p$ are the corresponding parameters
- $\kappa(\theta)$ is a normalising quantity
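To make the role of the normalising quantity concrete, here is a minimal sketch that evaluates the model by brute force on a very small undirected network. It enumerates every possible graph, so it is only feasible for a handful of nodes; it is an illustration, not the MCMC-based estimation used in practice. The function names (`edge_count`, `triangle_count`, `ergm_distribution`), the variable names `theta` and `kappa`, and the choice of edge and triangle statistics are assumptions made for the example.

```python
from itertools import combinations, product
from math import exp

def edge_count(edges):
    """Network statistic: number of ties present."""
    return len(edges)

def triangle_count(edges):
    """Network statistic: number of closed triads."""
    nodes = sorted({v for e in edges for v in e})
    return sum(
        1 for a, b, c in combinations(nodes, 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edges
    )

def ergm_distribution(n, stats, theta):
    """Return P(Y = y) for every undirected graph y on n nodes.

    stats is a list of functions (the network statistics) and theta the
    matching list of parameters. Each graph is a frozenset of dyads.
    """
    dyads = [frozenset(d) for d in combinations(range(n), 2)]
    weights = {}
    for pattern in product([0, 1], repeat=len(dyads)):
        graph = frozenset(d for d, y in zip(dyads, pattern) if y)
        score = sum(t * z(graph) for t, z in zip(theta, stats))
        weights[graph] = exp(score)
    kappa = sum(weights.values())  # normalising quantity
    return {g: w / kappa for g, w in weights.items()}
```

With `theta = [0, 0]` every graph is equally likely; a negative edge parameter combined with a positive triangle parameter shifts probability towards sparse but closed structures.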

3. Dependence structures

Characterising the proximity of potential network ties
Under what circumstances is the tie linking node a and node b conditionally dependent on the tie linking node c and node d?
(Diagram: nodes a, b, c and d.)
Strict inclusion: when each of actors a and b is already linked to both actors c and d, and conversely?

Characterising the proximity of potential network ties
Under what circumstances is the tie linking node a and node b conditionally dependent on the tie linking node c and node d?
(Diagram: nodes a, b, c and d.)
Inclusion: when each of actors a and b is already linked to at least one of actors c and d, and conversely?

Characterising the proximity of potential network ties
Under what circumstances is the tie linking node a and node b conditionally dependent on the tie linking node c and node d?
(Diagram: nodes a, b, c and d.)
Partial inclusion: when at least one of actors a and b is already linked to both actors c and d?

Characterising the proximity of potential network ties
Under what circumstances is the tie linking node a and node b conditionally dependent on the tie linking node c and node d?
(Diagram: nodes a, b, c and d.)
Distance criterion: when at least one of actors a and b is already linked to at least one of actors c and d, and conversely?

A second dimension: varying path length
a. Strict p-inclusion, $SI_p$ (p > 0)
b. p-inclusion, $I_p$
c. Partial p-inclusion, $PI_p$
d. p-distance criterion, $D_p$
(Four diagrams, each on nodes a, b, c and d.)
Key: red lines indicate existing paths of length p or less (p ≥ 0); blue dashed lines indicate potential ties, $Y_{ab}$ and $Y_{cd}$.
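The four proximity criteria can be sketched as a small check on an existing network. This is a literal reading of the slide wording, under two assumptions that are not stated in the slides: "linked" is taken to mean graph distance of at most p in the observed ties (so p = 0 means the two nodes coincide), and the potential ties themselves are not used as part of any path. The helper names `adjacency`, `distances_from`, `near` and `criteria` are hypothetical.

```python
from collections import deque

def adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) ties."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def distances_from(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def near(adj, u, v, p):
    """True if u and v are joined by an existing path of length p or less."""
    return distances_from(adj, u).get(v, float("inf")) <= p

def criteria(adj, ab, cd, p):
    """Evaluate the four criteria for potential ties ab and cd."""
    def both(u, pair):
        return all(near(adj, u, v, p) for v in pair)

    def some(u, pair):
        return any(near(adj, u, v, p) for v in pair)

    return {
        "SI": all(both(u, cd) for u in ab),
        "I": all(some(u, cd) for u in ab) and all(some(u, ab) for u in cd),
        "PI": any(both(u, cd) for u in ab),
        "D": any(some(u, cd) for u in ab),
    }
```

For example, with existing ties a-c and b-d and p = 1, the inclusion and distance criteria hold but strict inclusion does not; with p = 0 the distance criterion reduces to the two potential ties sharing a node.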

The dependence hierarchy (Pattison & Snijders, 2013)
(Diagram of the hierarchy; node labels: $SI_1$, $I_0 = PI_0$, $D_0$, $I_1$, $PI_1$, $D_1$, $SI_2$, $I_2$, $PI_2$, $D_2$.)

Associated model configurations
$SI_p$: Strict p-inclusion
Each configuration is a subgraph of diameter p (p-club, Mokken, 1979).
For p = 1: cohesive subsets.
(Diagram: nodes a, b, c and d.)

Associated model configurations
$I_p$: p-inclusion
Each configuration has the property that every pair of edges lies on a cyclic walk of length ≤ (2p+2).
For p = 1: closure.
(Diagram: nodes a, b, c and d.)

Associated model configurations
$PI_p$: Partial p-inclusion
Each configuration has the property that every pair of edges lies on a cyclic walk of length ≤ (2p+2), or on a cyclic walk of length ≤ (2p+1) with an edge incident to a node on the cycle.
For p = 1: brokerage.
(Diagram: nodes a, b, c and d.)

Associated model configurations
$D_p$: p-distance
Each configuration has the property that every pair of edges lies on a path of length ≤ p+2.
For p = 1: connectivity.
(Diagram: nodes a, b, c and d.)

Model configurations for the case of p = 0
- $SI_0$: not defined
- $I_0$: each configuration is an edge
- $PI_0$: each configuration is an edge
- $D_0$: each configuration is such that every pair of edges lies on a path of length ≤ 2
$I_0$ and $PI_0$ give the Bernoulli or Erdős-Rényi model: edges are independent.
$D_0$ gives the Markov model (Frank & Strauss, 1986).
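For the p = 0 distance criterion, the configurations of the homogeneous Markov model are edges, stars and triangles. The sketch below counts a few of them for an undirected simple graph whose nodes are labelled 0 to n-1. The function name `markov_statistics` and the choice to stop at 3-stars are assumptions made for the illustration.

```python
from itertools import combinations
from math import comb

def markov_statistics(n, edges):
    """Count edges, 2-stars, 3-stars and triangles in a simple graph.

    Assumes no self-loops; duplicate ties are ignored.
    """
    edge_set = {frozenset(e) for e in edges}
    degree = [0] * n
    for u, v in (tuple(e) for e in edge_set):
        degree[u] += 1
        degree[v] += 1
    triangles = sum(
        1 for a, b, c in combinations(range(n), 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edge_set
    )
    return {
        "edges": len(edge_set),
        # a k-star is a node together with k of its ties
        "2-stars": sum(comb(d, 2) for d in degree),
        "3-stars": sum(comb(d, 3) for d in degree),
        "triangles": triangles,
    }
```

Setting every parameter except the edge parameter to zero recovers the Bernoulli case, in which only the edge count matters.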

The dependence hierarchy (Pattison & Snijders, 2013)
Families: SI (cohesion), I (closure), PI (brokerage), D (connectivity)
- p = 0: $I_0 = PI_0$ (Bernoulli); $D_0$ (Markov)
- p = 1: $SI_1$ (clique); $I_1$ (social circuit); $PI_1$ (edge-triangle); $D_1$ (3-path)
- general p: $SI_p$ (p-club); $I_p$ (cyclic walk of length ≤ 2p+2); $PI_p$ ((r+1)-path-(2(p-r)+1)-cyclic walk, 0 ≤ r ≤ p-1); $D_p$ (path of length ≤ p+2)

Other assumptions
1. Homogeneity: isomorphic configurations have equal parameters (Frank & Strauss, 1986)
2. Related effects: a single statistic for a family of related configurations, such as:
   - m-stars
   - m-triangles
   - m-2-paths
   - m-edge-triangles
   - …
   (Snijders et al, 2006; Hunter & Handcock, 2006)

Resulting model effects often include:
- Edge: propensity for an edge to occur
- Alternating star: (endogenous) propensity for edges to attach to nodes with edges (progressively discounted for additional edges); hence the level of dispersion of the degree distribution
- Alternating 2-path: propensity for the presence of shared partners (progressively discounted for additional shared partners)
- Alternating triangle: propensity for an association between an edge linking nodes and their propensity for shared partners (progressively discounted for additional shared partners) (closure)
- Alternating edge-triangle: propensity for an association between degree and closure (progressively discounted for higher degrees)
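As an illustration of "progressively discounted", here is a sketch of the alternating star statistic in the form given by Snijders et al (2006): star counts of increasing size enter with alternating signs and are divided by increasing powers of a weight. It is an assumption that the applications later in the talk use exactly this definition; the function names and the default weight `lam=2.0` are chosen for the example. The second function is the algebraically equivalent closed form obtained from the binomial theorem.

```python
from math import comb

def alternating_k_star(degrees, lam=2.0):
    """Sum over k >= 2 of (-1)^k * S_k / lam^(k-2), where S_k counts k-stars."""
    total = 0.0
    for k in range(2, max(degrees, default=0) + 1):
        s_k = sum(comb(d, k) for d in degrees)  # number of k-stars
        total += (-1) ** k * s_k / lam ** (k - 2)
    return total

def alternating_k_star_closed_form(degrees, lam=2.0):
    """Same statistic written as a function of the degrees alone."""
    return lam ** 2 * sum((1 - 1 / lam) ** d - 1 + d / lam for d in degrees)
```

Because each additional tie at a node adds less than the one before, the statistic grows more slowly with degree than the raw 2-star count does.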

4. Five networks

Gift-giving (taro exchange) among households in a Papuan village* (n = 22) Hage P. and Harary F. (1983). Structural models in anthropology. Cambridge: Cambridge University Press. Schwimmer E. (1973). Exchange in the social structure of the Orokaiva. New York: St Martins.

Interaction network in a university karate club (n = 34) Zachary W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33,

Kapferer’s tailor shop in Zambia, sociational (friendship and socioemotional) ties, time 2* (n = 39) *Kapferer B. (1972). Strategy and transaction in an African factory. Manchester: Manchester University Press.

An Australian government organisation (n=60): ‘important’ ties

A dolphin community near Doubtful Sound, NZ* (n = 62) *D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations, Behavioral Ecology and Sociobiology 54, (2003).

5. Applications

Gift-giving (taro exchange) among households in a Papuan village* (n = 22) Hage P. and Harary F. (1983). Structural models in anthropology. Cambridge: Cambridge University Press. Schwimmer E. (1973). Exchange in the social structure of the Orokaiva. New York: St Martins.

Heuristic goodness of fit: degree statistics
The t statistic locates the observed value of each statistic in the distribution of statistics associated with the ERGM simulated using the model parameters: if |t| ≤ 2, the observed statistic is within the envelope expected by the model.
For example, for the Bernoulli model: edge effect = (est se = .17)
Columns: statistic, observed, simulated mean (sd), t
Row: triangles (4.151) 0.607
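The heuristic can be sketched as follows: simulate graphs from the fitted model, compute a statistic on each, and compare the observed value with the simulated mean in units of the simulated standard deviation. The sketch below uses a Bernoulli simulator and the triangle count purely as an illustration. The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the slides do not say which estimator was used.

```python
import random
from itertools import combinations
from statistics import mean, stdev

def t_ratio(observed, simulated):
    """(observed - simulated mean) / simulated standard deviation."""
    return (observed - mean(simulated)) / stdev(simulated)

def count_triangles(n, edges):
    """Number of closed triads among nodes 0..n-1."""
    return sum(
        1 for a, b, c in combinations(range(n), 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edges
    )

def simulate_bernoulli(n, density, rng):
    """Draw a graph in which each tie is present independently."""
    return {
        frozenset(d) for d in combinations(range(n), 2)
        if rng.random() < density
    }
```

A typical use would draw a few hundred graphs with `rng = random.Random(seed)`, collect `count_triangles` on each, and pass the list to `t_ratio` together with the observed triangle count.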

Taro exchange: Bernoulli
Columns: effects, estimates, stderr
Rows: Edge
Columns: effects, observed, mean, stddev, t-ratio
Rows: 2-star; star; triangles; SD degrees; Skew degrees; GCC*; Mean LCC*; Var LCC*
*GCC is the global clustering coefficient; LCC is the local clustering coefficient.
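The clustering summaries used in these goodness-of-fit tables can be sketched with the usual definitions: the global coefficient is three times the number of triangles divided by the number of 2-paths, and the local coefficient of a node is the share of pairs of its neighbours that are themselves tied. These definitions, the convention that a node with fewer than two neighbours has local coefficient 0, and the use of the population variance are assumptions; the slides do not spell them out. The function name `clustering_summary` is hypothetical.

```python
from itertools import combinations
from math import comb
from statistics import mean, pvariance

def clustering_summary(n, edges):
    """Global and local clustering coefficients for nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # triangles through each node: tied pairs among its neighbours
    tri_at = [
        sum(1 for u, v in combinations(sorted(adj[i]), 2) if v in adj[u])
        for i in range(n)
    ]
    # 2-paths centred on each node
    two_paths = [comb(len(adj[i]), 2) for i in range(n)]
    total_paths = sum(two_paths)
    gcc = sum(tri_at) / total_paths if total_paths else 0.0
    lcc = [t / c if c else 0.0 for t, c in zip(tri_at, two_paths)]
    return {"GCC": gcc, "mean LCC": mean(lcc), "var LCC": pvariance(lcc)}
```

The sum of `tri_at` counts each triangle three times, once per corner, which is why no explicit factor of three appears in the code.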

Taro exchange: edge-triangle models
Model 2. Columns: effects, estimates, stderr. Rows: edge *; AT(2.00) *; AET(2.00) *
Model 3. Columns: effects, estimates, stderr. Rows: edge; star; triangle *; edge-triangle *
Both models suggest:
- Triadic closure
- A negative association between participation in closed triads and degree

Comparison of Models 2 and 3
Columns: effects, obs; Model 2: mean, SD, t-ratio; Model 3: mean, SD, t-ratio
Rows: 2-star; star; Triangles; SD_deg; Skew_deg; GCC; Mean LCC; Var LCC
Model 3 appears to be more closely centred on the data.

Taro exchange simulated from Model 3

The edge-triangle model for Taro exchange
Columns: effect, estimates, stderr
Rows: edge; star; triangle *; ET *
A triadic closure effect, accompanied by a negative association between triadic closure and tie formation.

Interaction network in a university karate club (n = 34) Zachary W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33,

Zachary’s karate club
Columns: effect, estimate, stderr
Rows: edge; AS(2.00); AT(2.00) *; A2P(2.00) *
Goodness of fit is good except for:
Columns: effect, observed, mean, stddev, t-ratio
Rows: 5-clique
Positive tendencies for closure in both 3- and 4-cycles.

Kapferer’s tailor shop in Zambia, sociational (friendship and socioemotional) ties, time 2* (n = 39) *Kapferer B. (1972). Strategy and transaction in an African factory. Manchester: Manchester University Press.

Model 1
Columns: effects, estimates, stderr
Rows: edge; AS (2.00); AT (2.00)

Model 1: heuristic goodness of fit
Columns: effects, observed, mean, stddev, t-ratio
Rows: 2-star; star; Triangle; clique; clique; triangle; cycle; edge-triangle; edge-triangle; SD degrees; Skew degrees; Global CC; Mean Local CC; Var Local CC

Model 2
Columns: Effect, Parameter, Std Err
Rows: edge; AS (2.00); AT (2.00); A2P (2.00)

Model 2: heuristic goodness of fit
Columns: Network statistic, observed, mean, stddev, t-ratio
Rows: 2-star; star; star; star; triangle; clique; clique; clique; clique; triangle; path; cycle; edge-triangle; edge-triangle; Std Dev degree dist; Skew degree dist; Global CC; Mean Local CC; Var Local CC

Model 3
Columns: Effect, Parameter, Std Err
Rows: edge; AS (2.00); AT (2.00) *; A2P (2.00); AET (2.00) *

Model 3: goodness of fit
Columns: Effects, observed, mean, stddev, t-ratio
Rows: 2-star; star; star; star; clique; clique; clique; clique; clique; triangle; path; cycle; edge-triangle; edge-triangle; SD degree dist; Skew degree dist; Global CC; Mean Local CC; Variance Local CC

An Australian government organisation (n=60): ‘important’ ties

Model for Australian government organisation
Columns: effects, estimates, stderr
Rows: edge *; AS(2.00); AT(2.00) *; A2P(2.00)
The model appears to fit well.
A modest and non-significant tendency towards dispersed degrees, and a moderate closure effect.

A dolphin community near Doubtful Sound, NZ* (n = 62) *D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations, Behavioral Ecology and Sociobiology 54, (2003).

Model 1
Columns: Effect, Parameter, Std Err
Rows: edge; alt-star(2.00); alt-triangle (2.00)

Model 1: goodness of fit
Columns: Effect, observed, mean, stddev, t-ratio
Rows: # 2-stars; # 3-stars; # 1-triangles; # 2-triangles; # 3-paths; # 4-cycles; # (1,1)-coathangers; # cliques of size; # alt-k-indpt.2-path(2.00); Std Dev degree dist; Skew degree dist; Global Clustering; Mean Local Clustering; Variance Local Clustering

Model 2
Columns: Effect, Parameter, Std Err
Rows: edge; triangle; triangle; edge-triangle; alt-star(2.00)

Model 2: goodness of fit
Columns: Effect, observed, mean, stddev, t-ratio
Rows: # 2-stars; # 3-stars; # 3-paths; # 4-cycles; # cliques of size; # alt-triangle(2.00); # alt-indpt-2-path(2.00); Std Dev degree dist; Skew degree dist; Global Clustering; Mean Local Clustering; Variance Local Clustering

In conclusion
- The dependence hierarchy systematically articulates possible proximity-based logics for conditional dependencies between network ties and yields a versatile modelling framework to reflect a variety of hypothesised tie formation processes.
- The illustrative applications demonstrate the potential value of this flexible framework, and suggest evidence for various hypothesised processes.
- There is, of course, much more to be done, e.g.:
  - evaluating model adequacy
  - comparing models
  - ensuring robust model specifications
  - ...