Selection & Social Influence

Slides:



Advertisements
Similar presentations
Λ14 Διαδικτυακά Κοινωνικά Δίκτυα και Μέσα Strong and Weak Ties Chapter 3, from D. Easley and J. Kleinberg book.
Advertisements

Based on chapter 3 in Networks, Crowds and markets (by Easley and Kleinberg) Roy Mitz Supervised by: Prof. Ronitt Rubinfeld November 2014 Strong and weak.
Homophily, Social Influence, & Affiliation
Analysis of Social Media MLD , LTI William Cohen
Models of Network Formation Networked Life NETS 112 Fall 2013 Prof. Michael Kearns.
Link creation and profile alignment in the aNobii social network Luca Maria Aiello et al. Social Computing Feb 2014 Hyewon Lim.
Chapter 3 Producing Data 1. During most of this semester we go about statistics as if we already have data to work with. This is okay, but a little misleading.
Reading Group “Networks, Crowds and Markets” Session 1: Graph Theory and Social Networks Typ hier de naam van de FEB afzender.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 13 Experiments and Observational Studies.
Jure Leskovec, CMU Eric Horwitz, Microsoft Research.
Analysis of Social Media MLD , LTI William Cohen
Homophily, Social Influence, & Affiliation Dr. Frank McCown Intro to Web Science Harding University This work is licensed under a Creative Commons Attribution-NonCommercial-
Online Social Networks and Media Homophilly Networks with Positive and Negative ties.
Online Social Networks and Media
Λ14 Διαδικτυακά Κοινωνικά Δίκτυα και Μέσα Networks and Surrounding Contexts Chapter 4, from D. Easley and J. Kleinberg book.
Selection & Social Influence Michael L. Nelson CS 495/595 Old Dominion University This work is licensed under a Creative Commons Attribution-NonCommercial-
Topics In Social Computing (67810) Module 1 Introduction & The Structure of Social Networks.
Social Networks Strong and Weak Ties
Topics In Social Computing (67810) Module 1 Introduction & The Structure of Social Networks.
Selection & Social Influence Michael L. Nelson CS 495/595 Old Dominion University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike.
Copyright © 2009 Pearson Education, Inc.
Unsupervised Learning
Chapter 1: Exploring Data
Statistics 200 Lecture #9 Tuesday, September 20, 2016
Sampling Distribution Models
If two people in a social network have a friend in common, then there is an increased likelihood that they will become friends themselves at some point.
Lecture Slides Elementary Statistics Twelfth Edition
Topics In Social Computing (67810)
Structural Properties of Networks: Introduction
What Stops Social Epidemics?
User Joining Behavior in Online Forums
Local Networks Overview Personal Relations: Core Discussion Networks
CHAPTER 11 Inference for Distributions of Categorical Data
4.1 (Day 2)
Brian Lafferty Virus on a Network.
SAMPLE DESIGN.
Sampling: Theory and Methods
Elementary Statistics
Inverse Transformation Scale Experimental Power Graphing
surrounding contexts:
Location Recommendation — for Out-of-Town Users in Location-Based Social Network Yina Meng.
1.2 Sampling LEARNING GOAL
Federalist Papers Activity
Michael L. Nelson CS 495/595 Old Dominion University
Models of Network Formation
Stat 217 – Day 28 Review Stat 217.
AP Stats Check In Where we’ve been… Chapter 7…Chapter 8…
Models of Network Formation
Models of Network Formation
Daniela Stan Raicu School of CTI, DePaul University
Models of Network Formation
CHAPTER 11 Inference for Distributions of Categorical Data
CONSUMER MARKETS AND CONSUMER BUYING BEHAVIOR
Assortativity (people associate based on common attributes)
Daniela Stan Raicu School of CTI, DePaul University
CHAPTER 11 Inference for Distributions of Categorical Data
Correlation and the Pearson r
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Who are the Subjects? Intro to Sampling
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Demo: data input and transformation
CHAPTER 11 Inference for Distributions of Categorical Data
Demo data transformation
“The Spread of Physical Activity Through Social Networks”
Chapter 5: Sampling Distributions
Unsupervised Learning
Presentation transcript:

Selection & Social Influence Michael L. Nelson CS 432/532 Old Dominion University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License This course is based on Dr. McCown's class

Slides based on Ch 4 of Networks, Crowds and Markets by Easley & Kleinberg (2010) http://www.cs.cornell.edu/home/kleinber/networks-book/

Pre-Lecture Notes: 1. The cute children in the pictures are Dr. McCown's. Here are ours (which don't necessarily exhibit homophily): 2. “All models are wrong, but some are useful” George E. P. Box 3. “The map is not the territory” Alfred Korzybski

Timeless Questions… http://9livesoflulu.wordpress.com/2011/07/08/which-came-first-the-chicken-the-egg-or-bob/ http://blog.melchua.com/2012/01/31/nature-vs-nurture-comic/ http://www.dailymail.co.uk/sciencetech/article-1294341/Chicken-really-DID-come-egg-say-scientists.html

The old proverb says that "birds of a feather flock together"; I suppose that equality of years inclines them to the same pleasures, and similarity begets friendship; … http://classics.mit.edu/Plato/phaedrus.html http://en.wikipedia.org/wiki/Phaedrus_%28dialogue%29 (ca. 370BC)

Rush - "Subdivisions" (Subdivisions) In the high school halls In the shopping malls Conform or be cast out In the basement bars In the backs of cars Be cool or be cast out Any escape might help to smooth The unattractive truth But the suburbs have no charms to soothe The restless dreams of youth https://www.youtube.com/watch?v=EYYdQB0mkEU

Homophily Your friends are more similar to you in age, race, interests, opinions, etc. than a random collection of individuals Homophily: principle that we tend to be similar to our friends

Social network from town's middle and high schools middle school high school Clustering by race Moody (2001)

Q: Why Do We Study This? A: Recommending edges in the graph is life blood of social networks Facebook recommending to me groups for me to recommend to my friends Twitter recommending people for me to follow Facebook recommending groups for me

Homophily and Triadic Closure Triadic closure: When two individuals share a common friend, a friendship between the two is more likely to occur Homophily suggests two individuals are more alike because of common friend, so link may occur even if neither is aware of mutual friend! Difficult to attribute formation of link to any one factor

Can we develop a simple test for the presence of homophily?

Example Male/Female Network What would this network look like if it did not exhibit homophily? p = 6/9 = 0.67 q = 3/9 = 0.33

Randomly Generated Network Randomly assigned a gender according to gender balance of real network Number of cross-gender edges should not change significantly relative to real network If fraction p are males and q are females, what would be probably of… p2 = 0.45 2pq = 0.44 q2 = 0.11

Homophily Test: If the fraction of heterogeneous (cross-gender) edges is significantly less than 2pq then there is evidence for homophily  

Notes on Homophily Test “significantly less than” – How significant? A deviation below the mean is suitable What if network had significantly more than 2pq cross-gender edges? Inverse homophily (i.e., "heterophily") Example: Male-Female dating relationships

Notes on Homophily Test Homophily test can be used to test any characteristic like race, age, native language, preferences, etc. For characteristics that have more than 2 values, perform same type of calculation Compare number of heterogeneous edges (edges with nodes that differ in interested characteristic) to what randomly generated graph would look like using real data as probabilities

Selection Why is homophily often present in a social network? Selection Selection: People tend to choose friends that are like themselves

Selection Can operate at different scales and levels of intentionality You actively choose friends that are like yourself among a small group of people Your school's population is relatively homogeneous compared to overall population, so your environment compels you to choose friends like yourself https://www.odu.edu/about/odu-publications/insideodu/2013/11/21/topstory2 https://www.youtube.com/watch?v=0z1wbrUtR04

Mutable and Immutable Characteristics Selection operates differently based on type of characteristic Immutable: Characteristics that don't change (gender, race) or change consistently with the population (age, generation) Mutable: Characteristics that can change over time (behaviors, beliefs, interests, opinions)

Cartoon: http://nancybenet.com/dont-follow-the-crowd/ Social Influence Research has shown that (surprise!) people are susceptible to social influence: they may change their behaviors to more closely resemble the behaviors of their friends Cartoon: http://nancybenet.com/dont-follow-the-crowd/

Selection and Social Influence Social influence is reverse of selection Selection: Individual characteristics drive the formation of links Social influence: Existing links shape people's mutable characteristics Selection Characteristics Links Social Influence

Longitudinal Studies Difficult to tell if selection or social influence at play with a single snapshot of a network Longitudinal studies tracking social connections and individual behaviors over time can help researchers uncover effect of social influence Does behavior change after changes to network, or does network change after changes in behavior?

Example: Teenage Drug Use Drug use affected more by selection or social influence? Understanding these affects can be helpful in developing interventions If drug use displays homophily in a network, targeting social influence (get friends to influence other friends to stop) might work best But if homophily due to selection, former drug users may choose new friends and drug-using behavior of others is not strongly affected

Social Influence of Obesity Christakis and Fowler tracked obesity status and social network of 12,000 people over 32 years Found homophily based on obesity status They wanted to know why Christakis & Fowler (2007) http://www.ted.com/talks/nicholas_christakis_the_hidden_influence_of_social_networks.html

Why Obesity Homophily? Selection effects – people choose to befriend others of similar obesity status? Confounding effects of homophily – other factors that correlate with obesity status? Social influence – if friends changed their obesity status, did it influence person's future obesity status? Discovered significant evidence for hypothesis 3: Obesity is a type of “contagion” that can spread through social influence!

Homicide is Also Contagious Social Networks and the Risk of Gunshot Injury http://dx.doi.org/10.1007/s11524-012-9703-9 (Boston, link=observed colocation by police)

85% of Victims in Giant Component Social Networks and the Risk of Gunshot Injury http://dx.doi.org/10.1007/s11524-012-9703-9 (Boston, link=observed colocation by police)

41% of Homicides in 4% of Population http://www.npr.org/blogs/thetwo-way/2013/11/15/245458444/study-odds-of-being-murdered-closely-tied-to-social-networks http://dx.doi.org/10.2105/AJPH.2013.301441 Link=co-offending arrest records

Affiliation Network Putting context into the network by showing connections to activities, companies, organizations, neighborhoods, etc. Bipartite graph: every edge joins two nodes belonging to different sets Sue Bill Chess Club Band Affiliations or foci

Social-Affiliation Network Two types of edges: Person to person: Friendship or other social relationship Person to foci: Participation in the focus Sue Bill Chess Club Band Gary Alice

Closure Processes Triadic closure Sue Gary Alice Triadic closure Band Sue Bill Focal closure: closure due to selection Sue Gary Band Membership closure: closure due to social influence

Research Questions About Closure Would Sue be more likely to become friends with Alice if they shared more than one friend? In other words, is triadic closure dependent on the number of shared friends? More formally: What is the probability that two people form a link as a function of the number of mutual friends they share? Sue Gary Alice Sue Gary Alice Moe

Research Questions About Closure Focal closure: what is the probability that two people form a link as a function of the number of foci they are jointly affiliated with? Membership closure: what is the probability that a person becomes involved in a particular focus as a function of the number of friends who are already involved in it? Band Sue Bill Ping Pong Sue Gary Band Jill

Research Methodology: Measuring Triadic Closure Take two snapshots of network at times t1 and t2 For each k, identify all pairs of nodes who have exactly k friends in common at t1 but who are not directly connected by an edge Define T(k) to be fraction of these pairs that form an edge by t2. This is the probability that a link will form between two people with k friends in common Plot T(k) as function of k to illustrate effect of common friends on link formation idea: k=0, we might become friends, but chances (T(0)) are low k=10, we have a lot of friends in common, T(10) is high

Research Methodology: Measuring Triadic Closure Take two snapshots of network at times t1 and t2 t1 t2

Research Methodology: Measuring Triadic Closure 2. For each k, identify all pairs of nodes who have exactly k friends in common at t1 but who are not directly connected by an edge k = 0 t1 t2

Research Methodology: Measuring Triadic Closure 2. For each k, identify all pairs of nodes who have exactly k friends in common at t1 but who are not directly connected by an edge k = 0 t1 t2

Research Methodology: Measuring Triadic Closure 2. For each k, identify all pairs of nodes who have exactly k friends in common at t1 but who are not directly connected by an edge k = 0 Etc… t1 t2

Research Methodology: Measuring Triadic Closure 2. For each k, identify all pairs of nodes who have exactly k friends in common at t1 but who are not directly connected by an edge k = 1 k = 2, Etc… t1 t2

Research Methodology: Measuring Triadic Closure 3. Define T(k) = fraction pairs that form an edge by t2. This is the probability that a link will form between two people with k friends in common T(1) = 1/1 T(0) = 1/11 t1 t2

Research Methodology: Measuring Triadic Closure 4. Plot T(k) as function of k to illustrate effect of common friends on link formation T(k) Prob of link formation k Friends in common

Email Social Network Kossinets and Watts (2006) examined email communication of 22,000 students over one year at a large US university Made link between two people if an email was sent between the two in the last 60 days Each snapshot is one day apart T(k) averaged over multiple pairs of snapshots

Triadic Closure in Email Data Set empirical simple model Almost no emails exchanged when no friends in common more conservative simple model Figure from http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch04.pdf

Triadic Closure in Email Data Set Significant increase but on much smaller subpopulation Significant increase going from 1 to 2 friends Figure from http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch04.pdf

Triadic Closure in Email Data Set Baseline model: probability p of forming link when k friends in common Tbase(k) = 1 – (1 – p)k p=probability of sending email, each k relationship is independent (not completely realistic) Tbase(k) = 1 – (1 – p)k-1 Figure from http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch04.pdf

Email Social Network To evaluate focal closure, Kossinets and Watts obtained class schedules for 22,000 students Created social-affiliation network where classes are foci Determined probability of link formation as function of number of shared foci ENG 101 Sue Bill HIST 202

Focal Closure in Email Data Set Tbase(k) = 1 – (1 – p)k Sharing a class has nearly same effect as sharing a friend Diminishing returns (if we share 3+ classes and we're still not friends, I just don't like you) Figure from http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch04.pdf

Measuring Membership Closure Backstrom et al. (2006) created social-affiliation network for LiveJournal Friendships designated by users in their profile Foci are membership in user-defined communities Sue Gary Jazz Comm Jill

LiveJournal User Profile Example http://jazzfish.livejournal.com/profile

Membership Closure in LiveJournal Data Set Diminishing returns Significant increase going from 1 to 2 Figure from http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch04.pdf

Measuring Membership Closure Crandall et al. (2008) created social-affiliation network for Wikipedia Node for editors maintaining a user talk page Link if two editors have communicated using talk page(s) Foci are articles edited by editors Sue Gary Jazz Article Jill

Wikipedia User Talk Pages

Membership Closure in Wikipedia Data Set T(k) Diminishing returns Significant increase going from 1 to 2 Figure from http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch04.pdf

Selection and Social Influence Further examination of Wikipedia data set to see evidence of homophily produced by selection and social influence How do we measure similarity of two editors? number of articles edited by both editors number of articles edited by at least one editor Similar to neighborhood overlap of two editors in an affiliation network of editors and articles

Selection and Social Influence Pairs of Wikipedia editors who have communicated are significantly more similar in behavior than editors who have never communicated Does homophily arise because… editors are talking with editors who have edited the same article? (selection) editors are led to the articles by those they talk to? (social influence)

+ = since averaged over many pairs, tells us more than anecdotes - = does not inform us re: any single pair Figure from http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch04.pdf

Segregation: Homophily in Action Cities often divided into homogeneous neighborhoods based on ethnic and racial lines Image: http://wideurbanworld.blogspot.com/2011/08/race-ethnicity-social-class-are-most.html

High density of African-Americans Low density of African-Americans Figure from http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch04.pdf

Closer To Home… http://www.city-data.com/neighborhood/Colonial-Place-Norfolk-VA.html

Schelling Model of Segregation Thomas Schelling introduced simple model in 1970s that shows how global patterns of self-chosen segregation can occur due to the effects of homophily at the local level Population formed by two types of individuals called agents (X and O) Two types of agents represent immutable characteristic like race, ethnicity, country of origin, or native language

Agents placed randomly in grid X O Agents are satisfied with their location in the grid if at least t of their neighbors are agents of the same type. t = threshold

In each round, first identify all unsatisfied agents X O Assume t is 3 – all agents prefer to have at least 3 neighboring agents that are of the same type

Unsatisfied because only 2 neighbors are X Assume t is 3 – all agents prefer to have at least 3 neighboring agents that are of the same type

Unsatisfied because only 2 neighbors are X Assume t is 3 – all agents prefer to have at least 3 neighboring agents that are of the same type

Satisfied because 3 neighbors are O X* O X Assume t is 3 – all agents prefer to have at least 3 neighboring agents that are of the same type

Satisfied because 4 neighbors are X Assume t is 3 – all agents prefer to have at least 3 neighboring agents that are of the same type

Unsatisfied X* O X Assume t is 3 – all agents prefer to have at least 3 neighboring agents that are of the same type

Satisfied X* O X O* Assume t is 3 – all agents prefer to have at least 3 neighboring agents that are of the same type

Unsatisfied X* O X O* Assume t is 3 – all agents prefer to have at least 3 neighboring agents that are of the same type

Etc… Here's the final result where unsatisfied agents marked with * X* O X O* Now all unsatisfied agents move to a location where they are satisfied

O X O* Note that the grid is now more segregated than before Previously satisfied agent no longer satisfied! O X O* Sometimes an agent can't find a new location that will satisfy – leave alone or move to random location

Model Variations Schedule agents to move in random order or in sweeping motion Agents move to nearest location that satisfies or random one Threshold could be percentage or vary among or within population groups World could wrap (left edge meets up with right edge) Many other variations, but results usually end up the same: self-imposed segregation!

Schelling Simulators Check out some of these simulators: Frank McCown http://nifty.stanford.edu/2014/mccown-schelling-model-segregation/ Uri Wilensky's NetLogo Segregation model http://ccl.northwestern.edu/netlogo/models/Segregation Raj Singh http://web.mit.edu/rajsingh/www/lab/alife/schelling.html Sean Luke http://cs.gmu.edu/~eclab/projects/mason/projects/schelling/

Luke's Schelling Model using t = 3 Randomly placed After 20 rounds

Conclusion Schelling model is an example of how immutable characteristics can become highly correlated with mutable characteristics Choice of where to live (mutable) over time conforms to agents' type (immutable) producing segregation (homophily)