School of Information University of Michigan 1 Information diffusion in online communities Lada Adamic ICOS Sept. 29, 2006.

Slides:



Advertisements
Similar presentations
Mobile Communication Networks Vahid Mirjalili Department of Mechanical Engineering Department of Biochemistry & Molecular Biology.
Advertisements

photo credit: dbarefootdbarefoot Social Media is Consumer generated media It is media that is designed to be shared, sharing means that it is easy to.
LEARNING INFLUENCE PROBABILITIES IN SOCIAL NETWORKS Amit Goyal Francesco Bonchi Laks V. S. Lakshmanan University of British Columbia Yahoo! Research University.
COMP 621U Week 3 Social Influence and Information Diffusion
Combining the Power of Social Media and Online Advertising for Effective Web-Based Promotions APTA Marketing & Communications Workshop February 24, 2014.
Spread of Influence through a Social Network Adapted from :
Managing Online Identity References: Prepared for Trustworthy Computing Group, Microsoft by Cross-Tab Marketing Services.
School of Computer Science Carnegie Mellon University 1 The dynamics of viral marketing Jure Leskovec, Carnegie Mellon University Lada Adamic, University.
American Teens & Online Safety: What the research is telling us… Amanda Lenhart Family Online Safety Institute December 6, 2007 Washington, DC.
Patterns of Influence in a Recommendation Network Jure Leskovec, CMU Ajit Singh, CMU Jon Kleinberg, Cornell School of Computer Science Carnegie Mellon.
For more information, contact: Jordan Losen President, VeraQuest, Inc. Ph: Prepared by: VeraQuest, Inc.
February 6, Background: Where We Are The Internet is changing the way Americans obtain news and information 55 million blogs Explosion of social.
What Do People Do Online? Implications For the Future of Media Cindy Royal Assistant Professor School of Journalism and Mass Communication Texas State.
Empirical analysis of social recommendation systems Review of paper by Ophir Gaathon Analysis of Social Information Networks COMS , Spring 2011,
School of Information University of Michigan 1 The dynamics of viral marketing Lada Adamic MOCHI November 9, 2005.
Hearst digital: We Know Women Online. Online Survey Ran 7 th July to 6 th August 40 questions across 5 key insight areas Sample 4566 Methodology Cosmopolitan.
Electronic communication and social networks. 3 Questions: Does the internet weaken community? Because people replace in-person relationship with time.
Data dump: A closer look at online social activity Lada Adamic School of Information, University of Michigan, Ann Arbor online social networking sites.
Technology and Community Group 3 Additional Reading Jody Chatalas.
Understanding students online IAB and The Student Room Research April 2011.
The political blogosphere and the 2004 election:
Recommender systems Ram Akella February 23, 2011 Lecture 6b, i290 & 280I University of California at Berkeley Silicon Valley Center/SC.
Social Networks: Advertising, Pricing and All That Zvi Topol & Itai Yarom.
1 I256: Applied Natural Language Processing Marti Hearst Nov 8, 2006.
Recommender systems Ram Akella November 26 th 2008.
 Digital marketing: Uses digital media to develop communications and exchanges with customers  Electronic media (E-marketing): Refers to the strategic.
Online PR Srba Jovanović International Public Relations Association – IPRA Board member.
Brand Engagement Study - Retail. Brand Engagement Studies To demonstrate the ability of internet advertising to drive engagement To measure the effects.
Models of Influence in Online Social Networks
Page 1 Walk Through the 2011 Recreational Boating Statistical Abstract Section 1 Boating Population 1.Participation Estimates Participation Study.
Journalist and reader interaction in local British newspapers online: initial case study data.
Digital Content & Users: Patterns & Impacts OECD Workshop on “The Economic and Social Impacts of Broadband Communications” John B. Horrigan Associate Director.
The Unique Value of Advertising in Local TV Broadcast News
Online Marketing Gay, Charlesworth & Esen Chapter Five.
Web 2.0 for Businesses How You Can Use Social Media to Bring in Money & Promote Your Brand Kimberly L. Sanberg Director of Online Strategy, Ignitus presentation.
Social Networking. What We’ll Cover What is social networking? Examples Stats Metrics Convincing your boss Step-By-Step Guide.
Page 1Marketing Module David F. Miller Center for Retailing Education and Research Retail Marketing Management 3. Retail Communication.
Teens, Online Stranger Contact and Cyberbullying What the research is telling us… Amanda Lenhart Internet Safety Task Force April 30, 2008 Washington,
Suggesting Friends using the Implicit Social Graph Maayan Roth et al. (Google, Inc., Israel R&D Center) KDD’10 Hyewon Lim 1 Oct 2014.
The broadband difference Lee Rainie – Director Capital Cabal, Washington, D.C. June 27, 2002.
Using Transactional Information to Predict Link Strength in Online Social Networks Indika Kahanda and Jennifer Neville Purdue University.
Blogging Information Technology and Social Life April 11, 2005.
Author: Sali Allister Date: 21/06/2011 COASTAL Google Analytics Report March 2011 – June /03/2011 – 08/06/11.
>  Slide 1 Coaching Insights Coaching statistics and analysis 2015/16.
Generating and Tracking Communities Based on Implicit Affinities Matthew Smith – BYU Data Mining Lab April 2007.
The Dynamics of Viral Marketing Jure Leskovec Lada Adamic Bernardo A. Huberman Stanford University University of MichiganHP Labs Presented by Leman Akoglu.
To Blog or Not to Blog: Characterizing and Predicting Retention in Community Blogs Imrul Kayes 1, Xiang Zuo 1, Da Wang 2, Jacob Chakareski 3 1 University.
BMW Creative Benchmarking May About Newspaper Creative Benchmarking.
School of Computer Science Carnegie Mellon University 1 The dynamics of viral marketing Jure Leskovec, Carnegie Mellon University Lada Adamic, University.
Campaign s By Alyssa Ray. Forms of Interactivity 1. One way communication (e.g. web pages). 2. Two way communication (e.g. private s). 1.
The American Christian Bible Study Market Research undertaken by Phil Prior on the behalf of Focus Radio, publishers of Facing the Challenge Bible Studies,
Using Social Media for Fundraising and Communication with Supporters Lindsay Boyle – Communications & Research Coordinator Claire Chapman – Information.
NTUST IM AHP Case Study 2 Identifying key factors affecting consumers' choice of wealth management services: An AHP approach.
Teens, Social Networks & Safety An Overview Amanda Lenhart Family Online Safety Institute Launch February 13, 2007 Washington, DC.
Politics and Social media: The Political Blogosphere and the 2004 U.S. election: Divided They Blog Crystal: Analyzing Predictive Opinions on the Web Swapna.
The Main Streaming of Online Life The Pew Internet and American Life Project Unit 1-1 Managing the Digital Enterprise By Professor Michael Rappa.
With each device or application that expands the bandwidth of available information, the computer ’ s understanding of us remains unchanged.
Shoveling tweets: An analysis of the microblogging engagement of traditional news organizations Marcus Messner Maureen Linke Asriel Eford School of Mass.
1 Friends and Neighbors on the Web Presentation for Web Information Retrieval Bruno Lepri.
The O.C. The OC and its official website Connection between TV show and website.
1 Patterns of Cascading Behavior in Large Blog Graphs Jure Leskoves, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst SDM 2007 Date:2008/8/21.
How Chapters Can use Social Media Mark Storace Sacramento Chapter March 2013.
Audience Profiling with Personae and Use-Case Scenarios User Scenarios combine User Personas/Personae with User Tasks remember.
The Spread of Media Content through the Blogosphere
What Stops Social Epidemics?
A Strong Brick-and-Mortar Sector
Kevin Wallsten Department of Political Science
Decision Based Models of Cascades
The dynamics of viral marketing
Technology and Community
Presentation transcript:

School of Information University of Michigan 1 Information diffusion in online communities Lada Adamic ICOS Sept. 29, 2006

2 Talk outline discussions in the political blogosphere online person-to-person product recommendations

3 joint work with Natalie Nielsen/Buzzmetrics The political blogosphere and the 2004 election: Divided they blog

4 Political blogs gaining in importance Pew Internet & American Life Project Report, January 2005, reports: 63 million U.S. citizens use the Internet to stay informed about politics (mid-2004, Pew Internet Study) 9% of Internet users read political blogs preceding the 2004 U.S. Presidential Election 2004 Presidential Campaign Firsts Candidate blogs: e.g. Dean’s blogforamerica.com Successful grassroots campaign conducted via websites & blogs Bloggers credentialed as journalists & invited to nominating conventions

5 Related research on political blogs 10 most popular political blogs account for half the blogs read by surveyed journalists (Drezner and Farrell 2004) The most popular blogs also receive the majority of citation links (Shirky 2003). Citation link structure reveals topical subcommunities: Catholicism, homeschooling, A-list bloggers (Herring et. al. 2005) Comparison of network neighborhoods of Atrios and Instapundit: no overlap in linking behavior (Welsch 2005) Research question: Are we witnessing cyberbalkanization of the Internet?

6 Collected self-identified liberal and conservative blogs from online directories (eTalkingHead, BlogCatalog, CampaignLine, Blogorama) Crawled home page of each blog in February 2005: found 30 more well-cited political blogs (manually categorized) biases toward sidebar/blogroll links Did not include libertarian, independent or moderate blogs (fewer in number and lesser in popularity) Identified: 676 liberal and 659 conservative blogs Calling all political blogs

7 The larger political blogosphere Results 91% of links point to blog of same persuasion Conservative blogs show greater tendency to link 82% of conservative blogs linked to at least once; 84% link to at least one other blog 67% of liberal blogs are linked to at least once; 74% link to at least one other blog Average # of links per blog is similar: 13.6 for liberal; 15.1 for conservative Higher proportion of liberal blogs that are not linked to at all

8 Indegree distributions for political blogs

9 Different rankings produce similar A-lists

10 Top 20 liberal blogs

11 Top 20 conservative blogs

12 Harvested posts for top 20 lists from BlogPulse BlogPulse stores individual posts: date, permalink, and content Date range: late August 2004 –> mid-November 2004 Collected: 12,470 liberal posts; 10,414 conservative posts Identifying citation links (weblog post -> blog OR post) For each post, extract all links (hrefs) Exclude self-links Blogroll/sidebar links not included 1511 L-L citations; 2110 R-R citations; 247 L-R; 312 R-L Result: Conservatives had 16% fewer posts but cited each other 40% more often Methodology for detailed study of A-list blogs

13 A)all citations between A-list blogs in 2 months preceding the 2004 election B)citations between A-list blogs with at least 5 citations in both directions C)edges further limited to those exceeding 25 combined citations Citations between blogs in their posts (Aug 29 th – Nov 15 th, 2004) only 15% of the citations bridge communities

14 21 JawaReport 22 Vodka Pundit 23 Roger L Simon 24 Tim Blair 25 Andrew Sullivan 26 Instapundit 27 Blogs for Bush 28 LittleGreenFootballs 29 Belmont Club 30 Captain’s Quarters 31 Powerline 32 Hugh Hewitt 33 INDC journal 34 Real Clear Politics 35 Winds of Change 36 Allahpundit 37 Michelle Malkin 38 Wizbang 39 Dean’s World 40 Volokh 1 Digby’s Blog 2 James Walcott 3 Pandagon 4 blog.johnkerry.com 5 Oliver Willis 6 America Blog 7 Crooked Timber 8 Daily Kos 9 American Prospect 10 Eschaton 11 Wonkette 12 Talk Left 13 Political Wire 14 Talking Points Memo 15 Matthew Yglesias 16 Washington Monthly 17 MyDD 18 Juan Cole 19 Left Coaster 20 Bradford DeLong

15 Notable examples of blogs breaking a story 1. Swiftvets.com anti-Kerry video Bloggers linked to this in late July, keeping accusations alive Kerry responded in late August, bringing mainstream media coverage 2. CBS memos alleging preferential treatment of Pres. Bush during the Vietnam War Powerline broke the story on Sep. 9 th, launching flurry of discussion Dan Rather apologized later in the month 3. “Was Bush Wired?” Salon.com asked the question first on Oct. 8 th, echoed by Wonkette & PoliticalWire.com MSM follows-up the next day

16 Discussion of “forged documents” Liberals and conservatives differ in the topics they discuss

17 Political blogs as echo chambers Pairwise comparison of URLs and phrases posted by each blog v A = w U1 w U2 … w UN tf*idf weight~ (number of times blog mentions URL)* log[(total number of blogs monitored by blogpulse)/(number of those blogs citing the URL)] Similarity of two blogs is given by the cosine of their vectors cos(A,B) = v A.v B /(||v A ||*||v B ||) Similarity in URLs between blogs of the same persuasion was higher (0.08 for liberal blogs and 0.09 for conservative ones), than between mixed pairs (0.03) Same trend for phrases. We can even invert the analysis, and see what phrases are similar…

18 Network of phrases found on the same blogs

19 Political figures being discussed 59% of the mentions of Kerry are by right leaning blogs 53% of the mentions of Bush are by left leaning blogs

20 Mainstream media bias (links from 1,400 blog set)

21 Mainstream media cited about once every other post from the A-list bloggers (6,762 times from the left, 6,364 from the right)

22 Insights from the political blogosphere Liberal and conservative blogs are balanced in numbers and tend to link primarily to their own communities But is 10% cross-linking really too little? Or is it a sign of significant discussion? Conservative blogs are more likely to include links to other blogs on their pages, and their A-list blogs reference one another more frequently Liberal and conservative blogs tend to discuss different things, but one is not more ‘coherent’ than the other

23 Trying to bridge the divide Opposition to the bankruptcy bill (March 2005) conservative blog post liberal blog post uncategorized blog post news article government website link between posts/pages posts/pages belonging to same blog/site but, bill was passed nevertheless: Senate , House

School of Information University of Michigan 24 The dynamics of viral marketing Jure Leskovec, Carnegie Mellon University Lada Adamic, University of Michigan Bernardo Huberman, HP Labs

25 Using online networks for viral marketing Burger King’s subservient chicken

26 Outline prior work on viral marketing & information diffusion incentivised viral marketing program cascades and stars network effects product and social network characteristics The dynamics of Viral Marketing

27 Information diffusion Studies of innovation adoption hybrid corn (Ryan and Gross, 1943) prescription drugs (Coleman et al. 1957) Models (very many) Rogers, ‘Diffusion of Innovations’ Watts, Information cascades (2003) Kempe, Kleinberg, Tardos, Maximizing the spread of Influence, (2005)

28 Motivation for viral marketing viral marketing successfully utilizes social networks for adoption of some services hotmail gains 18 million users in 12 months, spending only $50,000 on traditional advertising gmail rapidly gains users although referrals are the only way to sign up customers becoming less susceptible to mass marketing mass marketing impractical for unprecedented variety of products online

29 The web savvy consumer and personalized recommendations > 50% of people do research online before purchasing electronics personalized recommendations based on prior purchase patterns and ratings Amazon, “people who bought x also bought y” MovieLens, “based on ratings of users like you…” Is there still room for viral marketing?

30 Is there still room for viral marketing next to personalized recommendations? We are more influenced by our friends than strangers  68% of consumers consult friends and family before purchasing home electronics (Burke 2003)

31 Incentivised viral marketing (our problem setting) Senders and followers of recommendations receive discounts on products 10% credit10% off Recommendations are made to any number of people at the time of purchase Only the recipient who buys first gets a discount

32 Product recommendation network purchase following a recommendation customer recommending a product customer not buying a recommended product

33 the data large anonymous online retailer (June 2001 to May 2003) 15,646,121 recommendations 3,943,084 distinct customers 548,523 products recommended Products belonging to 4 product groups: books DVDs music VHS

34 summary statistics by product group productscustomersrecommenda- tions edgesbuy + get discount buy + no discount Book103,1612,863,9775,741,6112,097,80965,34417,769 DVD19,829805,2858,180,393962,341 17,232 58,189 Music393,598794,1481,443,847585,7387,8372,739 Video26,131239,583280,270160, Full542,7193,943,08415,646,1213,153,67691,32279,164 high low people recommendations

35 viral marketing program not spreading virally 94% of users make first recommendation without having received one previously size of giant connected component increases from 1% to 2.5% of the network (100,420 users) – small! some sub-communities are better connected 24% out of 18,000 users for westerns on DVD 26% of 25,000 for classics on DVD 19% of 47,000 for anime (Japanese animated film) on DVD others are just as disconnected 3% of 180,000 home and gardening 2-7% for children’s and fitness DVDs

36 medical study guide recommendation network

37 measuring cascade sizes delete late recommendations count how many people are in a single cascade exclude nodes that did not buy steep drop-off very few large cascades books = 1.8e6 x -4.98

38 cascades for DVDs shallow drop off – fat tail a number of large cascades DVD cascades can grow large possibly as a result of websites where people sign up to exchange recommendations ~ x -1.56

39 simple model of propagating recommendations (ignoring for the moment the specific mechanics of the recommendation program of the retailer) Each individual will have p t successful recommendations. We model p t as a random variable. At time t+1, the total number of people in the cascade, N t+1 = N t * (1+p t ) Subtracting from both sides, and dividing by N t, we have

40 simple model of propagating recommendations (continued) Summing over long time periods The right hand side is a sum of random variables and hence normally distributed. Integrating both sides, we find that N is lognormally distributed if  large resembles power-law

41 participation level by individual very high variance The most active customer made 83,729 recommendations and purchased 4,416 different items!

42 Network effects

43 does receiving more recommendations increase the likelihood of buying? BOOKS DVDs

44 does sending more recommendations influence more purchases? BOOKS DVDs

45 the probability that the sender gets a credit with increasing numbers of recommendations consider whether sender has at least one successful recommendation controls for sender getting credit for purchase that resulted from others recommending the same product to the same person probability of receiving a credit levels off for DVDs

46 Multiple recommendations between two individuals weaken the impact of the bond on purchases BOOKS DVDs

47 product and social network characteristics influencing recommendation effectiveness

48 recommendation success by book category consider successful recommendations in terms of av. # senders of recommendations per book category av. # of recommendations accepted books overall have a 3% success rate (2% with discount, 1% without) lower than average success rate (significant at p=0.01 level) fiction romance (1.78), horror (1.81) teen (1.94), children’s books (2.06) comics (2.30), sci-fi (2.34), mystery and thrillers (2.40) nonfiction sports (2.26) home & garden (2.26) travel (2.39) higher than average success rate (statistically significant) professional & technical medicine (5.68) professional & technical (4.54) engineering (4.10), science (3.90), computers & internet (3.61) law (3.66), business & investing (3.62)

49 professional and organized contexts In general, professional & technical book recommendations are more often accepted (probably in part due to book cost) Some organized contexts other than professional also have higher success rate, e.g. religion overall success rate 3.13% Christian themed books Christian living and theology (4.7%) Bibles (4.8%) not-as-organized religion new age (2.5%) occult spirituality (2.2%) Well organized hobbies books on orchids recommended successfully twice as often as books on tomato growing

50 examples of communities use a community finding algorithm to identify groups of people sending recommendations to one another # nodes # senders topics _ books: American literature, poetry sci-fi books, TV series DVDs, alternative rock music music: dance, indie discounted DVDs books: art & photography, web development, graphical design, sci-fi books: sci-fi and other books: Christianity and Catholicism books: business and investing, computers, Harry Potter books: parenting, women’s health, pregnancy books: comparative religion, Egypt’s history, new age, role playing games

51

52 telling the world vs. telling your friends consider # reviewers per book # recommenders per book what we observe (ratio of recommendations to reviews given in parentheses) tell the world but not your friends literature & fiction (0.57) mystery & thrillers (0.36) horror (0.44) tell the world and your friends biographies (0.90) children’s books (1.12) religion (1.73) history (1.27) nonfiction (1.89) tell just your friends about personal pursuits health, mind & body (2.39) home & garden (3.48) arts & photography (3.85) cooking, food & wine (3.49) tell your colleagues about professional interests professional & technical (3.22) computers & internet (3.10) medicine (4.19) engineering (3.85) law (4.25)

53 anime DVDs 47,000 customers responsible for the 2.5 out of 16 million recommendations in the system 29% success rate per recommender of an anime DVD giant component covers 19% of the nodes Overall, recommendations for DVDs are more likely to result in a purchase (7%), but the anime community stands out

54 regressing on product characteristics VariabletransformationCoefficient const *** # recommendationsln(r)0.426 *** # sendersln(n s ) *** # recipientsln(n r ) *** product priceln(p)0.128 *** # reviewsln(v) *** avg. ratingln(t) * R2R significance at the 0.01 (***), 0.05 (**) and 0.1 (*) levels small tightly knit communities purchasing expensive products

55 Marketing in the long tail Chris Anderson, ‘The Long Tail’, Wired, Issue October 2004Issue 12.10

56 The long tail of successful recommendations 20% of the products account for 50% of the purchases 67 % of the products have only a single successful recommendation (30% of all purchases)

57 Conclusions Overall incentivized viral marketing contributes marginally to total sales occasionally large cascades occur Observations for future diffusion models purchase decision more complex than threshold or simple infection influence saturates as the number of contacts expands links lose effectiveness if they are overused Conditions for successful recommendations professional and organizational contexts discounts on expensive items small, tightly knit communities

58 Overall observations Interests and social connections co-evolve What we are exposed to depends on the active topics in our community Who we communicate with depends on our interests

59 Thanks!