Social Network Analysis with Apache Spark and Neo4J

Presentation transcript:

Social Network Analysis with Apache Spark and Neo4J. Charles Copley, Nathan Begbie, Eli Copley

OVERVIEW: Introduction to social network concepts; Workshop data & data handling; Applied visualisation and network computations. By the end of the workshop, participants will have the basic skills needed to learn to use Apache Spark with Neo4j for social network analysis.

01 Introduction to Social Networks: Introduction to Concepts & Terminology Used in Social Network Analysis

SOCIAL NETWORK ANALYSIS: Levels of Analysis
→ Individuals affect other individuals
→ Individual behaviours and decisions determine network structures and dynamics
→ Network properties and an individual’s network location affect individual behaviour
→ Network structures, dynamics and evolution mechanisms at time 1 affect network dynamics and structures at time 2

SOCIAL NETWORK CONCEPTS & TERMINOLOGY: Figure labels: Isolates, Node (degree = 4), Component, Edge
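The terms on this slide can be made concrete with a small sketch. The toy network below is hypothetical (it is not the workshop data), and the helper names are illustrative only. A node's degree is its number of ties, an isolate is a node with no ties, and a component is a set of nodes that can all reach one another.

```python
def build_adjacency(nodes, edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degree(adjacency, node):
    """Number of edges attached to a node."""
    return len(adjacency[node])


def isolates(adjacency):
    """Nodes that have no edges at all."""
    return sorted(n for n, nbrs in adjacency.items() if not nbrs)


def components(adjacency):
    """Connected components, found with an iterative depth-first search."""
    seen = set()
    result = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nbr in adjacency[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        result.append(component)
    return result


# Hypothetical toy network: "a" has four ties, "h" has none.
nodes = ["a", "b", "c", "d", "e", "f", "g", "h"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("f", "g")]
adjacency = build_adjacency(nodes, edges)

print(degree(adjacency, "a"))   # degree of node "a"
print(isolates(adjacency))      # nodes without ties
print(components(adjacency))    # groups of mutually reachable nodes
```

In this sketch an isolate counts as a component of size one, which is a common convention but not the only one.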

SOCIAL NETWORK CONCEPTS & TERMINOLOGY: Homophily. Birds of a feather flock together. Image from Moody, J. (2004)
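One rough way to look for homophily in data is to count how many ties join people who share an attribute. The sketch below assumes a hypothetical set of friendships and a hypothetical grade for each person; the function name is illustrative.

```python
def same_attribute_share(edges, attribute):
    """Fraction of edges whose two endpoints share the same attribute value."""
    if not edges:
        return 0.0
    same = sum(1 for a, b in edges if attribute[a] == attribute[b])
    return same / len(edges)


# Hypothetical pupils and their school grades.
grade = {"ann": 7, "bob": 7, "cat": 8, "dan": 8, "eve": 9}
friendships = [("ann", "bob"), ("cat", "dan"), ("bob", "cat"), ("dan", "eve")]

share = same_attribute_share(friendships, grade)
print(share)
```

A raw share says little on its own. It needs a baseline, such as the share expected if ties formed at random, before it can be read as evidence of homophily.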

Sourced by Ambika Samarthya-Howard, Praekelt.Org

SOCIAL NETWORK CONCEPTS & TERMINOLOGY: Influence and Selection. We influence and are influenced by the people we are connected to; but we also select those who are similar to us.

SOCIAL NETWORK CONCEPTS & TERMINOLOGY: Triadic Closure. Figure label: Triad

SOCIAL NETWORK CONCEPTS & TERMINOLOGY: How connected are your friends? Figure labels: Clustering Coefficient 1/3, Clustering Coefficient 2/3, Clustering Coefficient 3/3
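The three values on this slide are consistent with the usual local clustering coefficient: the number of ties among a node's friends divided by the number of ties that could exist among them. The sketch below assumes that definition and a hypothetical ego network with three friends, so there are three possible ties among the friends.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Ties among a node's neighbours divided by the possible ties among them."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair to connect
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
    return links / (k * (k - 1) / 2)


def make_ego_network(friend_ties):
    """Hypothetical ego "me" tied to friends x, y, z, plus the given ties among friends."""
    adjacency = {"me": {"x", "y", "z"}, "x": {"me"}, "y": {"me"}, "z": {"me"}}
    for a, b in friend_ties:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


one_tie = local_clustering(make_ego_network([("x", "y")]), "me")
two_ties = local_clustering(make_ego_network([("x", "y"), ("y", "z")]), "me")
all_ties = local_clustering(
    make_ego_network([("x", "y"), ("y", "z"), ("x", "z")]), "me"
)
print(one_tie, two_ties, all_ties)  # 1/3, 2/3 and 3/3 of the possible ties
```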

PageRank: Your influence is determined by the influence of the people you are connected to. Your influence is passed on to the people that you link to. Then you iterate, many times. Figure labels: PR=1.35, PR=1.35, PR=0.15
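The iteration described here can be sketched in a few lines. This assumes the classic non-normalised formulation, PR(n) = (1 - d) + d × the sum of PR(m) / outdegree(m) over every m that links to n, with damping factor d = 0.85. The three-node graph is hypothetical and is not claimed to be the one drawn on the slide.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Iterative PageRank on a dict {node: list of nodes it links to}.

    Every link target must also appear as a key of out_links.
    """
    rank = {node: 1.0 for node in out_links}
    for _ in range(iterations):
        # Every node starts each round with the baseline (1 - d).
        new_rank = {node: 1.0 - damping for node in out_links}
        for source, targets in out_links.items():
            if not targets:
                continue  # simplification: a node with no out-links passes nothing on
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: a and b link to each other, c links to a, nobody links to c.
links = {"a": ["b"], "b": ["a"], "c": ["a"]}
ranks = pagerank(links)
for node, value in sorted(ranks.items()):
    print(node, round(value, 3))
```

Under this formulation a node that nobody links to, such as c here, settles at 1 - d = 0.15. On large graphs this loop is the part that would normally be handed to a distributed engine rather than run in plain Python.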

02 Workshop Data: Why and how we use specific tools to handle large network datasets

DATASET: US National Longitudinal Study of Adolescent Health (Add Health). Longitudinal study of a nationally representative sample of adolescents in grades 7-12 in the United States during the 1994-95 school year. Includes Race, Gender and Grade. See: http://www.cpc.unc.edu/projects/addhealth Reference: Goodreau, Handcock, Hunter, Butts and Morris, A Statnet Tutorial, Journal of Statistical Software, February 2008, Volume 24. https://www.jstatsoft.org/article/view/v024i09

DATA HANDLING: Raw Data → Distributed Computation → Graph Database
Raw Data: holds your primary data (could also be in a database).
Distributed Computation: first import the data into Spark for data handling, formatting and calculation.
Graph Database: then move the data into Neo4j, which allows you to query relationship patterns and conduct SNA.
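The three stages can be illustrated with a stand-in that runs on the standard library alone. Spark and Neo4j are deliberately left out: the cleaning function stands in for the formatting step that Spark would perform at scale, and the Cypher text is what a Neo4j client could be given with the cleaned rows as the `rows` parameter. The column names, the `Person` label and the `id` property are assumptions made for illustration; only the `knows` relationship type comes from the workshop query.

```python
import csv
import io

# Stage 1: raw data (hypothetical edge list with a duplicate pair and a self-loop).
RAW_EDGES = """source,target
1,2
2,1
2,3
3,3
"""


def clean_edges(raw_text):
    """Stage 2 stand-in: parse, drop self-loops and duplicate undirected pairs."""
    reader = csv.DictReader(io.StringIO(raw_text))
    seen = set()
    rows = []
    for record in reader:
        src = int(record["source"])
        dst = int(record["target"])
        if src == dst:
            continue  # a person "knowing" themselves carries no information
        key = (min(src, dst), max(src, dst))
        if key in seen:
            continue  # the same undirected tie was already recorded
        seen.add(key)
        rows.append({"src": key[0], "dst": key[1]})
    return rows


# Stage 3: Cypher that creates nodes and ties from the cleaned rows.
# Person and id are assumed names, not taken from the workshop material.
LOAD_QUERY = (
    "UNWIND $rows AS row "
    "MERGE (a:Person {id: row.src}) "
    "MERGE (b:Person {id: row.dst}) "
    "MERGE (a)-[:knows]->(b)"
)

rows = clean_edges(RAW_EDGES)
print(rows)
print(LOAD_QUERY)
```

Using MERGE rather than CREATE keeps the load idempotent, so running it twice does not duplicate people or ties.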

03 Data Practical: Visualising network data and computing basic metrics

DATA PRACTICAL: A recommender system could consist of searching for people connected to your friends, e.g. via LinkedIn: Person 1 knows Person 2 → Person 2 knows Person 3.
MATCH (p1)-[r1:knows]-(p2), (p1)-[r2:knows]-(p3), (p3)-[r3:knows]-(p2) RETURN p1, p2, p3, r1, r2 LIMIT 10
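The query on this slide matches triads in which all three people already know each other. A recommender looks for the open case instead: a friend of a friend whom you do not yet know. The sketch below shows that logic in plain Python on a hypothetical `knows` network, ranking candidates by the number of mutual friends. The ranking rule is an assumption for illustration, not something stated in the workshop.

```python
from collections import Counter


def recommend(adjacency, person, limit=10):
    """Friends of friends who are not yet friends, ranked by mutual friends."""
    friends = adjacency[person]
    mutual = Counter()
    for friend in friends:
        for candidate in adjacency[friend]:
            if candidate != person and candidate not in friends:
                mutual[candidate] += 1  # one more friend in common
    # Most mutual friends first; ties broken alphabetically for stable output.
    ranked = sorted(mutual.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical undirected "knows" network.
knows = {
    "p1": {"p2", "p4"},
    "p2": {"p1", "p3", "p5"},
    "p3": {"p2", "p4"},
    "p4": {"p1", "p3"},
    "p5": {"p2"},
}

suggestions = recommend(knows, "p1")
print(suggestions)
```

Each suggestion is a pair of the candidate and the number of friends shared with the person asking.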

Thank you! Any questions? charles@praekelt.org nathan@praekelt.org eli@praekelt.org

More Reading
Social Network Analysis with Big Data: Charles Copley, Head of Data Science at Praekelt. https://medium.com/mobileforgood/social-network-analysis-using-apache-spark-and-neo4j-1ccba3c8af9a
Homophily and Influence: Sinan Aral (2013) What would Ashton Do? Harvard Business Review. On how homophily and social location impact our choices. https://hbr.org/2013/05/what-would-ashton-do-and-does-it-matter
Weak Ties, Social Capital: Granovetter, M. S. (1973) The Strength of Weak Ties. American Journal of Sociology, 78(6), 1360-1380.