Selected Topics in Data Networking

Selected Topics in Data Networking Explore Social Networks: Cohesion

Introduction to Cohesion Social Network Analysis: Investigating who is related and who is not. Why are some people or organizations related, whereas others are not? People who match on social characteristics will interact more often and people who interact regularly will foster a common attitude or identity.

Introduction to Cohesion Social interaction is the basis for solidarity, shared norms, identity, and collective behavior. People who interact intensively are likely to consider themselves a social group. We expect similar people to interact a lot, at least more often than with dissimilar people. This phenomenon is called homophily: love of the same (the tendency of individuals to associate and bond with similar others). Does the homophily principle work?

Introduction to Cohesion The example is a study in the Turrialba region, a rural area in Costa Rica (Latin America). The figure gives a visual impression of the kin visits network and the family–friendship groupings, which are identified by the colors and numbers within the vertices.

Meaning: Cohesion Cohesion means a social network contains many ties. Community cohesion refers to the aspect of togetherness and bonding exhibited by members of a community, the “glue” that holds a community together. More ties between people yield a tighter structure, which is more cohesive. The density of a network captures this idea. Source: https://digestiblepolitics.wordpress.com/2013/01/04/the-importance-of-community-cohesion-in-society-in-2013/

Meaning: Cohesion Review Multiple lines between vertices and higher line values indicate more cohesive ties. Density is the number of lines present divided by the number of lines possible. In an undirected network with n vertices, the number possible is n(n-1)/2 if self-loops are excluded and n(n+1)/2 if self-loops are allowed.
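
The density definition can be sketched in a few lines of Python. The counts n(n-1)/2 and n(n+1)/2 given above are for undirected lines; the directed counts n(n-1) and n*n are added here for completeness. The function names `possible_lines` and `density` are hypothetical, chosen only for this illustration.

```python
def possible_lines(n, directed=False, loops=False):
    """Number of lines that could exist among n vertices."""
    if directed:
        # each ordered pair may hold an arc; loops add n more
        return n * n if loops else n * (n - 1)
    # each unordered pair may hold an edge; loops add n more
    return n * (n + 1) // 2 if loops else n * (n - 1) // 2


def density(n, lines, directed=False, loops=False):
    """Density = lines present / lines possible."""
    possible = possible_lines(n, directed, loops)
    return lines / possible if possible else 0.0


print(possible_lines(5))              # 10
print(possible_lines(5, loops=True))  # 15
print(density(5, 4))                  # 0.4
```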

Cohesion: Indicated by Density In the kin visiting relation network, density is 0.045, which means that only 4.5 percent of all possible arcs are present. Density is inversely related to network size: the larger the social network, the lower the density, because the number of possible lines increases rapidly with the number of vertices, whereas the number of ties that each person can maintain is limited. This makes network density of limited use for comparing networks of different sizes. Discussion: why?

Cohesion: Indicated by Degree The degree of a vertex is the number of ties in which it is involved. Vertices with high degree are more likely to be found in dense sections of the network. Review (undirected graph): comparing density and degree.

Cohesion: Indicated by Degree Higher vertex degrees yield a denser network because vertices maintain more ties. The average degree of all vertices can therefore be used to measure the structural cohesion of a network. It is a better measure of overall cohesion than density because it does not depend on network size, so average degree can be compared between networks of different sizes.
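
A rough numerical illustration of why average degree travels across network sizes while density does not. It assumes simple undirected networks in which every vertex keeps exactly two ties (a ring), so the number of lines equals the number of vertices; the sizes are arbitrary and the function names are hypothetical.

```python
def average_degree(n, lines):
    """Average degree of an undirected network: every line adds 2 to the degree sum."""
    return 2 * lines / n


def density_undirected(n, lines):
    """Lines present divided by the n(n-1)/2 lines possible (no loops)."""
    return lines / (n * (n - 1) / 2)


# Ring networks: each vertex has 2 neighbors, so lines == n.
for n in (10, 100, 1000):
    lines = n
    print(n, average_degree(n, lines), round(density_undirected(n, lines), 4))
```

Average degree stays at 2.0 for every size, while density shrinks as the network grows even though each vertex maintains the same number of ties.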

Cohesion: Indicated by Degree Note (directed graph): the sum of the indegree and the outdegree of a vertex does not necessarily equal the number of its neighbors, because a neighbor linked by arcs in both directions is counted twice in the sum.
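
A minimal sketch of this note, using a made-up arc list in which vertices a and b choose each other. The arc list and the function name `degree_summary` are hypothetical.

```python
# Hypothetical directed network: a and b are linked in both directions.
arcs = [("a", "b"), ("b", "a"), ("a", "c")]


def degree_summary(arcs, v):
    """Return (indegree, outdegree, number of distinct neighbors) of v."""
    indegree = sum(1 for s, t in arcs if t == v)
    outdegree = sum(1 for s, t in arcs if s == v)
    neighbors = {t for s, t in arcs if s == v} | {s for s, t in arcs if t == v}
    neighbors.discard(v)  # a loop does not make a vertex its own neighbor
    return indegree, outdegree, len(neighbors)


print(degree_summary(arcs, "a"))  # (1, 2, 2)
```

For vertex a the indegree and outdegree add up to 3, but it has only 2 neighbors because b is counted once as sender and once as receiver.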

Cohesion: Indicated by Component Vertices with a degree of one or higher are connected to at least one neighbor, so they are not isolated. This does not mean that the whole network is connected: it may be cut up into pieces. Isolated sections of the network may be regarded as cohesive subgroups because the vertices within a section are connected, whereas vertices in different sections are not. The connected parts of a network are called components.

Cohesion: Indicated by Component A network can be divided into three regions: the “singletons,” who have no connections and are least central; the “giant component,” which is the largest group of nodes tightly connected to the central nodes and to each other; and the “middle region,” which consists of isolated groups that interact amongst themselves but not with the rest of the network, forming isolated stars. Source: http://boxesandarrows.com/social-networks-and-group-formation/

Cohesion: Indicated by Component A network is weakly connected if each pair of vertices is connected by a semipath. In a (weakly) connected network, we can “walk” from each vertex to all other vertices if we neglect the direction of the arcs.

Cohesion: Indicated by Component In directed networks, there is a second type of connectedness: a network is strongly connected if each pair of vertices is connected by a path. In a strongly connected network, we can travel from each vertex to any other vertex obeying the direction of the arcs.

Cohesion: Indicated by Component Strong connectedness is more restricted than weak connectedness: Each strongly connected network is also weakly connected but a weakly connected network is not necessarily strongly connected.

Cohesion: Indicated by Component Vertices v1, v3, v4, and v5 constitute a (weak) component because they are connected by semipaths and there is no other vertex in the network which is also connected to them by a semipath.

Cohesion: Indicated by Component A (weak) component is a maximal (weakly) connected subnetwork. A strong component is a maximal strongly connected subnetwork.

Cohesion: Indicated by Component The example network contains three strong components. The largest strong component is composed of vertices v3, v4, and v5, which are connected by paths in both directions.

Cohesion: Indicated by Component The other two strong components consist of one vertex each, namely vertices v1 and v2. Vertex v2 is isolated. Vertex v1 has paths leading from it but no paths leading to it, so it is not strongly connected to any other vertex; it is asymmetrically linked to the larger strong component.
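
The weak and strong components of the example can be reproduced with a short Python sketch. The figure is not part of the transcript, so the arc list below is an assumption chosen only to match the description (v1 points into a cycle v3, v4, v5, and v2 is isolated). The approach is a simple reachability intersection, not an optimized algorithm, and the function names are hypothetical.

```python
from collections import defaultdict


def reachable(adj, start):
    """Set of vertices reachable from start by following adj (start included)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def components(vertices, arcs, strong=False):
    """Weak components ignore arc direction; strong components obey it."""
    succ, pred = defaultdict(set), defaultdict(set)
    for s, t in arcs:
        succ[s].add(t)
        pred[t].add(s)
        if not strong:  # treat every arc as a two-way line
            succ[t].add(s)
            pred[s].add(t)
    result, assigned = [], set()
    for v in vertices:
        if v in assigned:
            continue
        # v's component: vertices it can reach that can also reach it
        comp = reachable(succ, v) & reachable(pred, v)
        result.append(sorted(comp))
        assigned |= comp
    return result


# Assumed arcs, consistent with the description of the example network.
vertices = ["v1", "v2", "v3", "v4", "v5"]
arcs = [("v1", "v3"), ("v3", "v4"), ("v4", "v5"), ("v5", "v3")]

print(components(vertices, arcs))               # weak components
print(components(vertices, arcs, strong=True))  # strong components
```

Under these assumed arcs the weak components are {v1, v3, v4, v5} and {v2}, and the strong components are {v1}, {v2}, and {v3, v4, v5}.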

Cohesion: Indicated by Component In an undirected network, lines have no direction, so each semiwalk is also a walk and each semipath is also a path. Components are isolated from one another: there are no lines between vertices of different components. This is similar to weak components in directed networks.

Cohesion: Indicated by Component Components can be split up further into denser parts by considering the number of distinct, that is, noncrossing, paths or semipaths that connect the vertices. Within a weak component, one semipath between each pair of vertices suffices, but there must be at least two different semipaths in a bi-component. More generally, k-connected components are maximal subnetworks in which each pair of vertices is connected by at least k distinct semipaths or paths.
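
A small check of the two-semipath condition, sketched in Python. It assumes an undirected network stored as a dictionary of neighbor sets, and it relies on the standard result that a connected network with at least three vertices has two distinct paths between every pair of vertices exactly when removing any single vertex leaves it connected. The sketch only tests a given network; it does not search for maximal bi-components. Both function names and both example networks are hypothetical.

```python
def is_connected(adj, keep):
    """True if the vertices in keep form one connected piece."""
    keep = set(keep)
    if not keep:
        return True
    start = next(iter(keep))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in keep and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == keep


def is_biconnected(adj):
    """True if no single vertex removal disconnects the network."""
    vertices = set(adj)
    if len(vertices) < 3 or not is_connected(adj, vertices):
        return False
    return all(is_connected(adj, vertices - {v}) for v in vertices)


triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
# Same triangle with a pendant vertex d attached to c.
with_tail = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}

print(is_biconnected(triangle))   # True
print(is_biconnected(with_tail))  # False
```

The pendant vertex d is tied to the rest by a single semipath through c, so the second network is a component but not a bi-component.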

Cohesion: Indicated by Core The distribution of degree reveals local concentrations of ties around individual vertices, but it does not tell us whether vertices with a high degree are clustered or scattered all over the network. We can use degree to identify clusters of vertices that are tightly connected because each vertex has a particular minimum degree within the cluster.

Cohesion: Indicated by Core Here we pay attention not to the degree of one vertex but to the degree of all vertices within a cluster. These clusters are called k-cores, where k indicates the minimum degree of each vertex within the core. Example: a 2-core contains all vertices that are connected by degree two or more to other vertices within the core.

Cohesion: Indicated by Core k-cores identify relatively dense subnetworks, so they help to find cohesive subgroups.

Cohesion: Indicated by Core In an undirected network, the degree of a vertex is equal to the number of its neighbors, so a k-core contains the vertices that have at least k neighbors within the core.

Cohesion: Indicated by Core All vertices belong to the 1-core, which is drawn in black. One vertex, v5, has only one neighbor, so it is not part of the 2-core. Vertex v6 has a degree of 2, so it does not belong to the 3-core. k-cores are nested: a vertex in a 3-core is also part of a 2-core, but not all members of a 2-core belong to a 3-core.

Cohesion: Indicated by Core Different cohesive subgroups within a k-core are usually connected by vertices that belong to lower cores. Vertex v6, which is part of the 2-core, connects the two segments of the 3-core. Eliminating the vertices that belong to cores below the 3-core, we obtain a network consisting of two components, which identify the cohesive subgroups within the 3-core.

Cohesion: Indicated by Core How do k-cores help to detect cohesive subgroups? Remove the lowest k-cores from the network until the network breaks up into relatively dense components. Each component is considered to be a cohesive subgroup because its vertices have at least k neighbors within the component.
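
The pruning procedure can be sketched in Python as follows. The slide figure is not available in the transcript, so the edge list is an assumption built only to match the description: two fully connected groups of four vertices, vertex v6 tied to one member of each group, and vertex v5 tied to a single vertex. All function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected network as a dictionary of neighbor sets."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def core_numbers(adj):
    """Highest k such that the vertex belongs to the k-core, by repeated pruning."""
    degree = {v: len(nb) for v, nb in adj.items()}
    remaining, core, k = set(adj), {}, 0
    while remaining:
        low = [v for v in remaining if degree[v] <= k]
        if not low:
            k += 1  # everything left has more than k neighbors
            continue
        for v in low:
            core[v] = k
            remaining.discard(v)
            for w in adj[v]:
                if w in remaining:
                    degree[w] -= 1
    return core


def components(adj, keep):
    """Connected pieces of the subnetwork induced by the vertices in keep."""
    keep, groups = set(keep), []
    while keep:
        start = keep.pop()
        group, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w in keep:
                    keep.discard(w)
                    group.add(w)
                    stack.append(w)
        groups.append(sorted(group))
    return groups


# Assumed edges, consistent with the description of the example.
edges = (list(combinations(["v1", "v2", "v3", "v4"], 2))
         + list(combinations(["v7", "v8", "v9", "v10"], 2))
         + [("v4", "v6"), ("v6", "v7"), ("v1", "v5")])
adj = build_adjacency(edges)
core = core_numbers(adj)
three_core = [v for v, k in core.items() if k >= 3]
subgroups = components(adj, three_core)

print(core["v5"], core["v6"], core["v1"])  # 1 2 3
print(len(subgroups))                      # 2
```

Dropping v5 (1-core only) and v6 (2-core only) leaves the 3-core, which falls apart into the two cohesive subgroups.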

Selected Topics in Data Networking Explore Social Networks: Prestige and Ranking

Introduction Prestige is conceptualized as a particular pattern of social ties. In directed networks, people who receive many positive choices are considered to be prestigious. For example, everybody likes to play with the most popular girl or boy in a group, but he or she does not play with all of them. Source: http://pursuitist.com/lady-gaga-rules-twitter/

Introduction Popularity and Indegree: Prestige When ties are associated with a positive aspect such as friendship or collaboration, indegree is often interpreted as a form of popularity and outdegree as gregariousness. For example, a prestigious art museum receives more attention from art critics than less prestigious ones. Source: http://en.wikipedia.org/wiki/Centrality

Introduction The simplest measure of structural prestige is called popularity, and it is measured by the number of choices a vertex receives: its indegree. In undirected networks, we cannot measure prestige; instead, we use degree as a simple measure of centrality.

Introduction We should note that in a network where outdegree rather than indegree signals prestige (for example, a relation such as “lend money to”), indegree does reflect prestige if we transpose the arcs, that is, if we reverse the direction of the arcs. It is interesting to note that several structural properties of a network do not change when the arcs are transposed.
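
A short Python sketch of popularity as indegree and of what transposing does. The “lend money to” arcs are invented for illustration, and the helper names are hypothetical.

```python
from collections import Counter


def indegree(arcs):
    """Number of arcs received by each vertex."""
    return Counter(t for _, t in arcs)


def outdegree(arcs):
    """Number of arcs sent by each vertex."""
    return Counter(s for s, _ in arcs)


def transpose(arcs):
    """Reverse the direction of every arc."""
    return [(t, s) for s, t in arcs]


# Hypothetical "lend money to" relation: the lender is the prestigious actor.
lends = [("bank", "ann"), ("bank", "bob"), ("bank", "cat"), ("ann", "bob")]

print(indegree(lends)["bank"])             # 0
print(indegree(transpose(lends))["bank"])  # 3
```

After transposing, the lender's indegree equals its original outdegree. The number of arcs, and therefore the density, is one of the properties that transposing leaves unchanged.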

Correlation Structural prestige scores can be compared with other characteristics of the actors, such as social status, by means of correlation coefficients, which range from -1 to 1. A positive coefficient indicates that a high score on one feature is associated with a high score on the other (e.g., high structural prestige occurs in families with high social status). A negative coefficient points toward a negative or inverse relation: a high score on one characteristic combines with a low score on the other (e.g., high structural prestige is found predominantly with low social status families).

Correlation The strength of association is read from the absolute value of the coefficient, and the sign gives the direction (positive or negative): below 0.05, there is no correlation; from 0.05 to 0.25, association is weak; from 0.25 to 0.60, association is moderate; from 0.60 to 1.00, association is strong; a coefficient of exactly 1 or −1 displays perfect association.
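
These rules of thumb can be written as a small Python helper. The slides do not say which correlation coefficient is meant, so Pearson's coefficient is assumed here; how a value that falls exactly on a boundary is labeled is also an assumption, since the stated ranges touch. The scores are made-up data and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long score lists (assumes non-constant scores)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def describe(r):
    """Verbal label for a coefficient, following the thresholds above."""
    size = abs(r)
    if size < 0.05:
        return "no correlation"
    if math.isclose(size, 1.0):
        strength = "perfect"
    elif size < 0.25:
        strength = "weak"
    elif size < 0.60:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} association"


# Hypothetical scores for five families.
prestige = [0, 1, 2, 4, 6]  # indegree
status = [1, 1, 2, 3, 5]    # social status

print(describe(pearson(prestige, status)))
print(describe(-0.30))
```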

Domains Popularity is a very restricted measure of prestige because it takes only direct choices into account. Indirect choices can be included by counting all other vertices that are connected by a path to an actor. This is the input domain of an actor, which has been called the influence domain because structurally prestigious people are thought to influence people who regard them as their leaders. The larger the input domain of a person, the higher his or her structural prestige. The output domain is more likely to reflect prestige in the case of a relation such as “lend money to”.
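
As an illustration, the input domain of a vertex can be collected by following arcs backwards from it. This is a minimal Python sketch that takes the input domain to be the set of other vertices from which the vertex can be reached by a path; the arc list and the function name are hypothetical.

```python
def input_domain(arcs, v):
    """Vertices (other than v) from which v can be reached by a path."""
    preds = {}
    for s, t in arcs:
        preds.setdefault(t, set()).add(s)
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for p in preds.get(u, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen - {v}


# Hypothetical choices: a chooses b, b chooses c, d chooses c.
arcs = [("a", "b"), ("b", "c"), ("d", "c")]

print(sorted(input_domain(arcs, "c")))  # ['a', 'b', 'd']
```

Vertex c receives only two direct choices (indegree 2), but its input domain has three members because a reaches it through b. The output domain can be obtained the same way after reversing every arc.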

Proximity Prestige We can limit the input domain to direct neighbors or to neighbors at maximum distance two, on the assumption that nominations by close neighbors are more important than nominations by distant neighbors. An indirect choice contributes less to prestige if it is mediated by a longer chain of intermediaries.

Proximity Prestige Proximity Prestige: This index of prestige considers all vertices within the input domain of a vertex but it attaches more importance to a nomination if it is expressed by a closer neighbor. A nomination by a close neighbor contributes more to the proximity prestige of an actor than a nomination by a distant neighbor, but many “distant nominations” may contribute as much as one “close nomination.”

Proximity Prestige To allow direct choices to contribute more to the prestige of a vertex than indirect choices, proximity prestige weights each choice by its path distance to the vertex. A higher distance yields a lower contribution to the proximity prestige of a vertex, but each choice contributes something.

Proximity Prestige The proximity prestige of a vertex is the proportion of all other vertices that are in its input domain divided by the mean distance from these vertices to it. A larger input domain (larger numerator) yields a higher proximity prestige because more vertices are choosing an actor directly or indirectly. A smaller average distance (smaller denominator) yields a higher proximity prestige score because there are more nominations by close neighbors. Maximum proximity prestige is achieved if a vertex is directly chosen by all other vertices: the proportion of vertices in the input domain is 1 and the mean distance from these vertices is 1, so proximity prestige is 1 divided by 1. Vertices with an empty input domain get minimum proximity prestige by definition, which is zero.

Proximity Prestige All vertices at the extremes of the network (v2, v4, v5, v6, and v10) have empty input domains, hence they have a proximity prestige score of zero. The input domain of vertex v9 contains vertex v10 only, so its input domain size is 1 out of 9 (0.11). The average distance within the input domain of vertex v9 is 1, so the proximity prestige of vertex v9 is 0.11 divided by 1 = 0.11. Vertex v1 has a maximal input domain (9 out of 9 = 1.00). Its average distance is (4+3+2+1+1+1+2+2+2)/9 = 2.0, so its proximity prestige amounts to 1.00 divided by 2.0, which is 0.5.
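
The worked example can be checked with a short Python sketch. The figure is not part of the transcript, so the arc list below is an assumption constructed to reproduce the stated numbers (the empty input domains, the single nomination of v9 by v10, and the nine distances to v1); the real example network may differ. The function name is hypothetical.

```python
from collections import deque


def proximity_prestige(vertices, arcs):
    """Proportion of other vertices in the input domain / mean distance from them."""
    preds = {v: set() for v in vertices}
    for s, t in arcs:
        preds[t].add(s)
    n = len(vertices)
    scores = {}
    for v in vertices:
        # Breadth-first search against arc direction gives shortest distances to v.
        dist = {v: 0}
        queue = deque([v])
        while queue:
            u = queue.popleft()
            for p in preds[u]:
                if p not in dist:
                    dist[p] = dist[u] + 1
                    queue.append(p)
        distances = [d for u, d in dist.items() if u != v]
        if not distances:
            scores[v] = 0.0  # empty input domain: zero by definition
            continue
        proportion = len(distances) / (n - 1)
        mean_distance = sum(distances) / len(distances)
        scores[v] = proportion / mean_distance
    return scores


# Assumed arcs, chosen to match the numbers quoted in the example.
vertices = [f"v{i}" for i in range(1, 11)]
arcs = [("v10", "v9"), ("v9", "v8"), ("v8", "v7"), ("v7", "v1"),
        ("v2", "v1"), ("v3", "v1"), ("v4", "v3"), ("v5", "v3"), ("v6", "v7")]

scores = proximity_prestige(vertices, arcs)
print(scores["v1"])            # 0.5
print(round(scores["v9"], 2))  # 0.11
print(scores["v2"])            # 0.0
```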

References Wouter de Nooy, Andrej Mrvar, and Vladimir Batagelj, Exploratory Social Network Analysis with Pajek, Cambridge