Selected Topics in Data Networking Explore Social Networks: Cliques and Sub-groups
Introduction Many approaches to understanding the structure of a network emphasize how dense connections are built up from simpler dyads and triads into more extended dense clusters such as "cliques." This view of social structure focuses attention on how the solidarity and connection of large social structures can be built up out of small, tight components: a sort of "bottom-up" approach.
Introduction Network analysts have developed a number of useful definitions and algorithms that identify how larger structures are compounded from smaller ones: cliques, n-cliques, n-clans, and k-plexes. The division of actors into groups and sub-structures can be a very important aspect of social structure, and it can be important in understanding how the network as a whole is likely to behave.
Introduction Suppose the actors in one network form two non-overlapping groups, while the actors in another network also form two groups but the memberships overlap (some people are members of both groups). Where the groups overlap, we might expect that conflict between them is less likely than when the groups don't overlap. Where the groups overlap, mobilization and diffusion may spread rapidly across the entire network; where the groups don't overlap, traits may occur in one group and not diffuse to the other. [Figure: two networks, each with groups G1 and G2]
Introduction Some people may act as "bridges" between groups (cosmopolitans, boundary spanners, or "brokers"). Others may have all of their relationships within a single group (locals or insiders). Some actors may be part of a tightly connected and closed elite, while others are completely isolated from this group. The ways in which individuals are embedded in the structure of groups within a network can have profound consequences for the ways that these actors see their "society," and the behaviors that they are likely to practice.
Introduction We can also look for sub-structure using a "top-down" approach. We can think of sub-structures as areas of the graph that seem to be locally dense but are separated, to some degree, from the rest of the graph. This idea has been applied in a number of ways: components, blocks/cut points, K-cores, Lambda sets and bridges, factions, and f-groups will be discussed here.
Introduction "Top-down" approach: The idea that some regions of a graph may be less connected to the whole than others may lead to insights into lines of cleavage and division. Weaker parts in the "social fabric" also create opportunities for brokerage and less constrained action. The numbers and sizes of regions, and their "connection topology," may be consequential for predicting both the opportunities and constraints facing groups and actors, as well as predicting the evolution of the graph itself.
Introduction Matrices that have very high density, almost by definition, are likely to have few distinctive sub-groups or cliques. A graph in which all vertices are adjacent to all others is said to be complete.
Introduction Actors #5 and #2 appear to be in the middle of the action, in the sense that they are members of many of the groupings and serve to connect them by co-membership. The connection of sub-graphs by actors can be an important feature. We can also see that there is one case (#6) that is not a member of any sub-group. Most of the sub-groupings are connected; that is, the groups overlap. How many groups do #5 and #2 belong to?
Introduction Answers to the main questions about a graph, in terms of its sub-structures, may be apparent from inspection: How separate are the sub-graphs? Do they overlap and share members, or do they divide or factionalize the network? How large are the connected sub-graphs? Are there a few big groups, or a larger number of small groups? Are there particular actors that appear to play network roles, for example actors who act as nodes that connect the graph, or who are isolated from groups?
Bottom-up Approaches All networks are composed of groups (or sub-graphs). When two actors have a tie, they form a "group." A clique extends the dyad by adding to it members who are tied to all of the members in the group. Bottom-up approaches tend to emphasize how the macro might emerge out of the micro. They focus our attention on individuals first, and try to understand how they are embedded in the web of overlapping groups in the larger structure.
Cliques A clique is a sub-set of a network in which the actors are more closely and intensely tied to one another than they are to other members of the network. In terms of friendship ties, it is not unusual for people in human groups to form "cliques" on the basis of age, gender, religion, and many other things. The smallest "cliques" are composed of two actors: the dyad. But dyads can be "extended" to become more and more inclusive -- forming strong or closely connected regions in graphs. A number of approaches to finding groups in graphs can be developed by extending the close-coupling of dyads to larger structures.
Cliques Formally, a clique is a maximal complete sub-graph of at least three nodes (under this stricter definition, dyads are not counted). "Complete" means that all possible ties are present among the actors; "maximal" means that the grouping has been expanded to include as many actors as possible. A clique in a non-directed binary graph therefore consists of the largest set of actors in which every actor has a tie to every other clique member (i.e., no null ties).
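The definition above can be turned into a small search. The following is a minimal sketch, not the method used by any particular network package: it applies the basic Bron-Kerbosch recursion to a hypothetical 5-actor graph (the actor labels and edges are invented for illustration and are not taken from the slide figures), and keeps only maximal complete sub-graphs with at least three actors.

```python
def maximal_cliques(adj, min_size=3):
    """Return all maximal complete sub-graphs with at least min_size actors.

    adj maps each actor to the set of actors it is tied to (undirected).
    """
    found = []

    def expand(current, candidates, excluded):
        # current: actors in the clique being grown
        # candidates: actors tied to everyone in current, not yet tried
        # excluded: actors tied to everyone in current, already tried
        if not candidates and not excluded:
            # Nobody can be added, so current is maximal.
            if len(current) >= min_size:
                found.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return found


# Hypothetical graph: two triangles that share actor C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("C", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cliques = maximal_cliques(adj)
for clique in cliques:
    print(sorted(clique))
```

In this hypothetical graph the search reports the two triangles, and actor C is the co-member that connects them.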
Cliques Density Calculation Density = the number of edges divided by the number possible. If self-loops are excluded, then the number possible is n(n-1)/2. If self-loops are allowed, then the number possible is n(n+1)/2. Example with 6 nodes and 6 edges: Density = (6 edges) / (6(6-1)/2) = 6/15 = 0.40
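The density formula can be checked with a few lines of code. This is a small sketch under the slide's assumptions (a non-directed binary graph); the function name and its arguments are chosen here for illustration.

```python
def density(n_nodes, n_edges, self_loops=False):
    """Share of the possible undirected ties that are actually present."""
    if self_loops:
        possible = n_nodes * (n_nodes + 1) // 2
    else:
        possible = n_nodes * (n_nodes - 1) // 2
    if possible == 0:
        raise ValueError("density is undefined when no ties are possible")
    return n_edges / possible


example = density(6, 6)    # 6 nodes, 6 edges: 6/15
triangle = density(3, 3)   # a 3-actor clique: 3/3
print(example, triangle)
```

With self-loops allowed, the same 6 edges would be divided by 6(6+1)/2 = 21 possible ties instead of 15.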
Cliques Clique density is always 1.00. If any other network actor were added to the clique, the result would be a sub-graph having one or more null ties and hence a density less than 1.00. Clique ABC: Density = (3 edges) / (3(3-1)/2) = 1. A node may belong to more than one clique; at least one node differs in each clique. Actor G below belongs to three 3-actor cliques; which are they?
Cliques Find the four cliques within this 7-actor non-directed graph. Hint: first find all the completely connected 4-actor sub-graphs, then any 3-actor sub-graphs not wholly subsumed within the larger ones.
Cliques The {D E F} clique members split apart and merge hierarchically into the two larger cliques (D into one clique, E and F into the other clique), presumably because of their preponderance of ties to the other actors in those two cliques.
N-cliques The strict clique definition (maximal fully-connected sub-graph) may be too strong for many purposes. You can probably think of cases of "cliques" where at least some members are not so tightly or closely connected. There are two major ways that the "clique" definition has been "relaxed" to try to make it more helpful and general: the N-clique and the N-clan.
N-cliques One relaxation is to define an actor as a member of a clique if they are connected to every other member of the group by a path that may be longer than one step; a distance of two corresponds to being "a friend of a friend." This approach to defining sub-structures is called N-clique, where N stands for the length of the path allowed to make a connection to all other members: every actor must be no more than N steps away from every other actor. As the cutoff value N increases, an N-clique's size increases but its density falls below 1.00, because some proportion of subgroup pairs have no direct ties, by definition.
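The distance condition can be tested directly with a breadth-first search. The sketch below checks only the "no more than N steps apart" condition (it does not check that the group is maximal) and reports the group's density. The 6-actor graph and its numeric labels are hypothetical, chosen for illustration rather than taken from the slide figures.

```python
from collections import deque
from itertools import combinations


def distances_from(adj, source):
    """Geodesic distance from source to every reachable actor (whole graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def within_n_steps(adj, members, n):
    """True if every pair of members is at most n steps apart in the whole graph."""
    for u, v in combinations(members, 2):
        if distances_from(adj, u).get(v, float("inf")) > n:
            return False
    return True


def subgroup_density(adj, members):
    """Direct ties among members divided by the number of possible ties."""
    members = list(members)
    ties = sum(1 for u, v in combinations(members, 2) if v in adj[u])
    return ties / (len(members) * (len(members) - 1) / 2)


# Hypothetical undirected graph on actors 1..6.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 5), (4, 6), (5, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

group = {1, 2, 3, 4, 5}
print(within_n_steps(adj, group, 2), subgroup_density(adj, group))
```

In this hypothetical graph the group {1, 2, 3, 4, 5} meets the 2-step condition even though only 5 of its 10 possible direct ties are present, which illustrates why density falls below 1.00 once N is greater than one.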
N-cliques 2-cliques (N = 2): {A B C D E F} and {D E F G H}. {A B C D E F}: Density = (9 edges) / (6(6-1)/2) = 9/15 = 0.60. {D E F G H}: Density = (8 edges) / (5(5-1)/2) = 8/10 = 0.80. Does this graph have a 3-clique? What is its density?
N-clan n-clan: "maximal n-diameter subgraph." The diameter of a subgraph is the length of the longest geodesic within the subgraph. An n-clan is an n-clique in which (a) all actors are connected by paths of length ≤ n, and (b) every node on those paths is also a member of the n-clique.
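The difference between the two conditions is whether paths may pass through outsiders. The sketch below measures the longest geodesic among a group's members twice: once allowing any node as an intermediary (the n-clique distance condition) and once allowing only group members (the extra n-clan condition). Maximality is not checked, and the 6-actor graph with numeric labels is hypothetical, not one of the slide figures.

```python
from collections import deque


def longest_geodesic(adj, members, allowed):
    """Longest shortest path between members, using only nodes in `allowed`."""
    worst = 0
    for source in members:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in allowed and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target in members:
            worst = max(worst, dist.get(target, float("inf")))
    return worst


def meets_n_clique_distance(adj, members, n):
    # Paths may run through any node of the whole graph.
    return longest_geodesic(adj, members, allowed=set(adj)) <= n


def meets_n_clan_distance(adj, members, n):
    # Paths must stay inside the group, i.e. the induced subgraph's diameter is <= n.
    members = set(members)
    return longest_geodesic(adj, members, allowed=members) <= n


# Hypothetical undirected graph on actors 1..6.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 5), (4, 6), (5, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

for group in ({1, 2, 3, 4, 5}, {2, 3, 4, 5, 6}):
    print(sorted(group),
          meets_n_clique_distance(adj, group, 2),
          meets_n_clan_distance(adj, group, 2))
```

In this hypothetical graph, actors 4 and 5 in the group {1, 2, 3, 4, 5} are two steps apart only by way of actor 6, who is outside the group, so that group passes the 2-clique distance check but fails the 2-clan check.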
N-clan The graph-theoretic distance (usually shortened to just "distance") between two vertices is defined as the length of a geodesic that connects them. Collecting the distance between every pair of vertices gives a distance matrix D. The maximum distance in a graph defines the graph's diameter. In the example graph, Diameter = 4.
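A distance matrix and the diameter can be computed in a few lines. The sketch below uses the Floyd-Warshall all-pairs update as one convenient choice; the 5-vertex graph is hypothetical and is not the graph from the slide, so its diameter differs from the slide's value.

```python
INF = float("inf")


def distance_matrix(n, edges):
    """All-pairs geodesic distances for an undirected graph on vertices 0..n-1."""
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j in edges:
        d[i][j] = d[j][i] = 1
    # Floyd-Warshall: allow each vertex k in turn as an intermediate step.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def diameter(d):
    """Largest entry of the distance matrix (infinite if the graph is disconnected)."""
    return max(max(row) for row in d)


# Hypothetical graph: a path 0-1-2-3 with an extra vertex 4 attached to vertex 1.
D = distance_matrix(5, [(0, 1), (1, 2), (2, 3), (1, 4)])
for row in D:
    print(row)
print("diameter:", diameter(D))
```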
N-clan 2-clan (N = 2). The 2-cliques are {b c d e f} and {a b c d e}. The set {b,c,d,e,f} is a 2-clan, but {a,b,c,d,e} is not, because b and c have distance greater than 2 in the induced subgraph. Note that {a,b,f,e} also fails the 2-clan criterion, because n-clans are defined to be n-cliques and {a,b,f,e} is not a 2-clique. A 2-clan is a 2-clique in which (a) all actors are connected by paths of length ≤ 2, and (b) every node on those paths is also a member of the 2-clique. Only {b c d e f} satisfies both conditions.
Top-down approaches The bottom-up approach may focus our attention on the underlying dynamic processes by which actors build networks. Some might prefer, however, to start with the entire network as their frame of reference, rather than the dyad. Approaches of this type tend to look at the "whole" structure, and identify "sub-structures" as parts that are locally denser than the field as a whole. In a sense, this more macro lens is looking for "holes" or "vulnerabilities" or "weak spots" in the overall structure or solidarity of the network. Top-down approaches covered here: components; blocks and cutpoints (bi-components).
Components Components of a graph are sub-graphs that are connected within, but disconnected between sub-graphs. If a graph contains one or more "isolates," each of these actors is a component. More interesting components are those which divide the network into separate parts, and where each part has several actors who are connected to one another. For directed graphs (in contrast to simple graphs), we can define two different kinds of components. A weak component is a maximal subgraph which would be connected if we ignored the direction of the arcs. A strong component is a maximal subgraph in which there is a path from every point to every other point following all the arcs in the direction they are pointing.
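Both kinds of component can be found with simple reachability searches. The sketch below is one straightforward way to do it, not an optimized algorithm: a weak component is everything reachable when arc direction is ignored, and a strong component is the set of points that a given point can both reach and be reached from. The directed graph is hypothetical.

```python
def reachable(adj, start):
    """All nodes that can be reached from start by following adj."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def components(nodes, arcs, strong=False):
    """Weak (default) or strong components of a directed graph."""
    forward = {v: set() for v in nodes}
    backward = {v: set() for v in nodes}
    for u, v in arcs:
        forward[u].add(v)
        backward[v].add(u)
    undirected = {v: forward[v] | backward[v] for v in nodes}

    result, assigned = [], set()
    for v in nodes:
        if v in assigned:
            continue
        if strong:
            # Points that v can reach AND that can reach v.
            comp = reachable(forward, v) & reachable(backward, v)
        else:
            # Ignore the direction of the arcs.
            comp = reachable(undirected, v)
        result.append(comp)
        assigned |= comp
    return result


# Hypothetical directed graph: a cycle A->B->C->A, an arc C->D, and an isolate E.
nodes = ["A", "B", "C", "D", "E"]
arcs = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]

weak = components(nodes, arcs)
strong = components(nodes, arcs, strong=True)
print([sorted(c) for c in weak])
print([sorted(c) for c in strong])
```

In this hypothetical graph D belongs to the same weak component as the cycle, but forms its own strong component because no arc leads back from D.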
Blocks and Cutpoints An alternative approach to finding the key "weak" spots in the graph is to ask: if a node were removed, would the structure become divided into un-connected parts? If there are such nodes, they are called "cutpoints." Cutpoints may be particularly important actors who may act as brokers among otherwise disconnected groups. The divisions into which cutpoints divide a graph are called blocks. We can find the maximal non-separable sub-graphs (blocks) of a graph by locating the cutpoints; that is, we try to find the nodes that connect the graph. Another name for a block is a "bi-component."
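The cutpoint question can be answered literally: remove each node in turn and test whether the rest of the graph is still connected. The sketch below does exactly that; it assumes the graph is connected to begin with, it is a brute-force illustration rather than an efficient bi-component algorithm, and the graph itself is hypothetical.

```python
def is_connected(adj, nodes):
    """True if the sub-graph induced by `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def cutpoints(adj):
    """Nodes whose removal disconnects an initially connected graph."""
    everyone = set(adj)
    return {v for v in everyone if not is_connected(adj, everyone - {v})}


# Hypothetical graph: two triangles, {A, B, C} and {C, D, E}, joined at C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("C", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(sorted(cutpoints(adj)))
```

Here the only cutpoint is C, and the two triangles it joins are the blocks; C is a member of both.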
Blocks and Cutpoints Two blocks are identified, with EDUC a member of both. This means that if EDUC (node 3) were removed, WRO would become isolated. Node 3, then, is a cutpoint.
Conclusion One of the most interesting things about social structures is their sub-structure in terms of groupings or cliques. The number, size, and connections among the sub-groupings in a network can tell us a lot about the likely behavior of the network as a whole. How fast will things move across the actors in the network? Will conflicts most likely involve multiple groups, or two factions? All of these aspects of sub-group structure can be very relevant to predicting the behavior of the network as a whole. The location of individuals in networks can also be thought of in terms of cliques or sub-groups. Certain individuals may act as "bridges" among groups, others may be isolates; some actors may be cosmopolitans, and others locals in terms of their group affiliations. Such variation in the ways that individuals are connected to groups or cliques can be quite consequential for their behavior as individuals.