Selected Topics in Data Networking Explore Social Networks: Prestige and Ranking.



Introduction: Prestige is conceptualized as a particular pattern of social ties. In directed networks, people who receive many positive choices are considered to be prestigious.
- For example, everybody may like to play with the most popular girl or boy in a group, but he or she does not play with all of them.

Introduction: Popularity and Indegree (Prestige)
- When ties are associated with a positive aspect such as friendship or collaboration, indegree is often interpreted as a form of popularity and outdegree as gregariousness.
- For example, a prestigious art museum receives more attention from art critics than less prestigious ones.

Introduction: The simplest measure of structural prestige is called popularity, and it is measured by the number of choices a vertex receives: its indegree. In undirected networks, we cannot measure prestige; instead, we use degree as a simple measure of centrality.
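As a minimal sketch of popularity as indegree, the following counts the choices each vertex receives in a small directed network. The names and arcs are hypothetical and are not taken from the slides.

```python
from collections import Counter

# Hypothetical "likes to play with" choices: each pair is (chooser, chosen).
arcs = [("ann", "eve"), ("bob", "eve"), ("cat", "eve"), ("eve", "bob")]

def indegree(arcs):
    """Count the choices each vertex receives (its popularity)."""
    # Start every vertex at zero so that unchosen vertices are reported too.
    counts = Counter({v: 0 for arc in arcs for v in arc})
    counts.update(chosen for _, chosen in arcs)
    return dict(counts)

popularity = indegree(arcs)
most_popular = max(popularity, key=popularity.get)
print(popularity, most_popular)
```

Here eve receives three choices but makes only one, which mirrors the point that a popular actor does not reciprocate every choice.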

Introduction: In a network where the output domain rather than the input domain reflects prestige (for example, the relation "lend money to"), indegree does reflect prestige if we transpose the arcs, that is, if we reverse the direction of every arc.
- It is interesting to note that several structural properties of a network do not change when the arcs are transposed.
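Transposition can be sketched as follows, assuming a hypothetical "lends money to" network stored as (lender, borrower) pairs. After transposing, the indegree of a vertex equals its outdegree in the original network.

```python
from collections import Counter

# Hypothetical "lends money to" arcs: each pair is (lender, borrower).
lends_to = [("bank", "ann"), ("bank", "bob"), ("ann", "bob")]

def transpose(arcs):
    """Reverse the direction of every arc."""
    return [(head, tail) for tail, head in arcs]

def indegree(arcs):
    """Number of arcs each vertex receives."""
    return Counter(head for _, head in arcs)

def outdegree(arcs):
    """Number of arcs each vertex sends."""
    return Counter(tail for tail, _ in arcs)

transposed = transpose(lends_to)
print(indegree(transposed))
```

Transposing twice returns the original arcs, so no information is lost by the operation.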

Correlation: Structural prestige scores can be compared with other features by means of correlation coefficients, which range from −1 to 1.
- A positive coefficient indicates that a high score on one feature is associated with a high score on the other (e.g., high structural prestige occurs in families with high social status).
- A negative coefficient points toward a negative or inverse relation: a high score on one characteristic combines with a low score on the other (e.g., high structural prestige is found predominantly in families with low social status).

Correlation: Rules of thumb for interpreting a coefficient, whether positive or negative:
- Absolute value below 0.05: no correlation.
- From 0.05 to 0.25 (and from −0.05 to −0.25): weak association.
- From 0.25 to 0.60 (and from −0.25 to −0.60): moderate association.
- From 0.60 to 1.00 (and from −0.60 to −1.00): strong association.
- Exactly 1 or −1: perfect association.
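The rules of thumb above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and because the slide's ranges share their end points, assigning each boundary value to the stronger category is an assumption.

```python
def association_strength(r):
    """Label a correlation coefficient using the slide's rules of thumb."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient lies between -1 and 1")
    size = abs(r)  # the sign gives the direction, not the strength
    # Assumption: a boundary value (0.05, 0.25, 0.60) goes to the stronger label.
    if size < 0.05:
        return "none"
    if size < 0.25:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 1.0:
        return "strong"
    return "perfect"

for r in (0.03, -0.2, 0.4, -0.8, 1.0):
    print(r, association_strength(r))
```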

Domains: Popularity is a very restricted measure of prestige because it takes only direct choices into account. A broader measure counts all actors who choose an actor directly or indirectly, that is, all vertices from which a path leads to that actor. This is the input domain of an actor, which has been called the influence domain because structurally prestigious people are thought to influence people who regard them as their leaders. The larger the input domain of a person, the higher his or her structural prestige. The output domain is more likely to reflect prestige in the case of a relation such as "lend money to".
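A minimal sketch of the input domain, assuming arcs are stored as (chooser, chosen) pairs: a breadth-first search that follows arcs backwards collects every vertex from which the given vertex can be reached. The example network and function name are hypothetical.

```python
from collections import deque

# Hypothetical choice network: each pair is (chooser, chosen).
arcs = [("a", "b"), ("b", "c"), ("d", "c"), ("c", "e")]

def input_domain(arcs, vertex):
    """Vertices from which `vertex` can be reached by a directed path."""
    choosers = {}
    for tail, head in arcs:
        choosers.setdefault(head, set()).add(tail)
    seen = {vertex}
    queue = deque([vertex])
    while queue:
        current = queue.popleft()
        # Walk arcs backwards: from a chosen vertex to those who chose it.
        for chooser in choosers.get(current, ()):
            if chooser not in seen:
                seen.add(chooser)
                queue.append(chooser)
    seen.discard(vertex)  # a vertex is not part of its own input domain
    return seen

print(input_domain(arcs, "c"))
```

Applying the same function to the transposed arcs would give the output domain.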

Proximity Prestige: One option is to limit the input domain to direct neighbors, or to neighbors at a maximum distance of two, on the assumption that nominations by close neighbors are more important than nominations by distant neighbors. An indirect choice contributes less to prestige if it is mediated by a longer chain of intermediaries.

Proximity Prestige: This index of prestige considers all vertices within the input domain of a vertex, but it attaches more importance to a nomination if it is expressed by a closer neighbor. A nomination by a close neighbor contributes more to the proximity prestige of an actor than a nomination by a distant neighbor, but many "distant nominations" may contribute as much as one "close nomination."

Proximity Prestige: To allow direct choices to contribute more to the prestige of a vertex than indirect choices, proximity prestige weights each choice by its path distance to the vertex. A higher distance yields a lower contribution to the proximity prestige of a vertex, but each choice contributes something.

Proximity Prestige: The proximity prestige of a vertex is the proportion of all other vertices that are in its input domain, divided by the mean distance from those vertices to it.
- A larger input domain (larger numerator) yields a higher proximity prestige because more vertices are choosing an actor directly or indirectly.
- A smaller average distance (smaller denominator) yields a higher proximity prestige score because there are more nominations by close neighbors.
- Maximum proximity prestige is achieved if a vertex is directly chosen by all other vertices. The proportion of vertices in the input domain is 1 and the mean distance from these vertices is 1, so proximity prestige is 1 divided by 1.
- Vertices without an input domain get the minimum proximity prestige by definition, which is zero.
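The ratio described above can be sketched in a few lines, assuming arcs are (chooser, chosen) pairs and distances are shortest path lengths. The function name and the two test networks are hypothetical.

```python
from collections import deque

def proximity_prestige(arcs, vertex):
    """Proportion of other vertices in the input domain / their mean distance."""
    vertices = {v for arc in arcs for v in arc} | {vertex}
    choosers = {}
    for tail, head in arcs:
        choosers.setdefault(head, set()).add(tail)
    # Breadth-first search along reversed arcs gives shortest distances to `vertex`.
    distance = {vertex: 0}
    queue = deque([vertex])
    while queue:
        current = queue.popleft()
        for chooser in choosers.get(current, ()):
            if chooser not in distance:
                distance[chooser] = distance[current] + 1
                queue.append(chooser)
    del distance[vertex]
    if not distance:
        return 0.0  # empty input domain: minimum prestige by definition
    proportion = len(distance) / (len(vertices) - 1)
    mean_distance = sum(distance.values()) / len(distance)
    return proportion / mean_distance

star = [("a", "x"), ("b", "x"), ("c", "x")]   # x is chosen directly by everyone
chain = [("a", "b"), ("b", "c")]              # c is chosen by b and, indirectly, by a
print(proximity_prestige(star, "x"), proximity_prestige(chain, "c"))
```

In the star, x reaches the maximum score of 1; in the chain, c has a full input domain but a mean distance of 1.5, so its score is lower.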

Proximity Prestige: All vertices at the extremes of the network (v2, v4, v5, v6, and v10) have empty input domains, hence they have a proximity prestige score of zero. The input domain of vertex v9 contains vertex v10 only, so its input domain size is 1 out of 9 (0.11). The average distance within the input domain of vertex v9 is one, so the proximity prestige of vertex v9 is 0.11 divided by 1 = 0.11. Vertex v1 has a maximal input domain (9 out of 9 = 1). Its average distance is 2.0, so its proximity prestige amounts to 1.00 divided by 2.0, which is 0.5.
- Avg. dist. for v1 = (sum of the distances from the 9 vertices in its input domain)/9 = 2
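The two scores in this example can be written out as a worked equation. The notation is an assumption introduced here, not taken from the slides: I_v is the input domain of v, n is the number of vertices (10 in this example), and d(u, v) is the distance from u to v.

```latex
PP(v) = \frac{|I_v| / (n - 1)}{\frac{1}{|I_v|} \sum_{u \in I_v} d(u, v)}
\qquad
PP(v_9) = \frac{1/9}{1} \approx 0.11
\qquad
PP(v_1) = \frac{9/9}{2.0} = 0.5
```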

Selected Topics in Data Networking Explore Social Networks: Genealogies and Citations

Introduction: Social communities and intellectual traditions can be defined by a common set of ancestors, by structural relinking (families that intermarry repeatedly), or by long-lasting co-citation of papers.
- Genealogy analysis: Pedigree is also important for the retrospective attribution of prestige to ancestors.
- Citation analysis: The number of descendants (citations) is used to assign importance and influence to precursors.

Genealogy Analysis: Many people are assembling their family trees.
- Genealogies contain persons as units and two types of relations among persons: birth and marriage. A person may belong to two nuclear families: a family in which he or she is a child and a family in which he or she is a parent.
- Child: the family of child or orientation (FAMC).
- Parent: the family of spouse or procreation (FAMS).

Genealogy Analysis: Petrus Gondola's family contains his wife and eight children, and it is identical to the family of orientation of each of his children. Family codes are translated to arcs between parents and children. Men are represented by triangles, women by ellipses, marriages by (double) lines, and parent–child ties by arcs. Note that the arcs point from parent to child, following the flow of time.

Genealogy Analysis: Fathers and mothers are connected to their children in an Ore graph. This greatly simplifies the calculation of kinship relations because the length and the direction of the shortest semipath between two individuals define their kinship tie. The Ore graph is a very useful instrument for finding an individual's ancestors (pedigree) and descendants from both the father's side and the mother's side.
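Finding a pedigree in an Ore graph amounts to following parent-to-child arcs backwards. The sketch below assumes arcs are stored as (parent, child) pairs; the family members and function names are hypothetical.

```python
# Hypothetical Ore graph: arcs point from parent to child.
parent_child = [
    ("grandpa", "father"), ("grandma", "father"),
    ("father", "child"), ("mother", "child"),
]

def ancestors(arcs, person):
    """All ancestors of `person`, on the father's and the mother's side."""
    parents = {}
    for parent, child in arcs:
        parents.setdefault(child, set()).add(parent)
    found = set()
    stack = [person]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def descendants(arcs, person):
    """All descendants of `person`: ancestors in the transposed graph."""
    return ancestors([(child, parent) for parent, child in arcs], person)

print(ancestors(parent_child, "child"), descendants(parent_child, "grandma"))
```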

Genealogy Analysis: Kinship is a fundamental social relation that is extensively studied by anthropologists and historians. Genealogies, which are usually very large, enable the study of overall patterns of kinship ties, which reflect cultural norms for marriage: who is allowed to marry?

Citation Analysis: Citations make explicit the frame of reference of a scientific article.
- They are a valuable source of data for the study of scientific development and scientific communities in scientometrics, history, and the sociology of science. They reveal the impact of articles and their authors on later scientific work, and they signal scientific communities or specialties that share knowledge.

Citation Analysis: Articles can cite only articles that appeared earlier, so the network is acyclic.
- Arcs point from the cited (older) article to the citing (newer) article, so they never point back to older articles, just as parents cannot be younger than their children. The number of citations an article receives is therefore its out-degree.
- A citation network contains one relation, whereas genealogical data concern two relations: parenthood and marriage.
- Citations are used to assess the scientific importance of papers, authors, and journals.
- An item receiving more citations is deemed more important.
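Whether a citation network really is acyclic can be checked before any path counting. The sketch below uses Kahn's topological-sort procedure, which is a standard technique chosen here for illustration and is not named in the slides. Arcs are assumed to be (cited, citing) pairs, and the paper labels are hypothetical.

```python
from collections import deque

def is_acyclic(arcs):
    """Return True if the directed network contains no cycle (Kahn's procedure)."""
    indegree = {}
    successors = {}
    for tail, head in arcs:
        successors.setdefault(tail, []).append(head)
        indegree[head] = indegree.get(head, 0) + 1
        indegree.setdefault(tail, 0)
    # Repeatedly remove vertices that have no remaining incoming arcs.
    queue = deque(v for v, d in indegree.items() if d == 0)
    removed = 0
    while queue:
        v = queue.popleft()
        removed += 1
        for w in successors.get(v, ()):
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    # If a cycle exists, its vertices can never be removed.
    return removed == len(indegree)

# Hypothetical citations, written as (cited, citing).
citations = [("p1", "p2"), ("p2", "p3"), ("p1", "p3")]
print(is_acyclic(citations))
```

Real citation data sometimes contain cycles, for example when papers in the same issue cite each other, so a check like this is a reasonable first step.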

Citation Analysis: Citation analysis reveals cohesive subgroups of articles and studies their institutional or paradigmatic background.
- Specialties are cohesive subgroups in the citation network.
- Weak components identify isolated scientific communities that are not aware of each other or that see no substantial overlap between their research domains.
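Weak components can be found by ignoring arc direction and collecting the vertices that are connected to each other. The following is a minimal sketch with a hypothetical network and function name.

```python
def weak_components(arcs):
    """Group vertices that are connected when arc direction is ignored."""
    neighbors = {}
    for tail, head in arcs:
        neighbors.setdefault(tail, set()).add(head)
        neighbors.setdefault(head, set()).add(tail)
    components = []
    seen = set()
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first search over the undirected version of the network.
        component = set()
        stack = [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            component.add(v)
            for w in neighbors[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(component)
    return components

# Hypothetical citations: two groups of papers that never cite each other.
citations = [("a", "b"), ("c", "b"), ("x", "y")]
print(weak_components(citations))
```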

Citation Analysis: Main path analysis calculates the extent to which a particular citation or article is needed for linking articles, which is called the traversal count or traversal weight of a citation or article.

Citation Analysis: There are two sources (v1 and v5) and two sinks (v3 and v4). There are eight paths from sources to sinks.
- Four paths reach v4 from v1, three paths reach v4 from v5, and one path leads from v1 to v3.

Citation Analysis: Traversal weight (between sources and sinks)
- Possible paths:
  v1 to v3: v1=>v3
  v1 to v4: v1=>v4, v1=>v6=>v2=>v4, v1=>v6=>v4
  v5 to v3: none
  v5 to v4: v5=>v2=>v4, v5=>v6=>v2=>v4, v5=>v6=>v4
- V1: 5/8 = 0.625
- V5: 3/8 = 0.375
- V4: 6/8 = 0.75
- V3: 1/8 = 0.125
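Counting source-to-sink paths by hand is error-prone, so a small sketch may help. It assumes the network contains only the arcs that appear in the paths listed above; the original figure is not part of this transcript. With only these arcs the sketch finds seven source-to-sink paths rather than the eight stated on the slide, so an arc from the figure may be missing here. The function name is hypothetical.

```python
# Assumption: only the arcs that occur in the listed paths.
arcs = [("v1", "v3"), ("v1", "v4"), ("v1", "v6"), ("v5", "v2"),
        ("v5", "v6"), ("v6", "v2"), ("v6", "v4"), ("v2", "v4")]

def search_path_counts(arcs):
    """Count the source-to-sink paths through every arc and vertex."""
    successors, predecessors, vertices = {}, {}, set()
    for tail, head in arcs:
        successors.setdefault(tail, []).append(head)
        predecessors.setdefault(head, []).append(tail)
        vertices.update((tail, head))

    from_sources, to_sinks = {}, {}

    def count(v, links, memo):
        # Number of paths between v and the sources (or sinks); 1 at the end points.
        if v not in memo:
            nxt = links.get(v, [])
            memo[v] = 1 if not nxt else sum(count(u, links, memo) for u in nxt)
        return memo[v]

    for v in vertices:
        count(v, predecessors, from_sources)
        count(v, successors, to_sinks)

    total = sum(to_sinks[v] for v in vertices if v not in predecessors)
    arc_counts = {(t, h): from_sources[t] * to_sinks[h] for t, h in arcs}
    vertex_counts = {v: from_sources[v] * to_sinks[v] for v in vertices}
    return total, arc_counts, vertex_counts

total, arc_counts, vertex_counts = search_path_counts(arcs)
print(total, arc_counts, vertex_counts)
```

Dividing each count by `total` gives the traversal weight. The recursion relies on the network being acyclic.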

Citation Analysis: The main path is the path from a source vertex to a sink vertex with the highest traversal weights on its arcs.
- It is found by choosing the source vertex (or vertices) incident with the arc(s) with the highest weight, selecting the arc(s) and the head(s) of the arc(s), and repeating this step until a sink vertex is reached.
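The selection procedure can be sketched as a walk that always follows the heaviest outgoing arc and branches on ties. The arc weights below are hypothetical path counts for a small network with sources v1 and v5; they are not the values from the slides. The function name is hypothetical as well.

```python
# Hypothetical traversal counts per arc (tail, head).
arc_weights = {
    ("v1", "v3"): 1, ("v1", "v4"): 1, ("v1", "v6"): 2, ("v5", "v2"): 1,
    ("v5", "v6"): 2, ("v6", "v2"): 2, ("v6", "v4"): 2, ("v2", "v4"): 3,
}

def main_paths(arc_weights):
    """Follow the heaviest arcs from the best source(s) until a sink is reached."""
    successors, heads = {}, set()
    for (tail, head), weight in arc_weights.items():
        successors.setdefault(tail, []).append((head, weight))
        heads.add(head)
    sources = [v for v in successors if v not in heads]
    # Start at the source(s) incident with the heaviest arc leaving any source.
    best = max(w for s in sources for _, w in successors[s])
    starts = [s for s in sources if any(w == best for _, w in successors[s])]

    paths = []

    def extend(path):
        out = successors.get(path[-1])
        if not out:          # no outgoing arcs: a sink has been reached
            paths.append(path)
            return
        top = max(w for _, w in out)
        for head, weight in out:
            if weight == top:  # ties produce several main paths
                extend(path + [head])

    for s in sorted(starts):
        extend([s])
    return paths

paths = main_paths(arc_weights)
print(paths)
```

With these weights every main path passes through v6 and ends at v4, which is the kind of result that would be read as a single research tradition.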

Citation Analysis: The main paths start with vertex v1 and vertex v5 because both source vertices are incident with an arc carrying the highest traversal weight. Both arcs point toward vertex v6, which is the next vertex on the main paths. Then, the paths proceed either to vertex v2 and on to vertex v4 or directly from vertex v6 to vertex v4. We find several main paths, but they lead to the same sink, so we conclude that the network represents one research tradition.

Practice: Download file: http://vlado.fmf.uni-lj.si/pub/networks/Data/esna/centrality.htm
- Centrality literature network (*.net)
Main Path Analysis: A citation network is directed and acyclic.
- Net=>Citation Weights computes the traversal weights for lines and vertices in an acyclic network.
- The Search Path Count (SPC) command counts the paths between all source and sink vertices.
- The Search Path Link Count (SPLC) command traces paths from all vertices to the sink vertices.
- With the Search Path Node Pair (SPNP) command, each vertex is considered as a source and as a sink. As a result, vertices and edges in the middle will receive higher traversal weights.

References: Wouter de Nooy, Andrej Mrvar, and Vladimir Batagelj, Exploratory Social Network Analysis with Pajek, Cambridge.