Class 2: Graph Theory IST402. The Bridges of Konigsberg Section 1.

Presentation transcript:

Class 2: Graph Theory IST402

The Bridges of Konigsberg Section 1

THE BRIDGES OF KONIGSBERG (Network Science: Graph Theory)

Can one walk across the seven bridges and never cross the same bridge twice?

Euler's theorem: (a) If a graph has more than two nodes of odd degree, there is no such path. (b) If a graph is connected and has no odd degree nodes, it has at least one such path.
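Part (a) of Euler's theorem can be checked by counting the nodes of odd degree. The sketch below is illustrative only: the node labels A to D for the four land masses and the helper name `odd_degree_nodes` are assumptions, not part of the slides.

```python
from collections import Counter

def odd_degree_nodes(edges):
    """Return the odd-degree nodes of an undirected multigraph given as (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, k in degree.items() if k % 2 == 1)

# Seven bridges between four land masses (labels are assumed for illustration).
konigsberg = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]

odd = odd_degree_nodes(konigsberg)
no_walk = len(odd) > 2  # Euler's theorem, part (a)
print(odd, no_walk)
```

All four nodes have odd degree, so by part (a) no walk crosses each bridge exactly once.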

Networks and graphs Section 2

COMPONENTS OF A COMPLEX SYSTEM
components: nodes, vertices (N)
interactions: links, edges (L)
system: network, graph (N, L)

NETWORKS OR GRAPHS?
Network often refers to real systems: WWW, social network, metabolic network. Language: (network, node, link).
Graph: the mathematical representation of a network: web graph, social graph (a Facebook term). Language: (graph, vertex, edge).
We will try to make this distinction whenever it is appropriate, but in most cases we will use the two terms interchangeably.

A COMMON LANGUAGE (figure: example networks, each with N=4 nodes and L=4 links)

CHOOSING A PROPER REPRESENTATION

The choice of the proper network representation determines our ability to use network theory successfully. In some cases there is a unique, unambiguous representation. In other cases, the representation is by no means unique. For example, the way we assign the links between a group of individuals will determine the nature of the question we can study.

If you connect individuals that work with each other, you will explore the professional network.

If you connect those that have a romantic and sexual relationship, you will be exploring the sexual network.

If you connect individuals based on their first name (all Peters connected to each other), you will be exploring what? It is a network, nevertheless.

Question 2 Q2: Degree, degree distribution.

Degree, Average Degree and Degree Distribution Section 2.3

NODE DEGREES
Undirected: node degree is the number of links connected to the node.
Directed: in directed networks we can define an in-degree and out-degree. The (total) degree is the sum of in- and out-degree. Source: a node with $k^{in} = 0$; sink: a node with $k^{out} = 0$.
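As a small illustration of in-degree, out-degree, sources and sinks, here is a sketch on a made-up directed link list. The links and the function name `in_out_degrees` are assumptions chosen for the example.

```python
from collections import Counter

def in_out_degrees(links):
    """Map each node to (in-degree, out-degree) for directed links (source, target)."""
    k_in, k_out = Counter(), Counter()
    nodes = set()
    for src, dst in links:
        k_out[src] += 1
        k_in[dst] += 1
        nodes.update((src, dst))
    return {n: (k_in[n], k_out[n]) for n in sorted(nodes)}

# Hypothetical directed network.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
degrees = in_out_degrees(links)

sources = [n for n, (ki, ko) in degrees.items() if ki == 0]  # k_in = 0
sinks = [n for n, (ki, ko) in degrees.items() if ko == 0]    # k_out = 0
total = {n: ki + ko for n, (ki, ko) in degrees.items()}      # total degree
print(degrees, sources, sinks)
```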

A BIT OF STATISTICS

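AVERAGE DEGREE
Undirected: $\langle k \rangle \equiv \frac{1}{N} \sum_{i=1}^{N} k_i = \frac{2L}{N}$
Directed: $\langle k^{in} \rangle \equiv \frac{1}{N} \sum_{i=1}^{N} k_i^{in} = \langle k^{out} \rangle = \frac{L}{N}$
N is the number of nodes in the graph and L the number of links.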

Average Degree

DEGREE DISTRIBUTION
Degree distribution P(k): probability that a randomly chosen node has degree k.
$N_k$ = number of nodes with degree k
$P(k) = N_k / N$ ➔ plot
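The definition P(k) = N_k / N translates into a few lines of code. This is a minimal sketch on a toy edge list; it assumes every node appears in at least one link (isolated nodes would need to be added separately), and the function name is hypothetical.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {k: P(k)} for an undirected graph given as (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)                      # N: nodes that appear in some link
    counts = Counter(degree.values())    # N_k: number of nodes with degree k
    return {k: counts[k] / n for k in sorted(counts)}

edges = [(1, 2), (1, 3), (1, 4), (2, 3)]   # toy graph: N = 4, L = 4
p = degree_distribution(edges)
average_degree = sum(k * pk for k, pk in p.items())
print(p, average_degree)
```

For this toy graph the average degree obtained from P(k) agrees with 2L/N = 2.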

Log-log plot

DEGREE DISTRIBUTION
Discrete representation: $p_k$ is the probability that a node has degree k.
Continuum description: p(k) is the pdf of the degrees, where $\int_{k_1}^{k_2} p(k)\,dk$ represents the probability that a node's degree is between $k_1$ and $k_2$.
Normalization condition: $\sum_{k=0}^{\infty} p_k = 1$ or $\int_{K_{min}}^{\infty} p(k)\,dk = 1$, where $K_{min}$ is the minimal degree in the network.

Question 3 Q3: Directed vs. undirected networks.

UNDIRECTED VS. DIRECTED NETWORKS
Undirected. Links: undirected (symmetrical). Graph. Undirected links: coauthorship links, actor network, protein interactions.
Directed. Links: directed (arcs). Digraph = directed graph. Directed links: URLs on the WWW, phone calls, metabolic reactions.
An undirected link is the superposition of two opposite directed links.

Reference Networks Section 2.2

Question 4 Q4: Adjacency Matrices

Adjacency matrix Section 2.4

ADJACENCY MATRIX
$A_{ij} = 1$ if there is a link between nodes i and j; $A_{ij} = 0$ if nodes i and j are not connected to each other.
Note that for a directed graph the matrix is not symmetric: $A_{ij} = 1$ if there is a link pointing from node j to node i, and $A_{ij} = 0$ if there is no link pointing from j to i.
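A short sketch of building an adjacency matrix and reading degrees off its row and column sums. It assumes the convention that A[i][j] = 1 marks a link pointing from node j to node i; the toy links and the function name are made up for illustration.

```python
def adjacency_matrix(n, links, directed=False):
    """Build an n x n matrix with A[i][j] = 1 for a link from j to i (assumed convention)."""
    A = [[0] * n for _ in range(n)]
    for j, i in links:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1   # an undirected link is two opposite directed links
    return A

links = [(0, 1), (1, 2), (2, 0), (0, 3)]   # hypothetical links
A_und = adjacency_matrix(4, links)
A_dir = adjacency_matrix(4, links, directed=True)

k = [sum(row) for row in A_und]                                  # undirected degrees
k_in = [sum(row) for row in A_dir]                               # row sums
k_out = [sum(A_dir[i][j] for i in range(4)) for j in range(4)]   # column sums
symmetric = all(A_und[i][j] == A_und[j][i] for i in range(4) for j in range(4))
print(k, k_in, k_out, symmetric)
```

With this convention the row sums of the directed matrix give in-degrees and the column sums give out-degrees; the undirected matrix is symmetric.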

ADJACENCY MATRIX AND NODE DEGREES Undirected Directed

ADJACENCY MATRIX (figure: adjacency matrix of a graph with eight nodes labeled a to h)

ADJACENCY MATRICES ARE SPARSE

More on Matrixology

Question 5 Q5: Sparseness

Real networks are sparse Section 4

COMPLETE GRAPH
The maximum number of links a network of N nodes can have is $L_{max} = \binom{N}{2} = \frac{N(N-1)}{2}$. A graph with $L = L_{max}$ links is called a complete graph, and its average degree is $\langle k \rangle = N-1$.
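The complete-graph statement can be verified on a small case. The sketch below uses an arbitrary N = 5; the helper names `max_links` and `density` are assumptions introduced for the example.

```python
def max_links(n):
    """L_max = N(N-1)/2 for an undirected network without self-loops."""
    return n * (n - 1) // 2

def density(n, l):
    """Fraction of the possible links that are actually present."""
    return l / max_links(n)

n = 5
complete = [(i, j) for i in range(n) for j in range(i + 1, n)]  # every pair linked
average_degree = 2 * len(complete) / n

print(len(complete), max_links(n), average_degree, density(n, len(complete)))
```

A sparse network is one whose density is far below 1.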

REAL NETWORKS ARE SPARSE
Most networks observed in real systems are sparse: $L \ll L_{max}$, or $\langle k \rangle \ll N-1$.
WWW (ND Sample): N = 325,729; $L_{max} = 10^{12}$; $\langle k \rangle$ = 4.51
Protein (S. Cerevisiae): N = 1,870; L = 4,470; $L_{max} = 10^{7}$; $\langle k \rangle$ = 2.39
Coauthorship (Math): N = 70,975; $\langle k \rangle$ = 3.9
Movie Actors: N = 212,250; $\langle k \rangle$ = 28.78
(Source: Albert, Barabasi, RMP 2002)

METCALFE’S LAW
The maximum number of links a network of N nodes can have is $L_{max} = \frac{N(N-1)}{2}$.

WEIGHTED AND UNWEIGHTED NETWORKS Section 2.6

Question 6 Q6: Bipartite Networks

BIPARTITE NETWORKS Section 2.7

BIPARTITE GRAPHS
A bipartite graph (or bigraph) is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V; that is, U and V are independent sets.
Examples: Hollywood actor network, collaboration networks, disease network (diseasome).
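One way to test whether a graph is bipartite is to try to split its nodes into the two sets U and V with a breadth-first two-coloring. This is a sketch of that standard technique, not something taken from the slides; the node names and the function `two_coloring` are hypothetical.

```python
from collections import deque

def two_coloring(adj):
    """Return {node: 0 or 1} if the graph is bipartite, otherwise None.

    adj maps every node to a list of its neighbors (undirected graph).
    """
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # put neighbor in the other set
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # a link inside one set: not bipartite
    return color

# Hypothetical actor-movie network.
actors_movies = {"a1": ["m1", "m2"], "a2": ["m1"], "m1": ["a1", "a2"], "m2": ["a1"]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}

coloring = two_coloring(actors_movies)
U = sorted(n for n, c in coloring.items() if c == 0)
V = sorted(n for n, c in coloring.items() if c == 1)
print(U, V, two_coloring(triangle))
```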

GENE NETWORK – DISEASE NETWORK (figure: the DISEASOME linking the PHENOME and the GENOME, shown together with the disease network and the gene network). Goh, Cusick, Valle, Childs, Vidal & Barabási, PNAS (2007)

HUMAN DISEASE NETWORK

Ingredient-Flavor Bipartite Network. Y.-Y. Ahn, S. E. Ahnert, J. P. Bagrow, A.-L. Barabási, Flavor network and the principles of food pairing, Scientific Reports 196 (2011).

Question 7 Q7: Paths

PATHOLOGY Section 2.8

PATHS
A path is a sequence of nodes in which each node is adjacent to the next one. A path $P_{i_0,i_n}$ of length n between nodes $i_0$ and $i_n$ is an ordered collection of n+1 nodes and n links. In a directed network, the path can follow only the direction of an arrow.

DISTANCE IN A GRAPH: Shortest Path, Geodesic Path
The distance (shortest path, geodesic path) between two nodes is defined as the number of edges along the shortest path connecting them. If the two nodes are disconnected, the distance is infinity.
In directed graphs each path needs to follow the direction of the arrows. Thus in a digraph the distance from node A to B (on an AB path) is generally different from the distance from node B to A (on a BCA path).

NUMBER OF PATHS BETWEEN TWO NODES (Adjacency Matrix)
$N_{ij}$, the number of paths between any two nodes i and j:
Length n=1: If there is a link between i and j, then $A_{ij} = 1$, and $A_{ij} = 0$ otherwise.
Length n=2: If there is a path of length two between i and j through node k, then $A_{ik}A_{kj} = 1$, and $A_{ik}A_{kj} = 0$ otherwise. The number of paths of length 2: $N_{ij}^{(2)} = \sum_{k=1}^{N} A_{ik}A_{kj} = [A^2]_{ij}$.
Length n: In general, if there is a path of length n between i and j, then $A_{ik} \cdots A_{lj} = 1$, and $A_{ik} \cdots A_{lj} = 0$ otherwise. The number of paths of length n between i and j is $N_{ij}^{(n)} = [A^n]_{ij}$ (*).
(*) holds for both directed and undirected networks.
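The counting rule based on powers of the adjacency matrix can be tried on a small example. The sketch below uses plain Python lists and a made-up four-node graph; note that powers of A count all node sequences of the given length, including those that revisit nodes. The function names are assumptions.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def count_paths(A, length):
    """Return A raised to the given power; entry [i][j] counts paths of that length."""
    result = A
    for _ in range(length - 1):
        result = mat_mul(result, A)
    return result

# Toy undirected graph: triangle 0-1-2 plus a pendant link 2-3.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]

N2 = count_paths(A, 2)
N3 = count_paths(A, 3)
print(N2[0][1], N2[2][2], N3[0][3])
```

Here N2[0][1] counts the single two-step route 0-2-1, and the diagonal entry N2[2][2] equals the degree of node 2, since each link can be walked out and back.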

FINDING DISTANCES: BREADTH FIRST SEARCH
Distance between node 0 and node 4:
1. Start at 0.
2. Find the nodes adjacent to 0. Mark them as at distance 1. Put them in a queue.
3. Take the first node out of the queue. Find the unmarked nodes adjacent to it in the graph. Mark them with a label one larger than that of the node just taken out (2 for the neighbors of a node at distance 1). Put them in the queue.
4. Repeat until you find node 4 or there are no more nodes in the queue.
5. The distance between 0 and 4 is the label of 4 or, if 4 does not have a label, infinity.
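The breadth first search procedure can be sketched as follows. The example graph is made up (the slide's figure is not available), and the function name `bfs_distance` is an assumption.

```python
from collections import deque
import math

def bfs_distance(adj, source, target):
    """Return the number of links on a shortest path, or infinity if none exists."""
    dist = {source: 0}            # labels: distance from the source
    queue = deque([source])
    while queue:
        u = queue.popleft()       # take the first node out of the queue
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:     # unmarked neighbor
                dist[v] = dist[u] + 1
                queue.append(v)
    return math.inf               # target never received a label

# Hypothetical graph; node 5 is isolated.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3], 5: []}

d_04 = bfs_distance(adj, 0, 4)
d_05 = bfs_distance(adj, 0, 5)
print(d_04, d_05)
```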

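NETWORK DIAMETER AND AVERAGE DISTANCE
Diameter $d_{max}$: the maximum distance between any pair of nodes in the graph.
Average path length/distance $\langle d \rangle$ for a connected graph: $\langle d \rangle \equiv \frac{1}{2 L_{max}} \sum_{i, j \neq i} d_{ij}$, where $d_{ij}$ is the distance from node i to node j and $L_{max} = N(N-1)/2$.
In an undirected graph $d_{ij} = d_{ji}$, so we only need to count them once: $\langle d \rangle \equiv \frac{1}{L_{max}} \sum_{i, j > i} d_{ij}$.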
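Diameter and average distance can be computed by running a breadth first search from every node. This is a minimal sketch for a small connected graph; it averages over all ordered pairs, that is, it divides by N(N-1). The function names and the toy path graph are assumptions.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth first search: distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_average_distance(adj):
    """Return (d_max, <d>) for a connected graph given as an adjacency dict."""
    n = len(adj)
    total, d_max = 0, 0
    for i in adj:
        dist = distances_from(adj, i)
        if len(dist) != n:
            raise ValueError("graph is not connected")
        total += sum(dist.values())
        d_max = max(d_max, max(dist.values()))
    return d_max, total / (n * (n - 1))   # sum over ordered pairs i != j

path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # toy chain 0-1-2-3
d_max, d_avg = diameter_and_average_distance(path_graph)
print(d_max, d_avg)
```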

PATHOLOGY: summary

Shortest Path: the path with the shortest length between two nodes (distance).

Diameter: the longest shortest path in a graph.
Average Path Length: the average of the shortest paths for all pairs of nodes.

Cycle: a path with the same start and end node.
Self-avoiding Path: a path that does not intersect itself.

Eulerian Path: a path that traverses each link exactly once.
Hamiltonian Path: a path that visits each node exactly once.

CONNECTEDNESS Section 2.9

CONNECTIVITY OF UNDIRECTED GRAPHS
Connected (undirected) graph: any two vertices can be joined by a path. A disconnected graph is made up of two or more connected components.
Largest component: giant component. The rest: isolates.
Bridge: a link that, if erased, leaves the graph disconnected.

CONNECTIVITY OF UNDIRECTED GRAPHS (Adjacency Matrix)
The adjacency matrix of a network with several components can be written in a block-diagonal form, so that nonzero elements are confined to squares, with all other elements being zero.
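Connected components can be found by repeatedly exploring the graph from a node that has not been reached yet. The sketch below uses a depth-first traversal on a made-up graph; the node labels and the function name are assumptions. Ordering the nodes component by component is what produces the block-diagonal adjacency matrix.

```python
def connected_components(adj):
    """Return the components of an undirected graph, largest first."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(component))
    return sorted(components, key=len, reverse=True)

# Hypothetical graph with three components.
adj = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"],
       "D": ["E"], "E": ["D"], "F": []}

components = connected_components(adj)
giant = components[0]
print(components)
```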

CONNECTIVITY OF DIRECTED GRAPHS
Strongly connected directed graph: has a path from each node to every other node and vice versa (e.g. AB path and BA path).
Weakly connected directed graph: it is connected if we disregard the edge directions.
Strongly connected components (scc) can be identified, but not every node is part of a nontrivial strongly connected component.
In-component: nodes that can reach the scc. Out-component: nodes that can be reached from the scc.

Clustering coefficient Section 10

CLUSTERING COEFFICIENT
Clustering coefficient: what fraction of your neighbors are connected?
For node i with degree $k_i$: $C_i = \frac{2 e_i}{k_i (k_i - 1)}$, where $e_i$ is the number of links between the neighbors of i; $C_i$ is in [0,1].
Watts & Strogatz, Nature 1998.
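A minimal sketch of the local clustering coefficient, computed as the fraction of pairs of neighbors that are themselves linked. The toy graph, the convention of returning 0 for nodes with fewer than two neighbors, and the function name are assumptions.

```python
def clustering_coefficient(adj, i):
    """Fraction of pairs of neighbors of i that are linked to each other.

    adj maps each node to the set of its neighbors (undirected graph).
    Nodes with fewer than two neighbors get 0.0 by convention (an assumption).
    """
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a in neighbors for b in neighbors
                if a < b and b in adj[a])   # count each neighbor pair once
    return 2 * links / (k * (k - 1))

# Toy graph: triangle 0-1-2 plus a pendant link 0-3.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}

C = {i: clustering_coefficient(adj, i) for i in adj}
average_C = sum(C.values()) / len(C)
print(C, average_C)
```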

Summary Section 11

THREE CENTRAL QUANTITIES IN NETWORK SCIENCE
Degree distribution: P(k)
Path length: $\langle d \rangle$
Clustering coefficient: C

GRAPHOLOGY 1
Undirected: actor network, protein-protein interactions.
Directed: WWW, citation networks.

GRAPHOLOGY 2
Unweighted (undirected): protein-protein interactions, WWW.
Weighted (undirected): call graph, metabolic networks.

GRAPHOLOGY 3
Self-interactions: protein interaction network, WWW.
Multigraph (undirected): social networks, collaboration networks.

GRAPHOLOGY 4
Complete graph (undirected): actor network, protein-protein interactions.

GRAPHOLOGY: Real networks can have multiple characteristics
WWW > directed multigraph with self-interactions.
Protein interactions > undirected, unweighted, with self-interactions.
Collaboration network > undirected multigraph or weighted.
Mobile phone calls > directed, weighted.
Facebook friendship links > undirected, unweighted.

THREE CENTRAL QUANTITIES IN NETWORK SCIENCE
A. Degree distribution: $p_k$
B. Path length: $\langle d \rangle$
C. Clustering coefficient: C

Bio-Map (figure: GENOME, protein-gene interactions; PROTEOME, protein-protein interactions; METABOLISM, bio-chemical reactions, e.g. the citrate cycle)

Figures: Metabolic Network (Metab-movie); Protein Interactions

A CASE STUDY: PROTEIN-PROTEIN INTERACTION NETWORK

Undirected network. N = 2,018 proteins as nodes; L = 2,930 binding interactions as links. Average degree $\langle k \rangle$ = 2.90. Not connected: 185 components; the largest (giant component) has 1,647 nodes.

$p_k$ is the probability that a node has degree k. $N_k$ = number of nodes with degree k; $p_k = N_k / N$.

$d_{max}$ = 14; $\langle d \rangle$ = 5.61

$\langle C \rangle$ = 0.12

FINAL PROJECTS

COMPONENTS OF THE PROJECT
1. DATA ACQUISITION: Downloading the data and putting it in a usable format.
2. NETWORK REPRESENTATION: What are the nodes and links?
3. NETWORK ANALYSIS: What questions do you want to answer with this network, and which tools/measurements will you use?

DATA ACQUISITION
Many online data sources will have an API (application programming interface) that allows querying and downloading the data in a targeted way. Example: What are all movies starring Kevin Bacon and distributed by Paramount Pictures? This is done either through a web interface or through a library within a programming language.
Other sources will provide raw bulk data (e.g., Excel spreadsheets) that require processing, either manually or through a program you will write.

NETWORK RECONSTRUCTION
Most datasets will admit more than one representation as a network. Some representations will be more or less informative than others. Figuring out the “network” that’s buried in your data is part of your project!

NETWORK RECONSTRUCTION
Suppose you have a list of students and the courses they are registered for. (Figure: one possible network and another possibility, drawn for the students Joe, Jane and Sam and the courses PHYS 5116 and BIO 1234.)
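To make the two representations concrete, here is a sketch that builds both a student-course bipartite network and a student-student network in which two students are linked when they share a course. The enrollment data is invented for illustration (the slide's figure does not survive in the transcript), and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical enrollment list; the actual assignments on the slide are unknown.
enrollment = {
    "Joe": ["PHYS 5116", "BIO 1234"],
    "Jane": ["BIO 1234"],
    "Sam": ["PHYS 5116"],
}

# Representation 1: bipartite network with student-course links.
bipartite_links = sorted((s, c) for s, courses in enrollment.items() for c in courses)

def project_students(enrollment):
    """Representation 2: link two students once per course they share (weighted)."""
    by_course = defaultdict(list)
    for student, courses in enrollment.items():
        for course in courses:
            by_course[course].append(student)
    weights = defaultdict(int)
    for students in by_course.values():
        for a, b in combinations(sorted(students), 2):
            weights[(a, b)] += 1
    return dict(weights)

student_links = project_students(enrollment)
print(bipartite_links)
print(student_links)
```

Which representation is more informative depends on the question: the bipartite form keeps the courses visible, while the projection shows directly who is connected to whom.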

Final project guidelines

Measure: N(t), L(t) (t is time, if you have a time-dependent system); P(k) (degree distribution); average path length; C (clustering coefficient), $C_{rand}$, C(k); visualization/communities; P(w) if you have a weighted network; network robustness (if appropriate); spreading (if appropriate).

It is not sufficient to measure things; you need to discuss the insights they offer: What did you learn from each quantity you measured? What was your expectation? How do the results compare to your expectations?

Time frame will be strictly enforced: approx. 12 min + 3 min questions. You will also need to write a formal report summarizing your project.

Send us an email with names/titles/program. Come earlier and try out your slides with the projector. Show an entry of the data source, just to give a sense of what the source looks like. On the slide, give your program/name.

Grading criteria: use of network tools (completeness/correctness); ability to extract information/insights from your data using the network tools; overall quality of the project/presentation.