Class 3: Introduction to CINET
Tools for network analysis and visualization
Prof. Boleslaw K. Szymanski, Konstantin Kuzmin
Network Science: Introduction to CINET 2015

We are surrounded by systems that are hopelessly complex, from society, a collection of seven billion individuals, to communications systems that integrate billions of devices, from computers to cell phones. Our very existence is rooted in the ability of thousands of genes to work together in a seamless fashion; our thoughts, our reasoning, and our ability to comprehend the world around us are hidden in the coherent activity of billions of neurons in our brain. Many of these systems appear random to the casual observer, but upon closer inspection they are found to display endless signatures of order and self-organization. These systems are collectively called complex systems, and the research area that aims to explore and understand them is often referred to as complexity. Given the important role they play in our lives, understanding, quantifying, predicting, and eventually controlling them is one of the major intellectual and scientific challenges of the 21st century.

TOOLS: OVERVIEW (LISTED ALPHABETICALLY)
Tools for network analysis and visualization
- Computing model and interface
  - Desktop GUI applications
  - API/code libraries, Web services
  - Web GUI front-ends (cloud, distributed, HPC)
- Extensibility model
  - Only by the original developers
  - By other users/developers (add-ins, modules, additional packages, etc.)
- Source availability model
  - Open-source
  - Closed-source
- Business model
  - Free of charge
  - Commercial

TOOLS: CINET
CyberInfrastructure for NETwork science
- Accessed via a Web-based portal
- Supported by grants, no charge for end users
- Aims to provide researchers, analysts, and educators interested in Network Science with an easy-to-use cyber-environment that is accessible from their desktop and integrates into their daily work
- Users can contribute new networks, data, algorithms, hardware, and research results
- Primarily for research, teaching, and collaboration
- No programming experience is required

TOOLS: Cytoscape
Network Data Integration, Analysis, and Visualization
- A standalone GUI application
- A platform for visualizing complex networks and integrating these with any type of attribute data
- Originally developed for biological research
- Includes features for data integration, analysis, and visualization
- A variety of layout algorithms, including cyclic, tree, force-directed, edge-weight, and yFiles Organic layouts
- Implemented in Java
- Runs on any Java-supported platform
- Modular architecture extensible through plugins (called Apps)
- Open-source and free of charge

TOOLS: Gephi
The Open Graph Viz Platform
- A standalone GUI application
- An interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs
- Static and dynamic networks
- Clustering and hierarchical graphs, community detection
- Visualization layouts supported: ForceAtlas, Yifan Hu Multilevel
- Modular architecture customizable with plugins
- Runs on Windows, Linux, and Mac OS X
- Implemented in Java
- Graph size: fewer than 1M nodes and edges
- Open-source and free of charge

TOOLS: Graphviz
Graph Visualization Software
- A graph description language (called DOT) and a set of tools that can generate and/or process DOT files
- Can be used as a standalone tool or as a library
- Only graph drawing
- A wide range of layouts:
  - Hierarchical or layered drawings
  - Spring model layouts
  - Multiscale layout for large graphs
  - Radial layouts
  - Circular layouts
- Implemented in C
- Runs on Linux, Windows, and Mac OS X
- Extensible through a scripting API
- Open-source and free of charge

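To make the DOT description language mentioned on the Graphviz slide more concrete, here is a minimal sketch that writes a DOT description of a small undirected graph from an edge list. The helper name `edges_to_dot` and the sample triangle graph are illustrative assumptions, not part of Graphviz or of the course material.

```python
def edges_to_dot(edges, name="G"):
    """Return a DOT description of an undirected graph given as (u, v) pairs.

    Hypothetical helper for illustration; it only builds text and does not
    call Graphviz itself.
    """
    lines = [f"graph {name} {{"]
    for u, v in edges:
        # "--" is the DOT edge operator for undirected graphs.
        lines.append(f'    "{u}" -- "{v}";')
    lines.append("}")
    return "\n".join(lines)


# A triangle on three nodes.
dot_text = edges_to_dot([("a", "b"), ("b", "c"), ("c", "a")])
print(dot_text)
```

If the printed text is saved to a file such as `triangle.dot`, the Graphviz command-line tool can render it, for example with `dot -Tsvg triangle.dot -o triangle.svg`.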
TOOLS: Pajek
Pajek and Pajek-XXL
- A standalone GUI application
- Several partitioning and community detection algorithms
- Network generator (random, Bernoulli/Poisson, scale free, small world, etc.)
- Support for ordinary (directed, undirected, mixed) as well as multi-relational, bipartite, and temporal networks
- Capable of analyzing and visualizing large networks with thousands or even millions of nodes
- Macro capability enables recording and playback of a sequence of primitive commands
- Implemented in Delphi (Pascal)
- Only Windows is supported (32-bit and 64-bit)
- Freely available for noncommercial use

TOOLS: SNAP
Stanford Network Analysis Platform (SNAP)
http://snap.stanford.edu/
- A general purpose network analysis and graph mining library
- Written in C++, with a Python interface also available
- Scales to massive networks with hundreds of millions of nodes and billions of edges
- Efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges
- Also available through NodeXL, a graphical front-end that integrates network analysis into Microsoft Office and Excel

What is CINET
- A web-based tool for analyzing networks that represent interactions in large-scale complex systems
- A large set of networks and algorithms to analyze networks
- Ability to add user networks and have them analyzed by the algorithms available in CINET
- The web-based interface has been designed to simplify the analysis of complex networks for users who are not necessarily computer scientists

CINET: Registration
Creating an account with GRANITE
- Go to the login page: http://cinet.vbi.vt.edu/granite/granite.html
- Click “Register” to create an account
- Fill in the “Request Account” form and click “Register Account”
- Use your username and password to log into the system

CINET: Structural organization
Client-server model

CINET: Architecture
Layered architecture

CINET: Apps
Tools in CINET
- Structural Analysis Tool (Granite)
  - 190+ networks (graphs)
  - 20+ network generators
  - 70+ network algorithms (measures): GaLib, SNAP (Stanford), NetworkX
  - Visualization of networks: Gephi
  - Service for adding new networks (graphs)
  - Service for adding new structural analysis tools (graph algorithms)
- Graph Dynamical System Calculator (GDSC)
  - Complete network dynamics on networks
  - Analyzing the phase structure of GDS; small graphs
  - 13 graph templates; 15 vertex function (behavior) families
- Simulation of Dynamics (EDISON)
  - Forward trajectory (dynamics) on networks
  - Compute (contagion) dynamics on larger networks: simulation
  - Services to manipulate attributed networks and to run simulations
  - Several contagion models: with and without interventions

CINET: Components
Computational engines and resources
- GaLib: provides efficient implementations of various classical and new graph algorithms that are motivated by the analysis of social contact graphs and disease dynamics on such graphs.
- NetworkX: a powerful Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
- Stanford Network Analysis Platform (SNAP): a general purpose network analysis and graph mining library.
- Both traditional high performance computing clusters, e.g., Shadowfax and Pecos (Virginia Tech), and cloud computing infrastructure, e.g., FutureGrid.
- An intelligent resource manager chooses an appropriate computing platform for each network analysis job, considering resource availability and the job's computational and memory requirements.

CINET: Features
Available features
- Network Analysis
- Network Generators
- Network List
- Measure List
- Visualization
- NetScript
- Dynamic Analysis (in upcoming versions)

CINET: Datasets
Networks
- Social, web/internet, biological, infrastructure and transportation, artificial, and other types of networks
- Currently 194 public datasets are available:
  - Amazon product co-purchasing
  - American College Football
  - DBLP Collaboration
  - Enron email
  - Gowalla friendship
  - Wikipedia Who-votes-on-whom
  - …
- Public networks are available to any CINET user
- Users can also upload their own datasets and make them public or private
- Two different representations of the networks are supported:
  - Adjacency list (GaLib) format
  - Edge list (NetworkX) format

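The difference between the two representations can be illustrated with a short sketch that converts an edge list into an adjacency list for an undirected graph. This shows the general data structures only; the exact on-disk GaLib and NetworkX file formats accepted by CINET are not specified here, and the function name `edge_list_to_adjacency` and the sample edges are assumptions made for illustration.

```python
from collections import defaultdict


def edge_list_to_adjacency(edges):
    """Convert undirected (u, v) pairs into {node: sorted list of neighbors}."""
    adjacency = defaultdict(set)
    for u, v in edges:
        # Each undirected edge contributes a neighbor entry in both directions.
        adjacency[u].add(v)
        adjacency[v].add(u)
    return {node: sorted(neighbors) for node, neighbors in adjacency.items()}


# Edge list: one (u, v) pair per edge.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]

# Adjacency list: one entry per node, listing its neighbors.
adjacency = edge_list_to_adjacency(edges)
for node in sorted(adjacency):
    print(node, adjacency[node])
```

An edge list stores one record per edge, while an adjacency list stores one record per node, which makes neighbor lookups direct at the cost of listing every undirected edge twice.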
CINET: Analysis Tools
Network analysis
- Graph Algorithms
  - Over 70 algorithms of various types, covering shortest paths, subgraph and motif counting, centrality, graph traversal, etc.
- Dynamic Analysis
  - Multiple simulation codes that provide different diffusion models and simulation capabilities.
  - Analysis of the phase structure of a graph dynamical system (e.g., dynamic phenomena such as rumors spreading through networks).
- Network Generators
  - Implementations of about 20 random and deterministic network generators, such as Barabási–Albert, Erdős–Rényi, small world, star graphs, etc.

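As a rough illustration of what a random and a deterministic generator do, the sketch below builds an Erdős–Rényi G(n, p) graph and a star graph as edge lists. It is a plain-Python sketch under stated assumptions (nodes labeled 0..n-1, undirected simple graphs); the function names are hypothetical and this is not CINET's implementation.

```python
import itertools
import random


def erdos_renyi(n, p, seed=None):
    """G(n, p): include each of the n*(n-1)/2 possible edges independently with probability p."""
    rng = random.Random(seed)  # a seed makes the random graph reproducible
    return [
        (u, v)
        for u, v in itertools.combinations(range(n), 2)
        if rng.random() < p
    ]


def star_graph(n):
    """Deterministic star: node 0 is the center, nodes 1..n-1 are leaves."""
    return [(0, leaf) for leaf in range(1, n)]


random_edges = erdos_renyi(100, 0.05, seed=42)
star_edges = star_graph(6)
print(len(random_edges), "edges in the random graph")
print(star_edges)
```

The quadratic loop over all node pairs is fine for small examples; generators meant for large sparse networks usually avoid enumerating every pair.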
CINET: Visualization
Network Visualization
- An integrated visualization module that supports a dynamic range of visualizations.
- Multiple layout algorithms: Random, Force Atlas, Yifan Hu, etc.
- Feature-based organization: node size and color can be determined by degree, betweenness, etc.
- Coloring communities: a community detection algorithm is applied so that different communities are shown in different colors.
- Vector graphics output (SVG).

CINET: Applications
Using CINET in education and research
- Network science courses
  - Virginia Tech, Blacksburg, VA
  - North Carolina A&T State University, Greensboro, NC
  - Jackson State University, Jackson, MS
  - University at Albany – State University of New York, Albany, NY
- Research
  - We the People (WtP) project: Web-enabled petitioning system
  - Other petitioning sites (change.org)
- Case studies

CINET: Summary
CINET in Context
- User interface: all user interaction
  - No need to program
  - No need for HPC resources
- Types of analysis
  - Network structural characteristics
  - Dynamics on networks
- Large networks
  - Generation
  - Analysis
- Multiple tools provided under a CINET umbrella
- Crowd-sourced platform
  - Self-sustaining
  - Self-managing
- Collaborative science
- Community resource

CINET: References
Papers and other publications
- Abdelhamid S, Alo R, Arifuzzaman S, Beckman P, Bhuiyan M, Bisset K, Fox E, Fox G, Hall K, Hasan S, Joshi A, Khan M, Kuhlman C, Lee S, Leidig J, Makkapati H, Marathe M, Mortveit H, Qiu J, Ravi S, Shams Z, Sirisaengtaksin O, Subbiah R, Swarup S, Trebon N, Vullikanti A, Zhao Z (2012) CINET: A CyberInfrastructure for Network Science. In The 8th IEEE International Conference on eScience, Chicago, IL, October 8-12, 2012.
- Abdelhamid S, Alam M, Alo R, Arifuzzaman S, Beckman P, Bhattacharjee T, Bhuiyan H, Bisset K, Eubank S, Esterline A, Fox E, Fox G, Hasan S, Hayatnagarkar H, Khan M, Kuhlman C, Marathe M, Meghanathan N, Mortveit H, Qiu J, Ravi S, Shams Z, Sirisaengtaksin O, Swarup S, Vullikanti A, Wu T (2014) CINET 2.0: A CyberInfrastructure for Network Science. In The 10th IEEE International Conference on eScience, 324-331.
- Abdelhamid et al. (2015) GDSCalc: A Web-Based Application for Evaluating Discrete Graph Dynamical Systems. PLOS One.
- …

CINET: Links
Useful links
- Main CINET page: http://cinet.vbi.vt.edu/
- Granite page: http://cinet.vbi.vt.edu/granite/granite.html
- Stanford Network Analysis Project: http://snap.stanford.edu/

CINET Hands-on Labs: Overview
- Exercise 1
  - Learn how to use CINET through the Granite interface
  - Compute simple network measures
- Exercise 2
  - Analyze a larger set of networks with CINET
  - Use the output of CINET to compute additional network measures and study correlations between graph parameters
- Exercise 3
  - Use CINET to visualize networks
  - Explore different layouts and visualization parameters

CINET Hands-on Labs: Exercise 1 Objectives
In this exercise
- Review networks and measures available in CINET
- Practice setting up network analysis and using different measures
- Compute three measures for each of the two networks (Dolphins Social Network in New Zealand and Erdős Collaboration Network). Fill in the following table:

| Network | # of nodes | # of edges | Density | # of triangles | Diameter |
|---|---|---|---|---|---|
| Dolphins | 62 | 159 | | | |
| Erdős | 6,927 | 11,850 | | | |

CINET Hands-on Labs: Exercise 1 Procedure
Follow these steps
1. Set up a new analysis (click the “+New Analysis” button and choose a name for the analysis).
2. In the search box under the “Networks” heading, type “Dolphins” (without the quote marks). The name of the network (“Dolphins Social Network in NZ”) appears below the search box.
3. Select the network by clicking on the check box. If necessary, additional networks can also be selected for analysis.
4. Click the “Continue” button above “Networks”; the system then displays the menu for “Add measure”.
5. In the search box under “Measures”, type “Density” (without the quote marks). The Density measure appears below the search box.
6. Select the measure by clicking on the check box. Multiple measures can be computed as part of the same analysis: also select the measures called “Compute the Number of Triangles” and “Find Diameter of a Graph” provided by CINET.
7. Click the “Analyze” button above “Measures”. The system starts the computation and displays its “Status”.
8. When the “Status” appears as “COMPLETED”, click on the “View Report” link. In the resulting window, click on the log.out link to see the answer, and record it in the Exercise 1 table.

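To see what the triangle-count and diameter measures compute, the following sketch evaluates both on a tiny hand-made graph. It assumes an undirected, connected, simple graph stored as a dictionary of neighbor sets; the function names and the five-node example are illustrative assumptions and not the algorithms CINET runs internally.

```python
from collections import deque
from itertools import combinations


def count_triangles(adj):
    """Count triangles: for each node, count pairs of its neighbors that are adjacent."""
    hits = 0
    for u in adj:
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                hits += 1
    # Every triangle is found once from each of its three corners.
    return hits // 3


def eccentricity(adj, source):
    """Largest shortest-path distance from source, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if len(dist) != len(adj):
        raise ValueError("graph is not connected, so the diameter is infinite")
    return max(dist.values())


def diameter(adj):
    """Diameter = maximum eccentricity over all nodes."""
    return max(eccentricity(adj, s) for s in adj)


# Toy graph: a triangle 1-2-3 with a tail 3-4-5.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print("triangles:", count_triangles(adj))
print("diameter:", diameter(adj))
```

Running one breadth-first search per node costs on the order of n·(n + m) operations, which is acceptable for networks the size of the Dolphins graph but slow for very large ones.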
CINET Hands-on Labs: Exercise 1 Outcome
Exercise review
- What kinds of networks are publicly available in CINET?
- What network analysis measures does CINET offer?
- Analysis results for the networks:

| Network | # of nodes | # of edges | Density | # of triangles | Diameter |
|---|---|---|---|---|---|
| Dolphins | 62 | 159 | ≈0.084082 | 95 | 8 |
| Erdős | 6,927 | 11,850 | ≈0.000494 | 5,973 | 4 |

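The density values in the table are consistent with the usual definition for an undirected simple graph, density = 2m / (n(n − 1)), where n is the number of nodes and m the number of edges. The sketch below recomputes them from the node and edge counts; that CINET's Density measure uses exactly this definition is an assumption, supported only by the fact that the numbers agree.

```python
def density(n, m):
    """Density of an undirected simple graph with n nodes and m edges."""
    return 2 * m / (n * (n - 1))


# (nodes, edges) taken from the Exercise 1 table.
networks = {"Dolphins": (62, 159), "Erdős": (6927, 11850)}

densities = {name: density(n, m) for name, (n, m) in networks.items()}
for name, value in densities.items():
    print(f"{name}: {value:.6f}")
```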
CINET Hands-on Labs: Exercise 2 Objectives
In this exercise
- Compute three measures for each of the five networks. Fill in the following table:

| Network | # of nodes | # of edges | Average node degree (Δ) | # of triangles (T) | Diameter (D) |
|---|---|---|---|---|---|
| Autonomous systems - Oregon-1-010407 | 10,729 | 21,999 | $\Delta_1 =$ | $T_1 =$ | $D_1 =$ |
| Erdős Collaboration Network | 6,927 | 11,850 | $\Delta_2 =$ | $T_2 =$ | $D_2 =$ |
| Autonomous systems - Oregon-1-010331 | 10,670 | 22,002 | $\Delta_3 =$ | $T_3 =$ | $D_3 =$ |
| Autonomous systems - Oregon-2-010331 | 10,900 | 31,180 | $\Delta_4 =$ | $T_4 =$ | $D_4 =$ |
| Enron Giant Component | 33,696 | 180,811 | $\Delta_5 =$ | $T_5 =$ | $D_5 =$ |

- Determine whether certain pairs of graph measures are correlated, using the Pearson Correlation Coefficient (PCC) as the measure of correlation.
- Draw scatter plots.

CINET Hands-on Labs: Exercise 2 Pearson Correlation Coefficient

Suppose we are given a data sample consisting of $n \ge 2$ pairs of numbers $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Let $\bar{x}$ and $\bar{y}$ denote, respectively, the mean values of the sets $X = \{x_1, x_2, \ldots, x_n\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$; that is,

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}.$$

The Pearson Correlation Coefficient (PCC) $r$ for the sample is given by

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},$$

where positive square roots are used for both terms in the denominator. The formula is defined only when neither $X$ nor $Y$ is constant; otherwise the denominator is zero.

The PCC value $r$ defined above satisfies $-1 \le r \le 1$. The value $r = 1$ indicates that a linear equation describes the relationship between the two sets exactly, with $Y$ values increasing as the $X$ values increase. Similarly, $r = -1$ indicates an exact linear relationship with $Y$ values decreasing as the $X$ values increase. The value $r = 0$ indicates that $X$ and $Y$ are not linearly correlated.

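The PCC definition translates almost line by line into code. The following is a minimal sketch using only the Python standard library; the function name `pearson` and the small test samples are assumptions chosen for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson Correlation Coefficient of paired samples xs and ys."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two pairs are required")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Numerator: sum of products of deviations from the means.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("PCC is undefined when one of the samples is constant")

    return numerator / (math.sqrt(ss_x) * math.sqrt(ss_y))


print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))   # exact increasing linear relation
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))   # exact decreasing linear relation
print(pearson([1, 2, 3], [1, 0, 1]))         # symmetric sample, no linear trend
```

The explicit check for a constant sample guards against the division by zero that the formula would otherwise produce.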
CINET Hands-on Labs: Exercise 2 Procedure
Follow these steps
1. Set up a new analysis in CINET and select the appropriate networks.
2. For each of the five networks, find the average node degree (use a measure called “Degree Statistics” provided by CINET), the number of triangles, and the diameter. Since each of the five networks is connected, all five diameter values should be finite.
3. Compute the PCC value $r_1$ for the sample $(\Delta_1, T_1), (\Delta_2, T_2), \ldots, (\Delta_5, T_5)$ using a tool of your choice (a calculator, an Excel spreadsheet, a simple program, etc.).
4. Compute the PCC value $r_2$ for the sample $(\Delta_1, D_1), (\Delta_2, D_2), \ldots, (\Delta_5, D_5)$ using a tool of your choice.
5. Prepare two scatter plots, one showing the pairs $(\Delta_1, T_1), (\Delta_2, T_2), \ldots, (\Delta_5, T_5)$ and the other showing the pairs $(\Delta_1, D_1), (\Delta_2, D_2), \ldots, (\Delta_5, D_5)$. In each case, show the Δ values along the X axis and the other value along the Y axis.

CINET Hands-on Labs: Exercise 2 Outcome
Exercise review
- Is there a correlation between the network measures you computed? If so, what kind of correlation is it, and why? What does it tell you about the networks?
- Analysis results for the five networks:

| Network | # of nodes | # of edges | Average node degree (Δ) | # of triangles (T) | Diameter (D) |
|---|---|---|---|---|---|
| Autonomous systems - Oregon-1-010407 | 10,729 | 21,999 | $\Delta_1 = 4.101$ | $T_1 = 15{,}834$ | $D_1 = 12$ |
| Erdős Collaboration Network | 6,927 | 11,850 | $\Delta_2 = 3.421$ | $T_2 = 5{,}973$ | $D_2 = 4$ |
| Autonomous systems - Oregon-1-010331 | 10,670 | 22,002 | $\Delta_3 = 4.124$ | $T_3 = 17{,}144$ | $D_3 = 10$ |
| Autonomous systems - Oregon-2-010331 | 10,900 | 31,180 | $\Delta_4 = 5.721$ | $T_4 = 82{,}856$ | $D_4 = 9$ |
| Enron Giant Component | 33,696 | 180,811 | $\Delta_5 = 10.732$ | $T_5 = 725{,}311$ | $D_5 = 13$ |

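The average node degree column can be cross-checked without CINET, because in an undirected graph every edge contributes to the degree of two nodes, so Δ = 2m / n. The sketch below recomputes Δ from the node and edge counts in the table and compares it with the reported value; treating these networks as undirected simple graphs is an assumption, and the variable names are illustrative.

```python
def average_degree(n, m):
    """Average node degree of an undirected graph with n nodes and m edges."""
    return 2 * m / n


# name: (nodes, edges, reported average degree), copied from the results table.
table = {
    "Autonomous systems - Oregon-1-010407": (10729, 21999, 4.101),
    "Erdős Collaboration Network": (6927, 11850, 3.421),
    "Autonomous systems - Oregon-1-010331": (10670, 22002, 4.124),
    "Autonomous systems - Oregon-2-010331": (10900, 31180, 5.721),
    "Enron Giant Component": (33696, 180811, 10.732),
}

computed = {name: average_degree(n, m) for name, (n, m, _) in table.items()}
for name, (_, _, reported) in table.items():
    print(f"{name}: computed {computed[name]:.3f}, reported {reported}")
```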
CINET Hands-on Labs: Exercise 2 Outcome
Exercise review
- The PCC values are $r_1 \approx 0.981563$ and $r_2 \approx 0.607290$
- Scatter plots: [figures]

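As a sanity check on the reported coefficients, the sketch below applies the PCC formula to the Δ, T, and D values from the Exercise 2 results table. Because the Δ values in the table are rounded to three decimals, the output should be close to, but not necessarily identical with, the slide's six-decimal figures. The helper name `pcc` and the list names are assumptions made for illustration.

```python
import math

# Values copied from the Exercise 2 results table (same network order).
avg_degree = [4.101, 3.421, 4.124, 5.721, 10.732]   # Δ_1 .. Δ_5
triangles = [15834, 5973, 17144, 82856, 725311]     # T_1 .. T_5
diameters = [12, 4, 10, 9, 13]                      # D_1 .. D_5


def pcc(xs, ys):
    """Pearson Correlation Coefficient for two equally long, non-constant samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = math.sqrt(sum((x - mean_x) ** 2 for x in xs)) * math.sqrt(
        sum((y - mean_y) ** 2 for y in ys)
    )
    return numerator / denominator


r1 = pcc(avg_degree, triangles)   # average degree vs. number of triangles
r2 = pcc(avg_degree, diameters)   # average degree vs. diameter
print(f"r1 = {r1:.4f}, r2 = {r2:.4f}")
```

With only five data points, a single network with extreme values (here the Enron giant component) can dominate the coefficient, which is worth keeping in mind when interpreting $r_1$.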
CINET Hands-on Labs: Exercise 3 Objectives
In this exercise
- Review layout algorithms and visualization parameters available in CINET
- Create visualizations for the following networks:

| Network | # of nodes | # of edges |
|---|---|---|
| Karate | 34 | 78 |
| American College Football | 115 | 613 |
| Amazon product co-purchasing | 262,111 | 617,438 |

CINET Hands-on Labs: Exercise 3 Procedure
Follow these steps
1. Switch to the “Networks” tab.
2. In the search box under the “Networks” heading, type “Karate” (without the quote marks). The name of the network (“Karate network”) appears below the search box. Click the network to select it.
3. Set up a new visualization (click the “+Add Visualization” button and choose a name for the visualization).
4. Select “Random” as the Layout Algorithm. Click the “Generate” button at the bottom of the screen to produce the visualization. The system starts creating the visualization and displays the “Viz request submitted” status.
5. Click the “Visualization” link to switch to the visualization pane. If the system is still displaying the “QUEUED”, “RUNNING”, or “DOWNLOADING RESULTS” prompt, wait until rendering is done. Check the status by clicking on the visualization name to refresh the pane.
6. Click on the network visualization to view it in a vector format (SVG). Save the SVG file on your local filesystem.
7. Create additional visualizations for the same network with the following parameters:

| Layout | Node | Node size | Node Min Size | Node Max Size |
|---|---|---|---|---|
| Random | Degree | None | 1 | 10 |
| Force Atlas | Modularity | | 5 | |

8. Once you have multiple visualizations, you can switch between them by clicking on visualization names in the pane header.
9. Follow the same procedure to create visualizations for other networks.

CINET Hands-on Labs: Exercise 3 Outcome
Exercise review
- What layout algorithms does CINET offer?
- What are some possible ways of reducing clutter when visualizing large networks?
- Network visualizations:
  - Karate, Force Atlas layout, modularity node parameter, node size: degree, node min size: 5, node max size: 10
  - American Football, Force Atlas layout, modularity node parameter, node size: degree, node min size: 5, node max size: 10
  - Karate, random layout, degree node parameter

CINET Hands-on Labs: Exercise 3 Outcome
Exercise review
- Network visualizations:
  - Amazon product co-purchasing, Force Atlas layout, modularity node parameter, node size: degree, node min size: 5, node max size: 10