Network Science Class 2: Graph Theory (Ch2) Albert-László Barabási with Roberta Sinatra and Sean P. Cornelius www.BarabasiLab.com.

Questions
1: The Bridges of Königsberg and mapping/defining a network.
5: WWW: Tell us about its characteristics relying on the concepts we learned in Chapter 2 (synthesis).
2: Degree, degree distribution.
8: Directed vs. undirected networks (synthesis).
4: Paths, distances and connectedness.
3: Adjacency matrices and sparseness.
6: Bipartite networks/Clustering coefficient.
7: Weighted networks/Value of a network.

Next Class Reading
For Wednesday, read: Ch3; Watts and Strogatz; Milgram.

FINAL PROJECTS

COMPONENTS OF THE PROJECT
1. DATA ACQUISITION: Downloading the data and putting it in a usable format.
2. NETWORK REPRESENTATION: What are the nodes and links?
3. NETWORK ANALYSIS: What questions do you want to answer with this network, and which tools/measurements will you use?

DATA ACQUISITION
Many online data sources will have an API (application programming interface) that allows querying and downloading the data in a targeted way.
Example: What are all movies from starring Kevin Bacon and distributed by Paramount Pictures?
This is done either through a web interface or through a library within a programming language.
Other sources will provide raw bulk data (e.g., Excel spreadsheets) that require processing, either manually or through a program you will write.
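As a rough illustration of the "program you will write" for bulk data, the sketch below parses a small CSV export and answers the Kevin Bacon question from the slide. The column names, the three sample rows, and the helper names `read_credits` and `movies_with` are assumptions made for this example; a real source will have its own layout.

```python
import csv
import io

# Toy bulk export (assumed layout): one row per (movie, actor) credit.
RAW = """movie,actor,distributor
Footloose,Kevin Bacon,Paramount Pictures
Apollo 13,Kevin Bacon,Universal Pictures
Apollo 13,Tom Hanks,Universal Pictures
"""

def read_credits(text):
    """Parse CSV text into a list of dicts with stripped values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: value.strip() for key, value in row.items()} for row in reader]

def movies_with(rows, actor, distributor):
    """Return sorted titles featuring `actor` and released by `distributor`."""
    return sorted(
        {r["movie"] for r in rows
         if r["actor"] == actor and r["distributor"] == distributor}
    )

credit_rows = read_credits(RAW)
bacon_paramount = movies_with(credit_rows, "Kevin Bacon", "Paramount Pictures")
print(bacon_paramount)
```

The same filtering step is what an API query would do on the server side; with bulk data it simply happens in your own code.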

NETWORK RECONSTRUCTION
Most datasets will admit more than one representation as a network.
Some representations will be more or less informative than others.
Figuring out the “network” that’s buried in your data is part of your project!

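NETWORK RECONSTRUCTION
Suppose you have a list of students and the courses they are registered for.
[Figure: two network representations of the same data, labeled "One possible network" and "Another possibility", drawn with the students Joe, Jane and Sam and the courses PHYS 5116 and BIO 1234.]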
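One way to make the two representations concrete is sketched below: a bipartite student-course network, and a students-only network in which two students are linked if they share a course. Which student takes which course is not recoverable from the slide, so the `registrations` table is an assumption for illustration only.

```python
from itertools import combinations

# Assumed registration list (the actual assignments are not in the slide).
registrations = {
    "Joe": ["PHYS 5116", "BIO 1234"],
    "Jane": ["BIO 1234"],
    "Sam": ["PHYS 5116"],
}

# Representation 1: bipartite network, each link joins a student to a course.
bipartite_links = sorted(
    (student, course)
    for student, courses in registrations.items()
    for course in courses
)

# Representation 2: students only, linked if they share at least one course.
course_to_students = {}
for student, course in bipartite_links:
    course_to_students.setdefault(course, []).append(student)

student_links = set()
for students in course_to_students.values():
    for a, b in combinations(sorted(students), 2):
        student_links.add((a, b))

print(bipartite_links)
print(sorted(student_links))
```

The second network discards which course created each link, which is one reason some representations are less informative than others.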

Books

Like IMDB for books (contains books, ratings, reviews, recommendations, etc.).
API available.
Potential areas of investigation:
Similarity network of books
Community detection (discovering genres)

Comics: global comics database

Comics
Many different data about each comic, e.g.:
Publisher
Who wrote the script/penciled/inked
Publication date
Wiki and advanced search interface available.
Potential areas of investigation:
Comics linked by common characters
Collaboration network between artists

Mendeley

Mendeley
Large scientific publication database/social network for researchers.
API available (dev.mendeley.com).
Idea: use readership to assign authorship credit.
Data consist of user profiles + papers the user has read.
Publications (nodes) are linked if they are both present in one or more users’ lists.
Use recently developed techniques to infer authorship credit based on user perception.
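The linking rule for publications can be sketched as a co-occurrence count. The reading lists below are invented, and using the number of shared readers as a link weight is an assumption that goes slightly beyond the slide, which only requires one or more shared lists.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reading lists: user -> set of paper IDs (invented data).
reading_lists = {
    "user1": {"paperA", "paperB", "paperC"},
    "user2": {"paperA", "paperB"},
    "user3": {"paperC"},
}

# Assumed weight of a link: number of users whose lists contain both papers.
co_readers = Counter()
for papers in reading_lists.values():
    for pair in combinations(sorted(papers), 2):
        co_readers[pair] += 1

# The unweighted network from the slide keeps every pair with weight >= 1.
paper_links = sorted(co_readers)
print(co_readers)
```

Sorting each list before pairing keeps `("paperA", "paperB")` and `("paperB", "paperA")` from being counted as different links.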

3D Printing (1)

How to lay out and visualize a network in two-dimensional space is a well-developed field.
Less clear is how to embed a network in 3D so it can be 3D printed.
Things to consider:
Make sure most nodes are distinguishable.
Prevent “collisions” between links.
Make sure the overall result is structurally sound.
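As a possible starting point, the sketch below extends a standard force-directed layout (in the spirit of Fruchterman-Reingold) from two to three coordinates. The function name `spring_layout_3d`, the ideal link length `k`, and the step schedule are assumptions chosen for illustration. It only spreads nodes apart and pulls linked nodes together; it does not check for link collisions or structural soundness.

```python
import math
import random

def spring_layout_3d(nodes, links, steps=200, seed=0):
    """Small force-directed layout in 3D (illustrative, not print-ready)."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1) for _ in range(3)] for n in nodes}
    k = 1.0  # assumed ideal link length
    for step in range(steps):
        temp = 0.1 * (1 - step / steps)  # maximum move, shrinking over time
        disp = {n: [0.0, 0.0, 0.0] for n in nodes}
        # Repulsion between every pair keeps nodes distinguishable.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                d = [pa - pb for pa, pb in zip(pos[a], pos[b])]
                dist = max(math.sqrt(sum(x * x for x in d)), 1e-9)
                force = k * k / dist
                for j in range(3):
                    disp[a][j] += d[j] / dist * force
                    disp[b][j] -= d[j] / dist * force
        # Attraction along links pulls neighbours together.
        for a, b in links:
            d = [pa - pb for pa, pb in zip(pos[a], pos[b])]
            dist = max(math.sqrt(sum(x * x for x in d)), 1e-9)
            force = dist * dist / k
            for j in range(3):
                disp[a][j] -= d[j] / dist * force
                disp[b][j] += d[j] / dist * force
        # Move each node along its net force, capped by the temperature.
        for n in nodes:
            length = max(math.sqrt(sum(x * x for x in disp[n])), 1e-9)
            move = min(length, temp)
            for j in range(3):
                pos[n][j] += disp[n][j] / length * move
    return pos

nodes = ["a", "b", "c", "d"]
links = [("a", "b"), ("b", "c"), ("c", "d")]
positions = spring_layout_3d(nodes, links)
```

The resulting coordinates would still need to be translated into whatever format the printing facility's software expects.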

3D Printing (2): C. elegans connectome

3D Printing (2)
The C. elegans neural network is the most accurately mapped nervous system, with 279 neurons and 95 muscles connected by about 3500 links. We want to be able to represent and print this in 3D in an informative way.
Challenges:
The network is dense, so we need to avoid a “hairball”.
The representation needs to distinguish (e.g. with different colors) different types of nodes (sensory neurons, interneurons, motor neurons and muscles) and links (directed synapses and undirected junctions).
Known subnetworks need to be clearly identifiable.
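A quick back-of-the-envelope calculation shows why a hairball is a concern. It uses only the numbers quoted on the slide and makes two simplifying assumptions: muscles count as nodes, and all of the roughly 3500 links are treated as undirected.

```python
# Numbers quoted on the slide.
neurons = 279
muscles = 95
links = 3500  # "about 3500", treated here as undirected (assumption)

n_nodes = neurons + muscles                      # 374 nodes in total
avg_degree = 2 * links / n_nodes                 # each link has two endpoints
max_links = n_nodes * (n_nodes - 1) / 2          # links in a complete network
density = links / max_links                      # fraction of possible links

print(f"nodes={n_nodes}, <k>={avg_degree:.1f}, density={density:.3f}")
```

Under these assumptions each node has roughly 19 links on average and about 5% of all possible links are present, which is enough to clutter a naive drawing.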

3D Printing - Notes
The 3D printing projects require that the students get in touch with a 3D printing facility to get instructions on the software they use and other details about how to print their network.
Learning the relevant 3D layout language and translating the network structure into this format is a key part of these projects.
NEU has a 3D print studio at Snell Library (see dmc.northeastern.edu).

GDELT

It is a dataset monitoring news (broadcast, print and web) from 1979 to today across the entire world. It identifies names, places, organizations, emotions, counts. It offers raw data files and/or the possibility of querying a database.
Projects: (i) study the individual-individual network (two individuals are connected if they appear in the same news item) over time, and see how leaders emerge; (ii) study the network of locations, with two locations connected if the same news is reported in both. How does news travel across space?
The dataset can be used for many more projects!
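Project (i) could start from something like the sketch below, which builds one co-occurrence network per year and tracks each person's degree over time. The `news_items` records and the placeholder names are invented; the real dataset has its own schema and would need to be mapped onto this shape first.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical extract: (year, people mentioned in one news item).
news_items = [
    (2001, ["person A", "person B"]),
    (2001, ["person A", "person C"]),
    (2002, ["person B", "person C"]),
    (2002, ["person B", "person D"]),
    (2002, ["person B", "person A"]),
]

# One link set per year: two people are linked if they share a news item.
links_by_year = defaultdict(set)
for year, people in news_items:
    for a, b in combinations(sorted(set(people)), 2):
        links_by_year[year].add((a, b))

def degrees(links):
    """Number of distinct neighbours of each person in one yearly snapshot."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {person: len(ns) for person, ns in neighbours.items()}

degree_by_year = {year: degrees(links) for year, links in links_by_year.items()}
print(degree_by_year)
```

In this toy data the best-connected person changes from one year to the next, which is the kind of signal a study of emerging leaders would look for.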

Baseball

Baseball
Extensive database of statistics, at the player level (individual stats) and at the team level (team compositions, hall of fame, managers, etc.).
WARNING: Roberta and Sean know nothing about baseball.
Nonetheless, possible research directions:
Are there characteristics of the network that distinguish hall-of-famers?
Mobility of players/managers across teams.

Final project guidelines
Measure: N(t), L(t) (t is time, if you have a time-dependent system); P(k) (degree distribution); average path length; C (clustering coefficient), C_rand, C(k); visualization/communities; P(w) if you have a weighted network; network robustness (if appropriate); spreading (if appropriate).
It is not sufficient to measure things; you need to discuss the insights they offer: What did you learn from each quantity you measured? What was your expectation? How do the results compare to your expectations?
Time frame will be strictly enforced: approx. 12 min + 3 min questions.
No need to write a report; you will hand in the presentation.
Send us an email with names/titles/program.
Come early and try out your slides with the projector.
Show an entry of the data source, just to give a sense of what the source looks like.
On the slide, give your program/name.
Grading criteria: use of network tools (completeness/correctness); ability to extract information/insights from your data using the network tools; overall quality of the project/presentation.
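Several of the quantities listed in the guidelines can be computed with a few lines of standard-library Python. The sketch below uses a four-node toy network invented for illustration, and takes C_rand to be the clustering expected in a random network with the same N and L, i.e. the average degree divided by N - 1; that reading of C_rand is an assumption.

```python
from collections import Counter, deque

# Toy undirected network as an adjacency dict (invented for illustration).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

N = len(graph)
L = sum(len(nbrs) for nbrs in graph.values()) // 2  # each link counted twice

# Degree distribution P(k): fraction of nodes with degree k.
degree_counts = Counter(len(nbrs) for nbrs in graph.values())
P = {k: count / N for k, count in degree_counts.items()}

def distances_from(source):
    """Breadth-first search distances from one node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

# Average path length over all ordered pairs of distinct, connected nodes.
pair_distances = [
    d for s in graph for t, d in distances_from(s).items() if t != s
]
avg_path_length = sum(pair_distances) / len(pair_distances)

def local_clustering(node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    linked = sum(
        1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in graph[nbrs[i]]
    )
    return 2 * linked / (k * (k - 1))

C = sum(local_clustering(n) for n in graph) / N
C_rand = (2 * L / N) / (N - 1)  # assumed definition: <k> / (N - 1)

print(N, L, P, avg_path_length, C, C_rand)
```

For real project data a network library will be faster, but writing the measures out once makes it easier to explain in the presentation what each number means.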