CompSci The Internet
- How valuable is a network?
  - Metcalfe’s Law
- Domain Name System: translates between names and IP addresses
- Properties of the Internet
  - Heterogeneity
  - Redundancy
  - Packet-switched
  - 1.08 billion online (Computer Industry Almanac 2005)
- Who has access?
- How important is access?
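Metcalfe’s Law is usually stated as: the value of a network grows roughly with the square of the number of users, because n users can form n(n-1)/2 distinct pairs. A minimal sketch of that count follows; the function name is illustrative, and "value" is assumed here to mean nothing more than the number of possible pairwise connections.

```python
def pairwise_connections(n):
    """Number of distinct pairs among n users: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Ten times the users gives roughly a hundred times the possible connections.
for users in (10, 100, 1000):
    print(users, pairwise_connections(users))
```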
Tim Berners-Lee
“I want you to realize that, if you can imagine a computer doing something, you can program a computer to do that. Unbounded opportunity... limited only by your imagination. And a couple of laws of physics.”
- TCP/IP, HTTP
  - How, Why, What, When?
Graphs: Structures and Algorithms
- How do packets of bits/information get routed on the Internet?
  - Message divided into packets on client (your) machine
  - Packets sent out using routing tables toward destination
    - Packets may take different routes to destination
    - What happens if packets are lost or arrive out of order?
  - Routing tables store local information, not global (why?)
- What about the Oracle of Bacon, Erdos numbers, and word ladders?
  - All can be modeled using graphs
  - What kind of connectivity does each concept model?
- Graphs are everywhere in the world of algorithms (world?)
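One way to think about the lost and out-of-order question is to give every packet a sequence number. The sketch below is a toy model, not TCP itself: real TCP numbers bytes rather than whole packets and adds acknowledgments and retransmission. All function names are illustrative.

```python
import random


def to_packets(message, size):
    """Split message into (sequence_number, chunk) pairs of at most `size` characters."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]


def missing_packets(received, total):
    """Sequence numbers in 0..total-1 that never arrived."""
    return sorted(set(range(total)) - {seq for seq, _ in received})


def reassemble(received):
    """Put packets back in sequence order and join their chunks."""
    return "".join(chunk for _, chunk in sorted(received))


message = "Packets may take different routes to destination"
packets = to_packets(message, size=8)

arrived = packets[:]
random.shuffle(arrived)  # simulate out-of-order arrival
print(reassemble(arrived) == message)

lost_one = [p for p in packets if p[0] != 2]  # simulate one lost packet
print(missing_packets(lost_one, len(packets)))  # numbers the receiver could request again
```

Sorting by sequence number handles reordering, and the gaps in the numbering tell the receiver which packets to ask for again.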
Vocabulary
- Graphs are collections of vertices and edges (vertex also called node)
  - Edge connects two vertices
    - Direction can be important: directed edge, directed graph
    - Edge may have associated weight/cost
- A vertex sequence v_0, v_1, …, v_{n-1} is a path where v_k and v_{k+1} are connected by an edge
  - If some vertex is repeated, the path contains a cycle
  - A graph is connected if there is a path between any pair of vertices
[Figure: example graphs with vertices NYC, Phil, Boston, Wash DC and LGA, LAX, ORD, DCA; edge weights $186, $412, $1701, $441]
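The vocabulary above maps directly onto a small data structure. The sketch below stores an undirected weighted graph as a dictionary of dictionaries. It reuses the airport codes and prices from the figure, but which price belongs to which route is not recoverable from the slide, so the pairing here is an assumption made purely for illustration.

```python
def add_edge(graph, u, v, weight):
    """Add an undirected edge u-v with the given weight (cost)."""
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})[u] = weight


def is_path(graph, vertices):
    """True if each consecutive pair of vertices is connected by an edge."""
    return all(b in graph.get(a, {}) for a, b in zip(vertices, vertices[1:]))


def path_cost(graph, vertices):
    """Sum of edge weights along a path (assumes is_path is True)."""
    return sum(graph[a][b] for a, b in zip(vertices, vertices[1:]))


flights = {}
# Hypothetical route/price pairing; only the codes and prices come from the slide.
add_edge(flights, "LGA", "ORD", 186)
add_edge(flights, "ORD", "LAX", 412)
add_edge(flights, "LGA", "DCA", 441)
add_edge(flights, "DCA", "LAX", 1701)

print(is_path(flights, ["LGA", "ORD", "LAX"]))
print(path_cost(flights, ["LGA", "ORD", "LAX"]))
print(is_path(flights, ["LGA", "LAX"]))  # no direct edge in this toy graph
```

For a directed graph, `add_edge` would store only the u-to-v entry.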
Network/Graph questions/algorithms
- What vertices are reachable from a given vertex?
  - Two standard traversals: depth-first, breadth-first
  - Find connected components, groups of connected vertices
- Shortest path between any two vertices (weighted graphs?)
- Longest path in a graph
  - No known efficient algorithm
  - Longest shortest path: diameter of graph
- Visit all vertices without repeating? Visit all edges?
  - With minimal cost? Hard!
- What are the properties of the network?
  - Structural: Is it connected?
  - Statistical: What is the average number of neighbors?
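The reachability and connected-components questions can be sketched with a breadth-first traversal. This assumes an undirected graph stored as a dictionary that maps every vertex to a list of its neighbors; the graph and function names are made up for illustration.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first search: the set of vertices reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in graph[vertex]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def connected_components(graph):
    """Group the vertices of an undirected graph into connected components."""
    components = []
    assigned = set()
    for vertex in graph:
        if vertex not in assigned:
            component = reachable(graph, vertex)
            components.append(component)
            assigned |= component
    return components


graph = {
    "A": ["B", "C"], "B": ["A"], "C": ["A"],
    "D": ["E"], "E": ["D"],
    "F": [],
}
print(reachable(graph, "A"))
print(len(connected_components(graph)))
```

Replacing `queue.popleft()` with `queue.pop()` turns the queue into a stack and visits vertices in a depth-first order; the set of reachable vertices is the same either way.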
Network Nature of Society
- Slides from Michael Kearns, Univ. of Pennsylvania
Emerging science of networks
- Examining apparent similarities between many human and technological systems & organizations
- Importance of network effects in such systems
- How things are connected matters greatly
- Structure, asymmetry and heterogeneity
- Details of interaction matter greatly
- The metaphor of viral spread
- Dynamics of economic and strategic interaction
- Qualitative and quantitative; can be very subtle
- A revolution of
  - measurement
  - theory
  - breadth of vision
(M. Kearns)
“Real World” Social Networks
- Example: Acquaintanceship networks
  - vertices: people in the world
  - links: have met in person and know last names
  - hard to measure
- Example: scientific collaboration
  - vertices: math and computer science researchers
  - links: between coauthors on a published paper
  - Erdos numbers: distance to Paul Erdos
  - Erdos was definitely a hub or connector; had 507 coauthors
  - how do we navigate in such networks?
(M. Kearns)
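An Erdos number is a shortest-path distance in the coauthorship graph, so it can be computed with a breadth-first search that records how many links each person is from the source. The sketch below uses an invented coauthor graph; every name other than Erdos is a placeholder, not real collaboration data.

```python
from collections import deque


def distances_from(graph, source):
    """Breadth-first search: number of links from source to each reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for coauthor in graph[person]:
            if coauthor not in dist:
                dist[coauthor] = dist[person] + 1
                queue.append(coauthor)
    return dist


# Hypothetical coauthorship graph (undirected, stored in both directions).
coauthors = {
    "Erdos": ["Ann", "Bob"],
    "Ann": ["Erdos", "Cat"],
    "Bob": ["Erdos"],
    "Cat": ["Ann", "Dan"],
    "Dan": ["Cat"],
    "Eve": [],  # no coauthors, so no finite Erdos number
}
erdos_numbers = distances_from(coauthors, "Erdos")
print(erdos_numbers)
```

People who never appear in the result, like the placeholder "Eve", are not connected to the source at all.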
Online Social Networks
- A somewhat recent example: Friendster
  - vertices: subscribers to
  - links: created via deliberate invitation
- More recent and interesting: thefacebook
  - Join the Computer Science 1 group!
- Older example: social interaction in LambdaMOO
  - LambdaMOO: chat environment with “emotes” or verbs
  - vertices: LambdaMOO users
  - links: defined by chat and verb exchange
  - could also examine “friend” and “foe” sub-networks
(M. Kearns)
Content Networks
- Example: document similarity
  - vertices: documents on the web
  - links: defined by document similarity (e.g. Google)
  - here’s a very nice visualization
  - not the web graph, but an overlay content network
- Of course, every good scandal needs a network
  - vertices: CEOs, spies, stock brokers, other shifty characters
  - links: co-occurrence in the same article
- Then there are conceptual networks
  - a thesaurus defines a network
  - so do the interactions in a mailing list
(M. Kearns)
Business and Economic Networks
- Example: eBay bidding
  - vertices: eBay users
  - links: represent bidder-seller or buyer-seller
  - fraud detection: bidding rings
- Example: corporate boards
  - vertices: corporations
  - links: between companies that share a board member
- Example: corporate partnerships
  - vertices: corporations
  - links: represent formal joint ventures
- Example: goods exchange networks
  - vertices: buyers and sellers of commodities
  - links: represent “permissible” transactions
(M. Kearns)
Physical Networks
- Example: the Internet
  - vertices: Internet routers
  - links: physical connections
  - vertices: Autonomous Systems (e.g. ISPs)
  - links: represent peering agreements
  - latter example is both physical and business network
- Compare to more traditional data networks
- Example: the U.S. power grid
  - vertices: control stations on the power grid
  - links: high-voltage transmission lines
  - August 2003 blackout: classic example of interdependence
(M. Kearns)
US Power Grid
Business & Economic Networks
- Example: eBay bidding
  - vertices: eBay users
  - links: represent bidder-seller or buyer-seller
  - fraud detection: bidding rings
- Example: corporate boards
  - vertices: corporations
  - links: between companies that share a board member
- Example: corporate partnerships
  - vertices: corporations
  - links: represent formal joint ventures
- Example: goods exchange networks
  - vertices: buyers and sellers of commodities
  - links: represent “permissible” transactions
Content Networks
- Example: Document similarity
  - Vertices: documents on web
  - Edges: Weights defined by similarity
  - See TouchGraph GoogleBrowser
- Conceptual network: thesaurus
  - Vertices: words
  - Edges: synonym relationships
Enron
Social networks
- Example: Acquaintanceship networks
  - vertices: people in the world
  - links: have met in person and know last names
  - hard to measure
- Example: scientific collaboration
  - vertices: math and computer science researchers
  - links: between coauthors on a published paper
  - Erdos numbers: distance to Paul Erdos
  - Erdos was definitely a hub or connector; had 507 coauthors
- How do we navigate in such networks?
Acquaintanceship & more
Network Models (Barabasi)
- Differences between Internet, Kazaa, Chord
  - Building, modeling, predicting
- Static networks, Dynamic networks
  - Modeling and simulation
- Random and Scale-free
  - Implications?
- Structure and Evolution
  - Modeling via Touchgraph
Web-based social networks
- Myspace: 73,000,000
- Passion.com: 23,000,000
- Friendster: 21,000,000
- Black Planet: 17,000,000
- Facebook: 8,000,000
- Who’s using these, what are they doing, how often are they doing it, why are they doing it?
Golbeck’s Criteria
- Accessible over the web via a browser
- Users explicitly state relationships
  - Not mined or inferred
- Relationships visible and browsable by others
  - Reasons?
- Support for users to make connections
  - Simple HTML pages don’t suffice
CSE 112, Networked Life (UPenn)
- Find the person in Facebook with the most friends
  - Document your process
- Find the person with the fewest friends
  - What does this mean?
- Search for profiles with some phrase that yields matches
  - Graph degrees/friends: what is the distribution?
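Once friend lists have been collected, the questions above reduce to counting degrees. The sketch below assumes the data is already in a dictionary mapping each person to a set of friends; the names and friendships are invented for illustration, not real profile data.

```python
from collections import Counter


def degree_distribution(graph):
    """Map each degree (number of friends) to how many people have that degree."""
    return Counter(len(friends) for friends in graph.values())


# Hypothetical friendship graph (undirected, stored in both directions).
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann"},
    "cat": {"ann", "dan"},
    "dan": {"ann", "cat"},
    "eve": set(),
}

most = max(friends, key=lambda person: len(friends[person]))
fewest = min(friends, key=lambda person: len(friends[person]))
distribution = degree_distribution(friends)

print(most, fewest)
for degree in sorted(distribution):
    print(degree, distribution[degree])
```

Plotting degree against count for a large sample is one way to see whether a few hubs account for most of the links.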
CompSci 1: Overview CS0
- Audioscrobbler and last.fm
  - Collaborative filtering
  - What is a neighbor?
  - What is the network?
What can we do with real data?
- How do we find a graph’s diameter?
  - This is the maximal shortest path between any pair of vertices
  - Can we do this in big graphs?
- What is the center of a graph?
  - From rumor mills to terrorists
  - How is this related to diameter?
- Demo GUESS (as augmented at Duke)
  - IM data, Audioscrobbler data
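Diameter and center can both be read off the eccentricity of each vertex, meaning its distance to the farthest other vertex. The sketch below assumes a connected, unweighted, undirected graph and takes "center" to mean the vertices of minimum eccentricity, which is one common definition. It runs one breadth-first search per vertex, which hints at why very large graphs are a problem.

```python
from collections import deque


def bfs_distances(graph, source):
    """Number of edges on a shortest path from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbor in graph[vertex]:
            if neighbor not in dist:
                dist[neighbor] = dist[vertex] + 1
                queue.append(neighbor)
    return dist


def eccentricities(graph):
    """Distance from each vertex to the vertex farthest from it (graph assumed connected)."""
    return {v: max(bfs_distances(graph, v).values()) for v in graph}


def diameter(graph):
    """Longest shortest path: the largest eccentricity."""
    return max(eccentricities(graph).values())


def center(graph):
    """Vertices whose eccentricity is smallest."""
    ecc = eccentricities(graph)
    radius = min(ecc.values())
    return {v for v, e in ecc.items() if e == radius}


# A simple chain a-b-c-d-e, made up for illustration.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
print(diameter(chain))
print(center(chain))
```

In the chain, the middle vertex is the center and its eccentricity is half the diameter, which is one way to see how the two ideas relate.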
My recommendations at Amazon
And again…
Collaborative Filtering
- Goal: predict the utility of an item to a particular user based on a database of user profiles
  - User profiles contain user preference information
  - Preference may be explicit or implicit
    - Explicit means that a user votes explicitly on some scale
    - Implicit means that the system interprets user behavior or selections to impute a vote
- Problems
  - Missing data: voting is neither complete nor uniform
  - Preferences may change over time
  - Interface issues
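One simple way to make such a prediction is a similarity-weighted average over the other users' votes. The sketch below is only one of many collaborative filtering schemes and is not taken from any particular system. It assumes explicit votes on a positive scale (for example 1 to 5), measures similarity with the cosine over items both users voted on, and uses invented profiles.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity over the items both users voted on; 0.0 if none in common."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[item] * b[item] for item in common)
    norm_a = sqrt(sum(a[item] ** 2 for item in common))
    norm_b = sqrt(sum(b[item] ** 2 for item in common))
    return dot / (norm_a * norm_b)


def predict(profiles, user, item):
    """Similarity-weighted average of other users' votes on item, or None if no basis."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, votes in profiles.items():
        if other == user or item not in votes:
            continue
        weight = cosine_similarity(profiles[user], votes)
        weighted_sum += weight * votes[item]
        total_weight += weight
    return weighted_sum / total_weight if total_weight > 0 else None


# Hypothetical explicit votes on a 1-5 scale.
profiles = {
    "ann": {"jazz": 5, "rock": 1},
    "bob": {"jazz": 5, "rock": 1, "folk": 4},
    "cat": {"jazz": 1, "rock": 5, "folk": 2},
    "dan": {"pop": 3},
}
print(predict(profiles, "ann", "folk"))  # pulled toward bob's vote, since bob votes like ann
print(predict(profiles, "ann", "pop"))   # no overlapping votes with dan, so no prediction
```

The second call shows the missing-data problem directly: when profiles do not overlap, there is nothing to base a prediction on.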