Social Networks, CompSci 49s, 11/16/2006
Social Networks as a Foundation for Computer Science
Jeffrey Forbes

Presentation transcript:

A Future for Computer Science?

Is there a Science of Networks?
- From Erdos numbers to random graphs to Internet
  - From FOAF to Selfish Routing: apparent similarities between many human and technological systems & organization
  - Modeling, simulation, and hypotheses
  - Compelling concepts
    - Metaphor of viral spread
    - Properties of connectivity have qualitative and quantitative effects
  - Computer Science?
- From the facebook to tomogravity
  - How do we model networks, measure them, and reason about them?
  - What mathematics is necessary?
  - Will the real world intrude?

Physical Networks
- The Internet
  - Vertices: routers
  - Edges: physical connections
- Another layer of abstraction
  - Vertices: autonomous systems
  - Edges: peering agreements
  - Both a physical and business network
- Other examples
  - US Power Grid
  - Interdependence and August 2003 blackout

What does the Internet look like?

US Power Grid

Business & Economic Networks
- Example: eBay bidding
  - vertices: eBay users
  - links: represent bidder-seller or buyer-seller
  - fraud detection: bidding rings
- Example: corporate boards
  - vertices: corporations
  - links: between companies that share a board member
- Example: corporate partnerships
  - vertices: corporations
  - links: represent formal joint ventures
- Example: goods exchange networks
  - vertices: buyers and sellers of commodities
  - links: represent “permissible” transactions

Content Networks
- Example: Document similarity
  - Vertices: documents on web
  - Edges: weights defined by similarity
  - See TouchGraph GoogleBrowser
- Conceptual network: thesaurus
  - Vertices: words
  - Edges: synonym relationships

Enron

Social networks
- Example: Acquaintanceship networks
  - vertices: people in the world
  - links: have met in person and know last names
  - hard to measure
- Example: scientific collaboration
  - vertices: math and computer science researchers
  - links: between coauthors on a published paper
  - Erdos numbers: distance to Paul Erdos
  - Erdos was definitely a hub or connector; had 507 coauthors
- How do we navigate in such networks?
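An Erdos number is a shortest-path distance in the coauthorship graph, so it can be computed with breadth-first search. The sketch below is only an illustration: the function name `collaboration_distance` and the toy graph `coauthors` are hypothetical and do not come from the slides.

```python
from collections import deque


def collaboration_distance(coauthors, source, target):
    """Return the number of coauthorship links on a shortest path
    from source to target, or None if no path exists."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        author, dist = queue.popleft()
        for other in coauthors.get(author, ()):
            if other == target:
                return dist + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, dist + 1))
    return None


# Hypothetical toy data: each author maps to the set of their coauthors.
coauthors = {
    "Erdos": {"A", "B"},
    "A": {"Erdos", "C"},
    "B": {"Erdos"},
    "C": {"A", "D"},
    "D": {"C"},
    "E": set(),  # has never published with anyone in this graph
}

print(collaboration_distance(coauthors, "D", "Erdos"))
```

In this toy graph, author D reaches Erdos through C and A, while E has no path at all and gets `None`.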

Acquaintanceship & more

Network Models (Barabasi)
- Differences between Internet, Kazaa, Chord
  - Building, modeling, predicting
- Static networks, dynamic networks
  - Modeling and simulation
- Random and scale-free
  - Implications?
- Structure and evolution
  - Modeling via Touchgraph

Web-based social networks
- Myspace: 73,000,000
- Passion.com: 23,000,000
- Friendster: 21,000,000
- Black Planet: 17,000,000
- Facebook: 8,000,000
- Who’s using these, what are they doing, how often are they doing it, why are they doing it?

Golbeck’s Criteria
- Accessible over the web via a browser
- Users explicitly state relationships
  - Not mined or inferred
- Relationships visible and browsable by others
  - Reasons?
- Support for users to make connections
  - Simple HTML pages don’t suffice

CSE 112, Networked Life (UPenn)
- Find the person in Facebook with the most friends
  - Document your process
- Find the person with the fewest friends
  - What does this mean?
- Search for profiles with some phrase that yields matches
  - Graph degrees/friends, what is distribution?
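The exercise above comes down to computing each person's degree (number of friends) and then tallying how often each degree occurs. A minimal sketch, assuming the friendship data is already available as a dictionary; the `friends` data and the function name are hypothetical:

```python
from collections import Counter


def degree_distribution(friends):
    """Return (degree per person, histogram of degree -> number of people)."""
    degrees = {person: len(others) for person, others in friends.items()}
    return degrees, Counter(degrees.values())


# Hypothetical toy friendship graph (symmetric friend lists).
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann"},
    "cat": {"ann", "dan"},
    "dan": {"ann", "cat"},
}

degrees, histogram = degree_distribution(friends)
most_friends = max(degrees, key=degrees.get)
fewest_friends = min(degrees, key=degrees.get)
print(most_friends, fewest_friends, dict(histogram))
```

Plotting the histogram on log-log axes is one common way to check whether the distribution looks heavy-tailed.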

CompSci 1: Overview CS0
- Audioscrobbler and last.fm
  - Collaborative filtering
  - What is a neighbor?
  - What is the network?

What can we do with real data?
- How do we find a graph’s diameter?
  - This is the maximal shortest path between any pair of vertices
  - Can we do this in big graphs?
- What is the center of a graph?
  - From rumor mills to DDOS attacks
  - How is this related to diameter?
- Demo GUESS (as augmented at Duke)
  - IM data, Audioscrobbler data
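For an unweighted, connected graph, one straightforward way to get the diameter is to run breadth-first search from every vertex and take the largest distance found. The sketch below also reports a center under one common definition (the vertices whose greatest distance to any other vertex is smallest); the slides do not say which definition they intend, so that choice is an assumption, as are the function names and the toy graph.

```python
from collections import deque


def bfs_distances(graph, start):
    """Shortest-path distance (in edges) from start to every reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricities(graph):
    """Eccentricity of each vertex: distance to the vertex farthest from it."""
    ecc = {}
    for u in graph:
        dist = bfs_distances(graph, u)
        if len(dist) != len(graph):
            raise ValueError("graph is not connected")
        ecc[u] = max(dist.values())
    return ecc


def diameter(graph):
    """Largest shortest-path distance between any pair of vertices."""
    return max(eccentricities(graph).values())


def center(graph):
    """Vertices of minimum eccentricity (one common definition of the center)."""
    ecc = eccentricities(graph)
    radius = min(ecc.values())
    return sorted(u for u in ecc if ecc[u] == radius)


# Hypothetical toy graph: a path a - b - c - d - e.
path = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

print(diameter(path), center(path))
```

Each BFS costs time proportional to the number of vertices plus edges, so the all-pairs approach grows roughly quadratically in graph size. That is the practical obstacle behind the question about big graphs, where sampling start vertices is a common workaround.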

My recommendations at Amazon

And again…

How do search engines work?
- Hotbot, Yahoo, Alta Vista, Excite, …
- Inverted index with buckets of words
  - Insight: use a matrix to represent how many times a term appears in each page
  - Columns: pages; rows: terms
  - Problems?
- Return pages that have the keyword, but in what order?
  - Early solution: return the pages with the most occurrences of the term first
  - Problems?
  - Solution? Use the structure of the web to do the work for us
    - What did Google do?
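The inverted index and the "most occurrences first" ordering can be sketched in a few lines. This is a toy illustration, not how any particular engine worked; the page contents and function names are hypothetical. The index stores, for each term, the pages that contain it together with the count, which is a sparse form of the term-by-page matrix.

```python
from collections import Counter, defaultdict

# Hypothetical toy collection: page id -> page text.
pages = {
    "p1": "duke basketball duke",
    "p2": "duke computer science",
    "p3": "computer networks",
}


def build_index(pages):
    """Map each term to {page id: number of occurrences of the term}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][page_id] = count
    return index


def search(index, term):
    """Pages containing the term, most occurrences first (ties by page id)."""
    postings = index.get(term.lower(), {})
    return sorted(postings, key=lambda page_id: (-postings[page_id], page_id))


index = build_index(pages)
print(search(index, "duke"))
```

Ranking by raw counts is easy to game by repeating a keyword, which is one of the problems that motivates using link structure instead.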

Google’s PageRank
[Figure: link diagram connecting web sites xxx, yyyy, pdq, and a through g]
- Inlinks are “good” (recommendations)
- Inlinks from a “good” site are better than inlinks from a “bad” site
- but inlinks from sites with many outlinks are not as “good”...
- “Good” and “bad” are relative.

Google’s PageRank
[Figure: link diagram connecting web sites xxx, yyyy, pdq, and a through g]
- Imagine a “pagehopper” that always either follows a random link, or jumps to a random page

Google’s PageRank (Brin & Page,
[Figure: link diagram connecting web sites xxx, yyyy, pdq, and a through g]
- Imagine a “pagehopper” that always either follows a random link, or jumps to a random page
- PageRank ranks pages by the amount of time the pagehopper spends on a page
- or, if there were many pagehoppers, PageRank is the expected “crowd size”
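The pagehopper picture can be turned into a short power-iteration sketch: at each step a fraction of every page's score flows along its outlinks, and the rest is spread evenly to model the random jump. The damping value 0.85, the handling of pages without outlinks, the function name, and the three-page link graph are all assumptions made for illustration; the slides do not specify them.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate the long-run share of time a random pagehopper spends
    on each page. links maps every page to the list of pages it links to;
    every link target must also appear as a key."""
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Random-jump part: every page receives the same baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links[p]
            if targets:
                # Follow-a-link part: split this page's score over its outlinks.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Assumption: a page with no outlinks jumps to a random page.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Hypothetical link graph reusing the site names from the slide's figure.
links = {
    "xxx": ["yyyy", "pdq"],
    "yyyy": ["pdq"],
    "pdq": ["xxx"],
}

rank = pagerank(links)
print(sorted(rank, key=rank.get, reverse=True))
```

In this toy graph pdq has two inlinks and ends up ranked first, and xxx outranks yyyy because its single inlink comes from the highly ranked pdq, which passes on its whole score rather than splitting it.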

Collaborative Filtering
- Goal: predict the utility of an item to a particular user based on a database of user profiles
  - User profiles contain user preference information
  - Preference may be explicit or implicit
    - Explicit means that a user votes explicitly on some scale
    - Implicit means that the system interprets user behavior or selections to impute a vote
- Problems
  - Missing data: voting is neither complete nor uniform
  - Preferences may change over time
  - Interface issues

Memory-based methods
- Store all user votes and generalize from them to predict vote for new item
- Predicted vote of active user a for item j:
  - where there are n users with non-zero weights,
  - v_{i,j} is the vote of user i on item j,
  - κ is a normalizing factor,
  - w() is a weighting function between users
    - Distance metric
    - Correlation or similarity
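The prediction formula itself is not present in the transcript. A standard form that matches the symbols listed on the slide is the one given by Breese, Heckerman, and Kadie (1998); it is shown here as an assumption about what the slide contained, with \bar{v}_a and \bar{v}_i denoting the mean votes of users a and i:

```latex
p_{a,j} \;=\; \bar{v}_a \;+\; \kappa \sum_{i=1}^{n} w(a,i)\,\bigl(v_{i,j} - \bar{v}_i\bigr),
\qquad
\kappa \;=\; \frac{1}{\sum_{i=1}^{n} \lvert w(a,i) \rvert}
```

Here the normalizing factor is chosen so that the absolute values of the weights sum to one. Each neighbor contributes how far its vote on item j sits above or below its own average, scaled by how similar that neighbor is to the active user.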

Computing weights - Cosine Correlation
- In information retrieval, documents are represented as vectors of word frequencies
  - For CF, we treat preferences as vectors
    - Documents -> users
    - Word frequencies -> votes
- Similarity is then the cosine between two vectors
  - Dot product of the vectors divided by the product of their magnitudes
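The cosine weight can be sketched directly from that description. In this illustration each user's votes are a dictionary from item to vote, and an item the user has not voted on counts as zero; that treatment of missing votes, the function name, and the sample data are assumptions.

```python
import math


def cosine_similarity(votes_a, votes_i):
    """Cosine of the angle between two users' vote vectors.
    Items a user has not voted on are treated as zero."""
    common = set(votes_a) & set(votes_i)
    dot = sum(votes_a[j] * votes_i[j] for j in common)
    norm_a = math.sqrt(sum(v * v for v in votes_a.values()))
    norm_i = math.sqrt(sum(v * v for v in votes_i.values()))
    if norm_a == 0 or norm_i == 0:
        return 0.0  # a user with no non-zero votes is similar to nobody
    return dot / (norm_a * norm_i)


# Hypothetical votes: item -> rating.
alice = {"x": 3, "y": 4}
bob = {"x": 3}
carol = {"z": 5}

print(cosine_similarity(alice, bob))
```

Users with no items in common get a weight of zero, so they drop out of the weighted prediction.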

Computing weights - Pearson & Spearman correlation
- Pearson Correlation
  - First used for CF in GroupLens project [Resnick et al., 1994]
  - Relatively efficient to calculate incrementally
- Spearman Correlation
  - same as Pearson but calculations are done on the ranks of v_{a,j} and v_{i,j}
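Both correlations can be sketched with the standard library: Pearson works on the votes themselves, and Spearman applies the same calculation to their ranks. Restricting the comparison to items both users have voted on, giving tied votes their average rank, and returning 0.0 when the correlation is undefined are assumptions of this sketch, as are the function names and sample data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if undefined)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            result[k] = average_rank
        pos = end + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson computed on ranks instead of values."""
    return pearson(ranks(xs), ranks(ys))


def user_correlation(votes_a, votes_i, corr=pearson):
    """Correlation between two users over the items both have voted on."""
    common = sorted(set(votes_a) & set(votes_i))
    if len(common) < 2:
        return 0.0
    return corr([votes_a[j] for j in common], [votes_i[j] for j in common])


# Hypothetical votes: item -> rating.
active = {"m1": 5, "m2": 3, "m3": 1}
other = {"m1": 4, "m2": 2, "m3": 0, "m4": 5}

print(user_correlation(active, other))
print(user_correlation(active, other, corr=spearman))
```

Because Spearman only looks at ordering, two users who rank items identically get a weight of 1 even when one of them uses a compressed or stretched rating scale.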

Model-based methods
- Really what we want is the expected value of the user’s vote
  - Cluster Models
    - Users belong to certain classes in C with common tastes
    - Naive Bayes Formulation
    - Calculate Pr(v_i | C = c) from training set
  - Bayesian Network Models
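The slide names these quantities without writing them out. As a sketch following the formulation in Breese, Heckerman, and Kadie (1998), and assuming votes take integer values from 0 to m with I_a the set of items the active user has already voted on, the expected vote and the naive Bayes cluster model can be written as:

```latex
p_{a,j} \;=\; E\bigl(v_{a,j}\bigr)
\;=\; \sum_{k=0}^{m} k \,\Pr\bigl(v_{a,j} = k \,\big|\, v_{a,i},\ i \in I_a\bigr)

\Pr\bigl(C = c,\ v_1, \dots, v_n\bigr)
\;=\; \Pr(C = c)\,\prod_{i=1}^{n} \Pr\bigl(v_i \,\big|\, C = c\bigr)
```

The second line is the naive Bayes assumption: once a user's class is known, that user's votes on different items are treated as independent, so only Pr(C = c) and the per-item terms Pr(v_i | C = c) have to be estimated from the training set.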