Exploiting Social Networks for Internet Search
Alan Mislove, Krishna P. Gummadi, Peter Druschel (peerspective.mpi-sws.org)
By Raghuram Krishnamachari

Motivation
- The WWW, search engines, and social networking
- Hyperlinks: placed by human authors, used to index and rank pages
- Social networks
  - No study has examined how information is exchanged in them
  - Links are explicit and connect users, not content
  - Can search engines use these links?
- In this paper
  - Compare the mechanisms for publishing and locating content
  - Experiment: social network-based Web search
  - Challenges in leveraging social networks in the future

The Web versus Social Networks
- Publishing
  - Users place documents on a Web server
  - The author places hyperlinks on a Web page that refer to related pages
  - Links are placed to increase rank and promote indexing
- Locating
  - Web search engines employ sophisticated technologies
  - Google uses the hyperlink structure and query/page relevance
  - Limitations:
    - New pages: discovery/indexing, hyperlinking, link discovery
    - The number of links determines relevance, which reflects the interests and biases of the Web
    - Ignored: unlinked or private pages, and pages with insufficient relevance

The Web versus Social Networks
- Example: a user shares Web content with friends; the content is invisible to others; the content is now linked between users
- Publishing
  - Content is posted by the user and recommended by others
  - Links among users: directed (distinct) and undirected (mutual)
- Locating
  - Traversing the social network, keyword search, top-10 lists
  - Timely, relevant, and reliable textual and non-textual information can be found
  - Content is rated by consumers, not producers
  - Content is rated almost immediately; rating does not rely on discovery

Integrating Web search and social networks
- Problem
  - There is no unified tool for searching or finding content
  - Social network-based search is not used on the Web, and vice versa
- Questions
  - Can social network links be leveraged to improve search results?
  - What are the benefits of social network-based Web search?
- Solution
  - Conduct an experiment to answer these questions

PeerSpective: The experiment
- The Web content of 10 students/researchers is shared
- An HTTP proxy indexes all URLs visited by a user
- When a Google search (query) is performed:
  - The local proxy forwards the query to Google and to all peer proxies
  - All proxies execute the query on their local index and return results
  - Results are collated and presented alongside the Google results
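The query flow on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the PeerSpective implementation: the `PeerProxy` class, the in-memory keyword index, and the rule of ranking a URL by how many peers returned it are all assumptions made for the example. The real system used Lucene for local text search and FreePastry for the overlay, and the forwarding of the query to Google is left out here.

```python
from collections import defaultdict


class PeerProxy:
    """Hypothetical stand-in for one user's proxy and its local index."""

    def __init__(self, name):
        self.name = name
        self.index = defaultdict(set)  # term -> set of visited URLs

    def record_visit(self, url, text):
        # Index every whitespace-separated term of a visited page.
        for term in text.lower().split():
            self.index[term].add(url)

    def search(self, query):
        # Return URLs whose indexed text contains all query terms.
        terms = query.lower().split()
        if not terms:
            return set()
        hits = [self.index.get(term, set()) for term in terms]
        return set.intersection(*hits)


def peer_search(query, peers):
    """Send the query to every peer and collate the results.

    Assumption: URLs are ordered by the number of peers that
    returned them, with ties broken alphabetically.
    """
    votes = defaultdict(int)
    for peer in peers:
        for url in peer.search(query):
            votes[url] += 1
    return sorted(votes, key=lambda url: (-votes[url], url))
```

In a deployment like the one described, the collated list would be shown next to the ordinary Google results rather than replacing them.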

PeerSpective: Measurements & Experiences
- In a month-long experimental deployment (10 users):
  - 439,384 HTTP requests
  - 198,492 distinct URLs (45%)
  - 113,800 HTML and PDF requests (25.9%)
- The user base is small, with highly specialized interests
- The results may not represent a large, diverse group
- Technology
  - Local text search engine: Lucene
  - Local peer-to-peer overlay engine: FreePastry

Limits of hyperlink-based search
- Web search engines index only well-linked content
- Limit: URLs visited by users but not indexed by Google
- Reasons why a page might not be indexed:
  - The page could be too new (blogs, news)
  - The page could be in the deep web and not well connected
  - The page could be in the dark web (private pages)

PeerSpective versus Google
- For each HTTP request:
  - Does Google's index contain this URL?
  - Has some peer in PeerSpective viewed this URL?
- Static HTML content (no GET/POST)
  - 6,679 requests (<6%) for 3,987 URLs (2%)
- Google index
  - Covers 62.5% of the requests and 68.1% of the distinct URLs
  - About one third of all URL requests cannot be retrieved by Google
- PeerSpective index
  - Covers 30.4% of requested URLs
  - Achieves about half of Google's coverage with a much smaller index
  - 13.3% of the URLs were in PeerSpective but not in Google's index
  - 19.5% improvement by PeerSpective compared to Google search
- Which documents interest our users, but not Google?
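The 19.5% figure is consistent with dividing the URLs found only in PeerSpective (13.3%) by Google's coverage of distinct URLs (68.1%). The slide does not state the formula, so the short Python check below shows one reading that reproduces the number, not a confirmed derivation. The function name `relative_gain` is made up for the example.

```python
def relative_gain(base_pct, extra_pct):
    """Relative improvement (in percent) from adding extra_pct on top of base_pct."""
    return 100.0 * extra_pct / base_pct


google_url_coverage = 68.1  # % of distinct URLs in Google's index (from the slide)
peerspective_only = 13.3    # % of URLs in PeerSpective but not in Google (from the slide)

gain = relative_gain(google_url_coverage, peerspective_only)
combined = google_url_coverage + peerspective_only

print(f"relative gain: {gain:.1f}%")
print(f"combined coverage: {combined:.1f}%")
```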

Benefits of social network-based search
- Search engines have to rank pages
  - Users rarely go beyond the first 20 search results
- 1,730 Google searches were observed
  - First page results: Google 9.45, PeerSpective 5.17
  - 1,079 (62.3%) resulted in clicks on one or more results
  - 307 (17.7%) were followed by a refined query
  - Users gave up 344 (19.8%) of the time
  - 933 (86.5%) of clicked results were returned only by Google
  - 83 (7.7%) of clicked results were returned only by PeerSpective
  - 63 (5.7%) of clicked results were returned by both
  - 9% improvement in search result clicks over Google alone
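The counts on this slide can be checked with a small Python sketch. The dictionary keys and the `shares` helper are names invented for the example. The recomputed percentages may differ from the slide in the last digit, and the slide does not say which ratio gives the 9% improvement. The ratio 83/933 (PeerSpective-only clicks over Google-only clicks) comes close, so it is shown here as an assumption.

```python
def shares(counts):
    """Return each count as a percentage of the total."""
    total = sum(counts.values())
    return {key: 100.0 * value / total for key, value in counts.items()}


# Outcomes of the 1,730 observed Google searches (counts from the slide).
outcomes = {"clicked": 1079, "refined": 307, "gave_up": 344}

# Source of each clicked result (counts from the slide).
clicked = {"google_only": 933, "peerspective_only": 83, "both": 63}

for label, pct in shares(outcomes).items():
    print(f"{label}: {pct:.1f}%")
for label, pct in shares(clicked).items():
    print(f"{label}: {pct:.1f}%")

# Assumed reading of the "9% improvement": extra clicks relative to Google-only clicks.
improvement = 100.0 * clicked["peerspective_only"] / clicked["google_only"]
print(f"improvement over Google alone: {improvement:.1f}%")
```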

How PeerSpective outperforms Google
- Disambiguation
  - Search terms have multiple meanings depending on the context
- Ranking
  - Search engine: top rank; social network: nearby pages
- Serendipity
  - Making unexpected or fortunate discoveries

Opportunities and Challenges
- Online social networking enables new forms of information exchange
  - Users can very easily and conveniently publish information
  - Makes it possible to locate and access word-of-mouth ("WOM") information
  - Organizes information according to the tastes and preferences of smaller groups of individuals
- Opportunities and challenges
  - Privacy: willingness of individuals to share information
  - Membership and clustering of social networks
  - Content rating and ranking (page rank, views)
  - System architecture (centralized or distributed)

Thank You