Squirrel: A peer-to-peer web cache Sitaram Iyer Joint work with Ant Rowstron (MSRC) and Peter Druschel.


Peer-to-peer Computing
Decentralize a distributed protocol:
– Scalable
– Self-organizing
– Fault tolerant
– Load balanced
Not automatic!!

Web Caching
Goals: reduce 1. latency, 2. external bandwidth, 3. server load.
Deployed at ISPs, corporate network boundaries, etc.
Cooperative Web Caching: a group of web caches tied together and acting as one web cache.

Web Cache
[Figure: browsers with their own browser caches on the LAN share a centralized web cache, which sits between the LAN and the web server on the Internet.]

Decentralized Web Cache
[Figure: browser caches on the LAN and the web server on the Internet, with no centralized cache in between.]
Why? How?

Why peer-to-peer?
1. Cost of dedicated web cache: no additional hardware
2. Administrative costs: self-organizing
3. Scaling needs upgrading: resources grow with clients
4. Single point of failure: fault-tolerant by design

Setting
Corporate LAN
,000 desktop machines
Single physical location
Each node runs an instance of Squirrel and sets it as the browser’s proxy

Pastry
Peer-to-peer object location and routing substrate
Distributed Hash Table: reliably maps an object key to a live node
Routes in $\log_{2^b} N$ steps (e.g. 3-4 steps for 100,000 nodes, with $2^b = 16$)
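The hop count quoted for Pastry can be checked with a quick calculation. The sketch below assumes a routing base of 16 (that is, $2^b$ with $b = 4$). The function name `expected_hops` is illustrative and is not part of Pastry.

```python
import math

def expected_hops(num_nodes: int, base: int = 16) -> float:
    """Approximate route length: logarithm of num_nodes in the routing base.

    The base of 16 is an assumption (2**b with b = 4).
    """
    return math.log(num_nodes) / math.log(base)

if __name__ == "__main__":
    # Print the approximate hop count for a few LAN sizes.
    for n in (100, 1_000, 100_000):
        print(f"{n:>7} nodes: about {expected_hops(n):.2f} hops")
```

For 100,000 nodes this gives a value slightly above four, in line with the slide's rough figure.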

Home-store model
[Figure: client and home node on the LAN, with the Internet beyond; the hash of the URL selects the home node.]

Home-store model
[Figure: client and home node.]
… that’s how it works!
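The home-store request path can be sketched in a few lines. This is a simplified illustration, not the authors' implementation. It assumes SHA-1 as the URL hash, picks the home as the node whose identifier is closest to the URL key by direct search (standing in for Pastry routing), and ignores cache eviction and freshness checks. `HomeStoreLAN`, `Node`, and `fetch_origin` are hypothetical names.

```python
import hashlib

ID_SPACE = 2 ** 160  # size of the SHA-1 output space (assumed identifier space)

def key_of(text: str) -> int:
    """Hash a URL or node name to an integer key (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")

def ring_distance(a: int, b: int) -> int:
    """Distance between two keys on a circular identifier space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)

class Node:
    def __init__(self, name: str):
        self.name = name
        self.node_id = key_of(name)
        self.cache = {}  # url -> object body; no eviction in this sketch

class HomeStoreLAN:
    def __init__(self, names, fetch_origin):
        self.nodes = [Node(n) for n in names]
        self.fetch_origin = fetch_origin  # hypothetical callable: url -> body

    def home_of(self, url: str) -> Node:
        # Direct search replaces Pastry routing in this sketch.
        k = key_of(url)
        return min(self.nodes, key=lambda n: ring_distance(n.node_id, k))

    def get(self, client: Node, url: str):
        # 1. Check the client's own cache first.
        if url in client.cache:
            return client.cache[url], "local hit"
        # 2. Ask the home node, which also stores the object.
        home = self.home_of(url)
        if url in home.cache:
            source = "home hit"
        else:
            # 3. On a miss, the home fetches from the origin and keeps a copy.
            home.cache[url] = self.fetch_origin(url)
            source = "origin"
        client.cache[url] = home.cache[url]
        return client.cache[url], source
```

A second client asking for the same URL is served from the home node on the LAN, without another trip to the origin server.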

Directory model Client nodes always store objects in local caches. Main difference between the two schemes: whether the home node also stores the object. In the directory model, it only stores pointers to recent clients, and forwards requests to them.

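Directory model
[Figure: client and home node on the LAN, with the Internet beyond.]

Directory model
[Figure: client, delegate, and home node; the home forwards the request to a random entry in its directory.]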
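The directory model can be sketched the same way. Here the home node keeps only pointers to recent clients and forwards each request to a randomly chosen one. This is an illustrative simplification: the directory bound `MAX_POINTERS`, the SHA-1 hash, and all class and function names are assumptions, and caches never evict, so a pointer always leads to a node that still holds the object.

```python
import hashlib
import random
from collections import defaultdict, deque

ID_SPACE = 2 ** 160   # size of the SHA-1 output space (assumed identifier space)
MAX_POINTERS = 4      # hypothetical bound on directory entries per object

def key_of(text: str) -> int:
    """Hash a URL or node name to an integer key (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")

def ring_distance(a: int, b: int) -> int:
    d = abs(a - b)
    return min(d, ID_SPACE - d)

class DirNode:
    def __init__(self, name: str):
        self.name = name
        self.node_id = key_of(name)
        self.cache = {}  # objects live only in client caches
        # url -> recent clients (delegates) known to hold the object
        self.directory = defaultdict(lambda: deque(maxlen=MAX_POINTERS))

class DirectoryLAN:
    def __init__(self, names, fetch_origin, seed=0):
        self.nodes = [DirNode(n) for n in names]
        self.fetch_origin = fetch_origin  # hypothetical callable: url -> body
        self.rng = random.Random(seed)

    def home_of(self, url: str) -> DirNode:
        # Direct search replaces Pastry routing in this sketch.
        k = key_of(url)
        return min(self.nodes, key=lambda n: ring_distance(n.node_id, k))

    def get(self, client: DirNode, url: str):
        if url in client.cache:
            return client.cache[url], "local hit"
        delegates = self.home_of(url).directory[url]
        if delegates:
            # The home forwards the request to a random directory entry.
            delegate = self.rng.choice(list(delegates))
            body, source = delegate.cache[url], "delegate"
        else:
            # No directory yet: the object comes from the origin server.
            body, source = self.fetch_origin(url), "origin"
        client.cache[url] = body
        delegates.append(client)  # the client becomes a recent delegate
        return body, source
```

Compared with the home-store sketch, the home node here never holds the object itself, only the pointers.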

(skip) Full directory protocol
[Figure: message flow among client, home (directory), delegate, other node, and origin server. Legible labels: "req", "cGET req", "no dir, go to origin", "not-modified", "object", "delegate".]

Recap
Two endpoints of the design space, based on the choice of storage location.
At first sight, both seem to do about as well (e.g. hit ratio, latency).

Quirk
Consider a
– Web page with many images, or
– Heavily browsing node
In the Directory scheme, many home nodes point to one delegate.
Home-store: natural load balancing.
Next: evaluation on trace-based workloads.
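The imbalance can be illustrated with a small calculation: after one client has fetched a page with many images, count which node serves each image to the next client. This is a hedged sketch under assumptions: SHA-1 as the URL hash, a direct closest-identifier search instead of Pastry routing, and made-up node names and URLs.

```python
import hashlib
from collections import Counter

ID_SPACE = 2 ** 160  # size of the SHA-1 output space (assumed identifier space)

def key_of(text: str) -> int:
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")

def ring_distance(a: int, b: int) -> int:
    d = abs(a - b)
    return min(d, ID_SPACE - d)

def home_of(url: str, nodes):
    """Node whose identifier is closest to the URL key."""
    k = key_of(url)
    return min(nodes, key=lambda n: ring_distance(key_of(n), k))

def serving_load(urls, nodes, scheme: str, first_client: str) -> Counter:
    """Objects served per node to the second client requesting the page."""
    if scheme == "home-store":
        # Each image is served by its own home node.
        return Counter(home_of(u, nodes) for u in urls)
    if scheme == "directory":
        # Every home holds a single pointer, to the first client.
        return Counter(first_client for _ in urls)
    raise ValueError(f"unknown scheme: {scheme}")

if __name__ == "__main__":
    nodes = [f"pc{i}" for i in range(100)]
    images = [f"http://example.com/img{i}.png" for i in range(50)]
    for scheme in ("home-store", "directory"):
        load = serving_load(images, nodes, scheme, first_client="pc0")
        print(scheme, "busiest node serves", max(load.values()),
              "of", len(images), "objects")
```

In the directory case a single delegate serves every image, while in the home-store case the same requests spread over many home nodes.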

Trace characteristics
                               Redmond          Cambridge
Total duration                 1 day            31 days
Number of clients              36,
Number of HTTP requests        16.41 million    0.971 million
Peak request rate              606 req/sec      186 req/sec
Number of objects              5.13 million     0.469 million
Number of cacheable objects    2.56 million     0.226 million
Mean cacheable object reuse    5.4 times        3.22 times

Total external bandwidth (Redmond)
[Figure: total external bandwidth in GB (lower is better) versus per-node cache size in MB, for Directory, Home-store, No web cache, and Centralized cache.]

Total external bandwidth (Cambridge)
[Figure: total external bandwidth in GB (lower is better) versus per-node cache size in MB, for Directory, Home-store, No web cache, and Centralized cache.]

LAN Hops Redmond

LAN Hops (Cambridge)
[Figure: fraction of cacheable requests (0% to 100%) versus total hops within the LAN, for Centralized, Home-store, and Directory.]

Load in requests per sec (Redmond)
[Figure: number of such seconds versus max objects served per node per second, for Home-store and Directory.]

Load in requests per sec (Cambridge)
[Figure: number of such seconds versus max objects served per node per second, for Home-store and Directory.]

Load in requests per min (Redmond)
[Figure: number of such minutes versus max objects served per node per minute, for Home-store and Directory.]

Load in requests per min (Cambridge)
[Figure: number of such minutes versus max objects served per node per minute, for Home-store and Directory.]

Conclusion
Possible to decentralize web caching
Performance comparable to a centralized cache
Better in terms of cost, administration, scalability, and fault tolerance

(backup) Storage utilization, Redmond (Home-store / Directory)
Total: MB / 61652 MB
Mean per-node: 2.6 MB / 1.6 MB
Max per-node: 1664 MB

(backup) Fault tolerance
             Home-store                     Directory
Equations    Mean $H/O$                     Mean $(H+S)/O$
             Max $H_{max}/O$                Max $\max(H_{max}, S_{max})/O$
Redmond      Mean %, Max %                  Mean 0.198%, Max 1.5%
Cambridge    Mean 0.95%, Max 3.34%          Mean 1.68%, Max 12.4%

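(backup) Full home-store protocol
[Figure: message flow among client, home, other node, and origin server. The client sends a request to the home over the LAN. The home replies with (a) the object or not-modified from the home, or (b) sends a request to the origin over the WAN and returns the object or not-modified from the origin.]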

(backup) Full directory protocol
[Figure: message flow among client, home (directory), delegate, other node, and origin server. Legible labels: "req", "cGET req", "no dir, go to origin", "not-modified", "object", "delegate".]