COS 518: Advanced Computer Systems
Lecture 9: Caching 50.5*
Michael Freedman
* Half of 101
Basic caching rule
Tradeoff
- Fast: costly, small, close
- Slow: cheap, large, far
Based on two assumptions
- Temporal locality: will be accessed again soon
- Spatial locality: nearby data will be accessed soon
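A minimal sketch of how temporal locality is exploited in software: a least-recently-used (LRU) cache keeps entries that were touched recently, on the assumption they will be touched again soon. (LRU is one common policy, not the only one; the capacity is an arbitrary parameter.)

```python
from collections import OrderedDict

class LRUCache:
    """Keeps the most recently used entries, exploiting temporal locality."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```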
Multi-level caching in hardware
[diagram of the memory hierarchy; source: https://en.wikipedia.org/wiki/Cache_memory]
Caching in distributed systems
Web Caching and Zipf-like Distributions: Evidence and Implications
Lee Breslau, Pei Cao, Li Fan, Graham Phillips, Scott Shenker
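An illustrative simulation (not from the paper) of why Zipf-like popularity favors caching: with exponent alpha = 1, a cache holding just the top 1% of items serves well over half of the requests. The parameters here are arbitrary choices.

```python
import random

# Zipf-like popularity over N items: P(rank i) proportional to 1 / i^alpha
N, alpha = 100_000, 1.0
weights = [1.0 / (i ** alpha) for i in range(1, N + 1)]

# Cache only the 1% most popular items and measure the hit ratio.
cached = set(range(1000))
requests = random.choices(range(N), weights=weights, k=100_000)
hits = sum(1 for r in requests if r in cached)
print(f"hit ratio with 1% cache: {hits / len(requests):.2f}")
```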
Caching common in distributed systems
Web
- Web proxies at edge of enterprise networks
- “Server surrogates” in CDNs downstream of origin
DNS
- Caching popular NS, A records
File sharing
- Gnutella & flooding-based p2p networks
Caching within datacenter systems
[diagram: load balancers (identical) → front-end web servers (identical) → DB / backend (partitioned)]
Caching within datacenter systems
[diagram: same tiers, now with a cache added]
Caching within datacenter systems: look-through cache
[diagram: partitioned cache inserted between the front-end web servers and the partitioned DB / backend]
Caching within datacenter systems: look-aside cache
[diagram: partitioned cache beside the request path; front-end web servers query it directly, falling back to the partitioned DB / backend on a miss]
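A minimal sketch of the look-aside read path from a front-end server's perspective, assuming hypothetical `cache` and `db` client objects: unlike look-through, the application handles the miss and populates the cache itself.

```python
def lookaside_get(cache, db, key):
    """Look-aside read: the application, not the cache, handles misses."""
    value = cache.get(key)
    if value is not None:
        return value              # cache hit
    value = db.get(key)           # miss: fetch from the backend...
    if value is not None:
        cache.put(key, value)     # ...and populate the cache ourselves
    return value
```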
Cache management
Write-through
- Data written simultaneously to cache and storage
Write-back
- Data updated only in cache
- On cache eviction, written “back” to storage
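A sketch contrasting the two policies, assuming hypothetical `cache` and `storage` objects with `get`/`put` interfaces; a production write-back cache would integrate dirty tracking with its eviction policy rather than exposing `evict` directly.

```python
class WriteThroughCache:
    def __init__(self, cache, storage):
        self.cache, self.storage = cache, storage

    def write(self, key, value):
        # Update cache and storage together: storage is never stale.
        self.cache.put(key, value)
        self.storage.put(key, value)

class WriteBackCache:
    def __init__(self, cache, storage):
        self.cache, self.storage = cache, storage
        self.dirty = set()

    def write(self, key, value):
        # Update only the cache; remember that storage is now stale.
        self.cache.put(key, value)
        self.dirty.add(key)

    def evict(self, key):
        # On eviction, write dirty data "back" to storage.
        if key in self.dirty:
            self.storage.put(key, self.cache.get(key))
            self.dirty.discard(key)
```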
New system / hardware architectures: New opportunities for caching
Be Fast, Cheap and in Control with SwitchKV
Xiaozhou Li, Raghav Sethi, Michael Kaminsky, David G. Andersen, Michael J. Freedman
NSDI 2016
Traditional architectures: high overhead for skewed/dynamic workloads
[diagram: look-aside — clients query the cache, then the backends on a miss; look-through — clients go through the cache (load balancer), a failure point, which handles misses against the backends]
- Cache must process all queries and handle misses
- In our case, cache is small and hit ratio could be low
  - Throughput is bounded by the cache I/O
  - High latency for queries on uncached keys
SwitchKV: content-aware routing
[diagram: clients → OpenFlow switches (programmed by a controller) → cache and backends]
- Switches route requests directly to the appropriate nodes
- Latency can be minimized for all queries
- Throughput can scale out with # of backends
- Availability would not be affected by cache node failures
Exploit SDN and switch hardware
Clients encode key information in packet headers
- Encode key hash in MAC for read queries
- Encode destination backend ID in IP for all queries
Switches maintain forwarding rules and route query packets
[diagram: packet in → L2 table (exact-match rule per cached key; on hit, packet out to the cache) → on miss, TCAM table (match rule per physical machine) → packet out]
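A sketch of the client-side header encoding, assuming a SHA-1 key hash and a 10.0.x.y backend addressing scheme; both are illustrative assumptions, not SwitchKV's actual wire format.

```python
import hashlib

def read_query_mac(key: bytes) -> str:
    """Encode a key hash in the destination MAC so switches can
    exact-match cached keys in the L2 table."""
    digest = hashlib.sha1(key).digest()[:6]
    # Set the locally administered bit so the MAC cannot collide with
    # a real NIC address (an assumption, not SwitchKV's format).
    first = digest[0] | 0x02
    return ":".join(f"{b:02x}" for b in (first,) + tuple(digest[1:]))

def query_ip_for_backend(backend_id: int) -> str:
    """Encode the destination backend ID in the IP address
    (hypothetical 10.0.x.y addressing scheme)."""
    return f"10.0.{backend_id >> 8}.{backend_id & 0xff}"

print(read_query_mac(b"user:1234"), query_ip_for_backend(42))
```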
Keep cache and switch rules updated
New challenges for cache updates
- Only cache the hottest O(n log n) items
- Limited switch rule update rate
Goal: react quickly to workload changes with minimal updates
[diagram: backends send periodic top-k <key, load> lists to the controller and push bursty hot <key, value> items to the cache instantly; the controller issues switch rule updates; the cache fetches <key, value> from the backends]
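A sketch of the periodic top-k <key, load> report a backend might compute, assuming simple per-key counters; the value of k and the window-reset policy are arbitrary choices here, not the paper's mechanism.

```python
import heapq
from collections import Counter

request_counts = Counter()   # per-key load counters, bumped on each query

def record_query(key):
    request_counts[key] += 1

def topk_report(k=16):
    """Periodic <key, load> list sent to the controller, which decides
    which keys the cache should fetch and which switch rules to update."""
    report = heapq.nlargest(k, request_counts.items(), key=lambda kv: kv[1])
    request_counts.clear()   # start a fresh window (one possible policy)
    return report
```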
Wednesday: Welcome to BIG DATA