Group Based Management of Distributed File Caches

Presentation transcript:

Group Based Management of Distributed File Caches
Amer, Long, Burns

Dynamic File Grouping
Prefetching vs. automated grouping: “Reducing Access Latency”
Implicit prefetching
Increasing retention priority
Modelling that allows overlapping partitions
Why? Scenarios.

Success and accuracy depend on the model and on the strength of the relationships.
Edge priority
Predicting file access behavior: frequency or recency? Location? A single file or a group of files? Input data?

Aggregating Cache
Hot data items, where access costs are lower.
Tracking the sequence of accesses, not the time.
Successors: immediate and transitive.
Group of g files for retrieval.
Server maintains immediate successors.
Client then uses LRU replacement in its cache: the requested file goes at the front, followed by the other g - 1 group members.
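A minimal sketch of the aggregating cache idea on this slide, not the authors' implementation. It assumes the server keeps only the most recent immediate successor of each file and builds a group by following that chain; the class and method names (`SuccessorTable`, `GroupLRUCache`, `access`) are hypothetical.

```python
from collections import OrderedDict


class SuccessorTable:
    """Server side: remember the most recent immediate successor of each file."""

    def __init__(self):
        self.successor = {}   # file name -> file accessed right after it
        self.last = None      # previously accessed file

    def record(self, name):
        # Track the order of accesses only, never timestamps.
        if self.last is not None and self.last != name:
            self.successor[self.last] = name
        self.last = name

    def group(self, name, g):
        """Return `name` plus up to g - 1 transitive successors, without repeats."""
        members = [name]
        current = name
        while len(members) < g:
            nxt = self.successor.get(current)
            if nxt is None or nxt in members:
                break
            members.append(nxt)
            current = nxt
        return members


class GroupLRUCache:
    """Client side: LRU cache that fetches a whole group on a miss."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # last entry = most recently used

    def access(self, name, table, g):
        table.record(name)
        hit = name in self.items
        if hit:
            self.items.move_to_end(name)
        else:
            # Insert successors first so the requested file ends up at the
            # front (most recently used), followed by its g - 1 companions.
            for member in reversed(table.group(name, g)):
                self.items[member] = True
                self.items.move_to_end(member)
            while len(self.items) > self.capacity:
                self.items.popitem(last=False)  # evict least recently used
        return hit
```

With a table that has already seen A, B, C in order, a miss on A with g = 3 brings B and C into the client cache as well, so the following requests for B and C are hits.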

Client side: demand fetches, cache hit ratios, miss rates.
Server side: comparing LRU, LFU, and Dgrouping.
Access statistics are piggybacked on client file requests.
Client cache -> server cache: drop in hit rate for the server cache.
Interdependency among file access events.
Comparisons based on recency vs. frequency for maintaining successors.

Associating files with successors: sequences or single files?
Example: in CAB, CDB, … the successors of C are A and D.
Tracking sequences: more metadata, reduced likelihood of repeated successors.
Single file: greater predictability.
The ideal may be a combination of frequency and recency.
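The recency vs. frequency choice for single-file successors can be illustrated with a short sketch. The function name `successor_predictors` and the sample trace are assumptions made for illustration; they do not come from the slides.

```python
from collections import Counter, defaultdict


def successor_predictors(trace):
    """Build two single-file successor predictors from an access trace.

    Returns (most_recent, most_frequent): for each file, the successor seen
    last and the successor seen most often.
    """
    most_recent = {}
    counts = defaultdict(Counter)
    for current, following in zip(trace, trace[1:]):
        most_recent[current] = following   # recency: overwrite on every access
        counts[current][following] += 1    # frequency: count every occurrence
    most_frequent = {f: c.most_common(1)[0][0] for f, c in counts.items()}
    return most_recent, most_frequent


# Hypothetical trace: C is followed by A twice, then by D once.
recent, frequent = successor_predictors("CABCABCDB")
```

For this trace the two predictors disagree about C: recency picks D (the latest successor) while frequency picks A (the most common one), which is the kind of case where a combined policy might help.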

Efficient Massive Sharing of Content Among Peers Triantafillou, Xiruhaki, Koubarakis

Goal is to achieve: global load balancing, short response time.
Exploit semantics of documents, construct clusters.
Assumptions: known popularity; documents accompanied by keywords.

Goal is then to achieve inter- and intra-cluster load balancing.
Idea is to impose order in chaos!
Each document is associated with a popularity.
Requests can either retrieve or publish.

Processing:
Obtain the semantic category from the keywords in the request.
Associate categories with clusters using metadata present locally.
Forward the request to the cluster.
Intra-cluster load balancing: forward each request to a random node in the cluster.
Response time is bounded by the number of nodes in the cluster.
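The processing steps above can be sketched as follows. This is an illustration under assumptions: the three lookup tables stand in for the locally held metadata, and all names (`route_request`, `keyword_to_category`, `category_to_cluster`, `cluster_nodes`) are hypothetical.

```python
import random


def route_request(keywords, keyword_to_category, category_to_cluster,
                  cluster_nodes, rng=random):
    """Map a request to (category, cluster, node), or None if no keyword matches."""
    # Step 1: obtain the semantic category from the request keywords.
    category = next(
        (keyword_to_category[k] for k in keywords if k in keyword_to_category),
        None,
    )
    if category is None:
        return None
    # Step 2: associate the category with a cluster using local metadata.
    cluster = category_to_cluster[category]
    # Step 3: intra-cluster load balancing, pick a random node each time.
    node = rng.choice(cluster_nodes[cluster])
    return category, cluster, node


# Hypothetical metadata for two categories and two clusters.
keyword_to_category = {"jazz": "music", "goal": "sports"}
category_to_cluster = {"music": 0, "sports": 1}
cluster_nodes = {0: ["n1", "n2"], 1: ["n3", "n4", "n5"]}
result = route_request(["live", "jazz"], keyword_to_category,
                       category_to_cluster, cluster_nodes)
```

Choosing a random node spreads requests evenly in expectation without any coordination between nodes.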

Inter-cluster load balancing: NP-complete.
Assumptions: all peers have the same capacity; enough storage space to store the documents of contributed categories.
Q: Partition N into k clusters such that:
If two documents belong to the same category, then their host nodes also belong to the same cluster.
Clusters have equal normalized probabilities.

Partitioning into clusters with nearly equal normalized probabilities.
Using a greedy algorithm, MinDiff:
All clusters are initially empty.
For each semantic category, assign it to the cluster that minimizes the total difference of popularities among clusters.
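A rough sketch of a MinDiff-style greedy assignment, assuming the "total difference" is the sum of absolute pairwise differences between cluster popularities and that categories are processed in the order given. The slide specifies neither detail, and normalization is omitted here, so treat this as one possible reading rather than the authors' algorithm. The names `total_difference` and `min_diff` are hypothetical.

```python
from itertools import combinations


def total_difference(loads):
    """Sum of absolute pairwise differences between cluster popularities."""
    return sum(abs(a - b) for a, b in combinations(loads, 2))


def min_diff(category_popularity, k):
    """Greedily assign each category to one of k initially empty clusters."""
    clusters = [[] for _ in range(k)]
    loads = [0.0] * k
    for category, popularity in category_popularity.items():
        # Try the category in every cluster and keep the placement that
        # leaves the cluster popularities closest to one another.
        best = min(
            range(k),
            key=lambda i: total_difference(
                loads[:i] + [loads[i] + popularity] + loads[i + 1:]
            ),
        )
        clusters[best].append(category)
        loads[best] += popularity
    return clusters, loads


# Hypothetical popularities (e.g. request counts) for four categories.
clusters, loads = min_diff({"a": 4, "b": 3, "c": 2, "d": 1}, k=2)
```

Each category lands in exactly one cluster, which matches the constraint that documents of the same category are hosted within the same cluster.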

Discussions