Published by Robyn McKinney. Modified over 9 years ago.
1
Peer-to-Peer Supported Cache System for File Transfer
2003.8.28
Joonbok Lee, KAIST
jblee@cosmos.kaist.ac.kr
2
Contents
1. Motivation
2. Problem Statement
3. Related Work
4. Approach
5. Simulation
6. Conclusion
7. Reference
3
1. Motivation
► KAIST NetFlow Measurement (2002.10.4)
  Analysis of the flow data of the KAIST border router.
► Some findings:
  1) FTP consumes about as much bandwidth as HTTP.
  2) 78% of the FTP traffic is due to files larger than 10MB.
Fig 1. The byte ratio in terms of protocols
Fig 2. Cumulative distribution function of the files transferred by FTP and HTTP (marker at 10MB)
4
2. Problem Statement
► Non-negligible access to large multimedia data. [Jung00]
► FTP traffic: 17% of total traffic.
  78% of it is due to files larger than 10MB.
  11% of them failed during transfer.
► The large files transferred by FTP generate much traffic, and many of them take a long time.
► To solve this problem, we propose an HTTP/FTP proxy cache that is scalable in terms of bandwidth and storage.
5
3. Related Work
► Research on the transfer of large files:
  RepliCache: A New Approach to Scalable Networking Storage System for Large Objects [Jung97]
  Proactive Web caching with cumulative prefetching [Jung00]
► Research on scalable architectures:
  Squirrel: A decentralized peer-to-peer web cache [Iyer02]
  Peer-to-Peer Caching Scheme to Address Flash Crowds [Stading02]
6
4. Approach
4.1 Motivation
4.2 Cache with Peer-to-Peer Storage
4.3 Model
4.4 Detail Design
7
4.1 Motivation
► Peer-to-Peer Architecture as a Cache
  Scalability (bandwidth, computing power and storage)
  Cost
  Overhead (to find objects and to keep the system running)
► Latency
  One of the important metrics of cache performance.
  Latency = lookup time + delivery time
  Delivery time depends on the file size.
  Small files: the lookup time dominates.
  Large files: the delivery time dominates.
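The latency split on this slide can be illustrated with a small calculation. This is only a sketch: the 100Mbps figure is the local network bandwidth assumed in the simulation slides, while the 50 ms lookup time and the names `latency` and `dominant_term` are hypothetical values chosen for illustration.

```python
LOCAL_BANDWIDTH_BPS = 100_000_000  # 100Mbps local network (from the simulation assumption)
ASSUMED_LOOKUP_S = 0.05            # hypothetical lookup time, not taken from the slides


def latency(size_bytes, lookup_s=ASSUMED_LOOKUP_S, bandwidth_bps=LOCAL_BANDWIDTH_BPS):
    """Return (lookup time, delivery time, total latency) in seconds."""
    delivery_s = size_bytes * 8 / bandwidth_bps  # bytes -> bits, divided by link speed
    return lookup_s, delivery_s, lookup_s + delivery_s


def dominant_term(size_bytes, **kwargs):
    """Name the larger of the two latency components for a file of this size."""
    lookup_s, delivery_s, _ = latency(size_bytes, **kwargs)
    return "lookup" if lookup_s > delivery_s else "delivery"


for label, size in (("100KB", 100_000), ("100MB", 100_000_000)):
    lookup_s, delivery_s, total_s = latency(size)
    print(f"{label}: lookup={lookup_s:.3f}s delivery={delivery_s:.3f}s "
          f"total={total_s:.3f}s dominant={dominant_term(size)}")
```

Under these assumed numbers the lookup time dominates for the 100KB file and the delivery time dominates for the 100MB file, which is the behaviour the slide describes.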
8
4.2 Cache with Peer-to-Peer Storage
► Hybrid Approach
  Scalability: peer-to-peer storage.
  Lookup and control: central cache.
► Peer-to-Peer two-layer storage
  The storage in the central cache
    ► Expected to be always available, with low latency.
    ► Stores small files.
  The second-tier storages
    ► Can be unavailable.
    ► Store large files.
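The two-layer placement rule can be sketched as follows. The slides do not give the size boundary between small and large files, so the 10MB threshold here is an assumption, as are the class name `TwoTierCache`, the peer identifiers, and the choice to store a large file on the requesting peer.

```python
from dataclasses import dataclass, field

ASSUMED_LARGE_THRESHOLD = 10 * 1024 * 1024  # assumed boundary; not stated in the slides


@dataclass
class TwoTierCache:
    central: dict = field(default_factory=dict)  # url -> size, first tier (always available)
    peers: dict = field(default_factory=dict)    # peer id -> {url: size}, second tier
    online: set = field(default_factory=set)     # peers that are currently reachable

    def store(self, url, size, requesting_peer):
        """Place small files in the central cache and large files on a peer."""
        if size < ASSUMED_LARGE_THRESHOLD:
            self.central[url] = size
            return "central"
        self.peers.setdefault(requesting_peer, {})[url] = size
        self.online.add(requesting_peer)
        return requesting_peer

    def locate(self, url):
        """Return where a cached copy can be served from, or None on a miss."""
        if url in self.central:
            return "central"
        for peer, files in self.peers.items():
            if url in files and peer in self.online:
                return peer
        return None  # not cached, or every holder is unavailable


cache = TwoTierCache()
cache.store("ftp://example.org/readme.txt", 4_096, "peer-1")
cache.store("ftp://example.org/disc1.iso", 650 * 1024 * 1024, "peer-1")
print(cache.locate("ftp://example.org/readme.txt"))
print(cache.locate("ftp://example.org/disc1.iso"))
```

A peer leaving the network only affects the large files it holds; small files stay reachable through the central cache.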
9
4.3 Model
Fig 3. Cache with two-layer storage
Figure labels: HTTP/FTP Server A, HTTP/FTP Server B, Connectivity Cloud, Local Area Network, Web Proxy Cache with FTP supporting module, Peer-to-Peer Storage (Peer 1, Peer 2, ..., Peer n).
O S1, O S2: small objects. O L1, O L2: large objects.
10
4.4 Detail Design
► Two new components to support FTP and large files.
  They preserve the transparency of file location.
► FTP Cache Daemon
  Stores the state of the FTP connection.
  Builds the URL of files transferred by FTP.
  Checks consistency.
► P2P Storage Manager
  Controls its own storage.
  Managed through the object table in the central cache.
Fig 4. Control and data connections between components
Figure components: HTTP Cache Daemon, FTP Cache Daemon, Object Table, Storage Manager, FTP/HTTP Server, and FTP/HTTP Clients, each with a P2P Storage Manager. Control and data connections are numbered 1 to 4.
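One way the FTP Cache Daemon might turn connection state into a URL is sketched below. The slides do not describe the mechanism, so this is an assumption: the daemon is taken to watch the control-connection commands CWD, CDUP and RETR (defined in RFC 959), track the working directory, and build an `ftp://` URL when a file is retrieved. The class name `FtpSessionState` and the host `ftp.example.org` are hypothetical.

```python
import posixpath
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class FtpSessionState:
    host: str
    port: int = 21
    cwd: str = "/"  # working directory tracked from the control connection

    def handle_command(self, line):
        """Update state from one control command; return a URL for RETR, else None."""
        verb, _, arg = line.strip().partition(" ")
        verb = verb.upper()
        if verb == "CWD":
            self.cwd = posixpath.normpath(posixpath.join(self.cwd, arg))
        elif verb == "CDUP":
            self.cwd = posixpath.dirname(self.cwd.rstrip("/")) or "/"
        elif verb == "RETR":
            return self.url_for(arg)
        return None

    def url_for(self, filename):
        """Build the cache key for a file relative to the current directory."""
        path = posixpath.normpath(posixpath.join(self.cwd, filename))
        netloc = self.host if self.port == 21 else f"{self.host}:{self.port}"
        return f"ftp://{netloc}{quote(path)}"


session = FtpSessionState("ftp.example.org")
for command in ("CWD pub", "CWD linux/iso"):
    session.handle_command(command)
print(session.handle_command("RETR disc1.iso"))
```

A URL built this way could then serve as the key of the object table, so that FTP and HTTP objects share one lookup structure.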
11
5. Simulation
5.1 Simulation Environment
5.2 Simulation Result
12
5.1 Simulation Environment
► Trace
  Requested FTP file list.
  Gathered the FTP control (port 21) packets and produced the trace.
    ► 2002.10.23 ~ 2002.11.5 (two weeks)
  76,880 file requests (783GB).
  417 clients.
► Assumption
  Local network: 100Mbps
► Simulated Caches
  Cache A: 100GB storage, 100Mbps
  Cache B: infinite storage, 100Mbps
  Cache C: infinite storage, infinite bandwidth
  Cache D: cache with peer-to-peer storage
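A trace-driven simulation of this kind can be sketched in a few lines. The slides do not state the replacement policy or the simulator's structure, so the LRU eviction used here is an assumption, and the function name `simulate` and the toy trace are hypothetical. Passing `capacity_bytes=None` models the infinite-storage caches.

```python
from collections import OrderedDict


def simulate(trace, capacity_bytes=None):
    """Replay (url, size) requests through an LRU cache (assumed policy).

    Returns (hit ratio, byte hit ratio, outbound bytes fetched from origin).
    """
    cache = OrderedDict()  # url -> size, least recently used first
    used = hits = hit_bytes = total_bytes = 0
    for url, size in trace:
        total_bytes += size
        if url in cache:
            cache.move_to_end(url)  # refresh recency on a hit
            hits += 1
            hit_bytes += size
            continue
        if capacity_bytes is not None and size > capacity_bytes:
            continue  # object can never fit; fetch it without caching
        while capacity_bytes is not None and used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU victim
            used -= evicted_size
        cache[url] = size
        used += size
    requests = len(trace)
    hit_ratio = hits / requests if requests else 0.0
    byte_hit_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_ratio, byte_hit_ratio, total_bytes - hit_bytes


toy_trace = [("a", 60), ("b", 60), ("a", 60), ("b", 60)]
print("bounded  :", simulate(toy_trace, capacity_bytes=100))
print("unbounded:", simulate(toy_trace, capacity_bytes=None))
```

In the toy trace the bounded cache can hold only one of the two objects at a time, so every request misses, while the unbounded cache serves the two repeated requests from storage.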
13
5.2 Simulation Result: Hit Ratio
Fig 5. Cache hit ratio
Fig 6. Outbound traffic
► No strict storage control
  Some peers may hold the same files in their storage.
  Even when some peers still have free storage, other peers can remove a file from their cache as a victim.
  This degrades the performance of the storage, but not by much.
14
5.2 Simulation Result: Latency
Fig 7. Average latency of 95~105MB files
Fig 8. Average latency of 95~105KB files
We can reduce the latency of large files without increasing the latency of small files.
15
5.2 Simulation Result: Cache Hit Ratio Degradation by Peer Failure
Fig 8. Cache hit ratio degradation by peer failure (figure annotation: 30%)
16
6. Conclusion
1) The measurement shows that FTP produces a large amount of traffic; 78% of it is due to files larger than 10MB.
2) We propose a cache system with two-layer storage using a peer-to-peer architecture. It is transparent to the location of files.
3) Trace-driven simulation shows that the two-layer storage performs well for large files as well as small files.
4) Caching with our system can reduce the outbound traffic and the latency.
► Other issues
  Collaboration between proposed systems.
  Load balancing between peers.
  Security problems.
17
7. Reference
► Jaeyeon Jung, "RepliCache: Enhancing Web Caching Architecture with Replication of Large Objects".
► Jaeyeon Jung, Dongman Lee, and Kilnam Chon, "Proactive Web Caching with Cumulative Prefetching for Large Multimedia Data", Computer Networks 33 (2000), pp. 645-655.
► Sitaram Iyer, Ant Rowstron, and Peter Druschel, "Squirrel: A Decentralized Peer-to-Peer Web Cache", in Proceedings of PODC '02, Monterey, CA.
► Tyron Stading, Petros Maniatis, and Mary Baker, "Peer-to-Peer Caching Schemes to Address Flash Crowds", in Proceedings of IPTPS '02, MA, USA.
► Hyun-chul Kim, Joonbock Lee, Jungwon Suh, and Kilnam Chon, "Measurements of File-Systems Deployed on High-Performance Research and Education Networks", Technical Report.
► I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, "Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications", in Proceedings of the ACM SIGCOMM 2001 Technical Conference, San Diego, CA, USA, August 2001.
► S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, "A Scalable Content-Addressable Network", in Proceedings of the ACM SIGCOMM 2001 Technical Conference, San Diego, CA, USA, August 2001.
18
7. Reference (continued)
► A. Rowstron and P. Druschel, "Pastry: Scalable, Distributed Object Location and Routing for Large-Scale Peer-to-Peer Systems", IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pp. 329-350, November 2001.
► Ian Clarke, Theodore W. Hong, Scott G. Miller, Oskar Sandberg, and Brandon Wiley, "Protecting Free Expression Online with Freenet", IEEE Internet Computing 6(1), January/February 2002.
► William J. Bolosky, John R. Douceur, David Ely, and Marvin Theimer, "Feasibility of a Serverless Distributed File System Deployed on an Existing Set of Desktop PCs", in Proceedings of SIGMETRICS 2000.
► Internet RFC 959, File Transfer Protocol.
19
Appendix A
Flowchart: Request File → Check Protocol
  HTTP: Handle the request like a web proxy cache.
  FTP: Lookup Object Table
    cached: Check Consistency
      consistent: Check Cached Location
        central server: The central cache opens a data connection to the client. Transfer file.
        peer: Open FTP control connections to both the peer which has the file and the peer which requests it. Make an FTP data connection between the two peers. Transfer file.
    not cached, or inconsistent: Check File Size
      small: The server opens a data connection to the central cache. The central cache opens a data connection to the client. Transfer file. Update Object Table.
      large: Open a data connection between the server and the client. Transfer file. Update Object Table.
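The decision flow of Appendix A can be sketched as a single function that returns the list of actions for a request. The branch structure follows one reading of the flowchart labels, so treat it as an assumption: in particular, sending inconsistent entries down the same path as uncached files, the 10MB size boundary, and the names `Entry` and `plan_transfer` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

ASSUMED_LARGE_THRESHOLD = 10 * 1024 * 1024  # assumed boundary; not stated in the slides


@dataclass
class Entry:
    location: str            # "central" or a peer id (object table record)
    consistent: bool = True  # result of the consistency check


def plan_transfer(protocol: str, entry: Optional[Entry], size_bytes: int,
                  requesting_peer: str = "client") -> list:
    """Return the ordered actions for one request, following the appendix flow."""
    if protocol.lower() == "http":
        return ["handle the request like a web proxy cache"]
    if entry is not None and entry.consistent:
        if entry.location == "central":
            return ["central cache opens data connection to client", "transfer file"]
        return [
            f"open FTP control connections to {entry.location} and {requesting_peer}",
            "make FTP data connection between the two peers",
            "transfer file",
        ]
    # Not cached, or cached but inconsistent (assumed to refetch from the server).
    if size_bytes < ASSUMED_LARGE_THRESHOLD:
        return [
            "server opens data connection to central cache",
            "central cache opens data connection to client",
            "transfer file",
            "update object table",
        ]
    return [
        "open data connection between server and client",
        "transfer file",
        "update object table",
    ]


for step in plan_transfer("ftp", Entry("peer-7"), 700 * 1024 * 1024, "peer-2"):
    print(step)
```

Returning a plan instead of opening sockets keeps the sketch testable and separates the lookup and control decision, which the central cache makes, from the data transfer itself.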