VDN: Virtual Machine Image Distribution Network for Cloud Data Centers

VDN: Virtual Machine Image Distribution Network for Cloud Data Centers
Chunyi Peng (1), Minkyong Kim (2), Zhe Zhang (2), Hui Lei (2)
(1) University of California, Los Angeles
(2) IBM T.J. Watson Research Center
IEEE INFOCOM 2012, Orlando, Florida, USA

Cloud Computing
The delivery of computing as a service.
Infocom 2012 C Peng (UCLA)

Service Access in Virtual Machine Instances
Cloud clients: web browser, mobile app, thin client, terminal emulator, …
Application, Software as a Service (SaaS): CRM, email, virtual desktop, communications, games, …
Platform, Platform as a Service (PaaS): execution runtime, database, web server, development tools, …
Infrastructure, Infrastructure as a Service (IaaS): virtual machines, server storage, load balancer, networks, …
Figure: a client sends service requests (e.g. HTTP) to a VM.
Problem: on-demand VM provisioning.
Picture source: http://www.wikimedia.org

Time for VM Image Provisioning
Timeline (figure): user request, request processing, VM image transfer, VM bootup, response.
Our focus: the image transfer time.
In reality, the response takes several minutes to tens of minutes.

Why Slow?
VM image files are large (several GB to tens of GB).
Centralized image storage becomes a bottleneck.
Figure: data center tree topology (core, aggregation, access/ToR switches) in which hosts requesting the RH5.6 image all fetch it from the single image server.
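A rough back-of-the-envelope sketch of why a single image server becomes the bottleneck. It assumes the server's outgoing bandwidth is shared equally among concurrent transfers and ignores disk I/O and protocol overhead. The function name, the 1 Gbps link, and the request counts are illustrative assumptions, not measurements from the talk.

```python
def transfer_seconds(image_gb: float, link_gbps: float, concurrent: int = 1) -> float:
    """Estimated time to deliver one image when `concurrent` transfers
    share a single link equally (hypothetical fair-sharing model).

    image_gb  : image size in gigabytes (1 GB = 10**9 bytes)
    link_gbps : link capacity in gigabits per second
    """
    if link_gbps <= 0 or concurrent < 1:
        raise ValueError("link_gbps must be positive and concurrent >= 1")
    image_gigabits = image_gb * 8
    # Each transfer gets link_gbps / concurrent of the capacity.
    return image_gigabits * concurrent / link_gbps


if __name__ == "__main__":
    for n in (1, 5, 20):
        t = transfer_seconds(image_gb=4, link_gbps=1, concurrent=n)
        print(f"{n:2d} concurrent 4 GB transfers: {t:.0f} s ({t / 60:.1f} min) each")
```

Under these assumptions a lone 4 GB transfer takes about half a minute, while twenty simultaneous requests stretch each one past ten minutes, which is consistent with the "several minutes to tens of minutes" range quoted earlier.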

Roadmap
Basic VDN idea: enable collaborative sharing
VDN solution for efficient sharing
  Basic sharing units
  Metadata management
Performance evaluation
Conclusion

VDN: Speed Up VM Image Distribution
Enable collaborative sharing.
Utilize the "free" VM images already present on hosts.
Exploit source diversity and make full use of network bandwidth.
Figure: the same tree topology (core, aggregation, access/ToR switches, image server), with hosts holding RH5.5, RH5.6 and RH6.0 images acting as additional sources.

How to Enable Collaborative Sharing?
What is the basic data unit for sharing?
  File-based sharing: allows sharing only among identical files.
  Chunk-based sharing: allows sharing of common chunks from different files.
How to manage content location information?
  Centralized solution: directory service, etc.
  Distributed solution: P2P overlay, etc.

What is the Appropriate Sharing Unit?
Two factors:
  The number of live VM instances of the same image.
  The similarity between different VM images.
We analyze real traces and measure cross-image similarity:
  VM traces from six operational data centers over 4 months.
  VM images including different Linux/Windows versions and IBM services (DB2, Rational, WebSphere), etc.

VM Instance Popularity
The distribution of image popularity is highly skewed.
  A few popular images account for a large portion of VM instances.
  Many unpopular images have only a small number of VM instances (< 5).
Consequence: for unpopular VM images, few peers can participate in file-based sharing.

VM Instance Lifetime
The lifetime of VM instances varies.
  40% of instances (those of the more popular images) live less than 13 minutes.
  Instances of unpopular VM images have longer lifetimes.
A VM image distribution network should cope with instances of widely varying lifetimes.

VM Image Structure
Tree-based VM image structure (share of VM instances in parentheses):
  Linux (60%)
    Red Hat (53%)
      Enterprise Linux v5.5 (32bit) (26.6%)
      Enterprise Linux v5.5 (64bit) (18.7%)
      …
      Enterprise Linux v5.4 (32bit) (4%)
      Enterprise Linux v5.6 (32bit) (0.2%)
    SUSE
    ……
  Windows (25%)
  Services (11%)
    Database ……
    IDE ……
    Web app. server (7%)
      V7.0 B (0.7%)
      V7.0.0.11 S P (0.7%)
      V7.0.0.11 R B (0.3%)
      V7.0.0.11 S B (0.3%)
      V7.0.0.11 S D (0.2%)
      V7.0 P (0.1%)
  Misc (4%)

VM Image Similarity
High similarity across VM images.
Chunking schemes: fixed-size chunks and Rabin fingerprinting.
Similarity: Sim(A,B) = |A's chunks that appear in B| / |A|
Chunk-based sharing can exploit cross-image similarity.
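The similarity measure can be sketched for the fixed-size chunking scheme as follows. This is only an illustration: it assumes |A| counts the chunks of A, identifies chunks by their SHA-1 digest, and uses toy byte strings in place of real images. The helper names and the chunk size are hypothetical. Rabin fingerprinting, the other scheme named on the slide, is not shown.

```python
import hashlib


def chunk_ids(data: bytes, chunk_size: int = 4096) -> list[str]:
    """Split data into fixed-size chunks and return one SHA-1 digest per chunk."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def similarity(a: bytes, b: bytes, chunk_size: int = 4096) -> float:
    """Sim(A, B): fraction of A's chunks that also appear in B.

    Assumes |A| is the number of chunks in A. Note that the measure is
    asymmetric: Sim(A, B) and Sim(B, A) generally differ.
    """
    a_ids = chunk_ids(a, chunk_size)
    if not a_ids:
        return 0.0
    b_ids = set(chunk_ids(b, chunk_size))
    shared = sum(1 for cid in a_ids if cid in b_ids)
    return shared / len(a_ids)


if __name__ == "__main__":
    # Toy "images": 4-byte chunks so the overlap is easy to see.
    image_a = b"AAAA" + b"BBBB" + b"CCCC" + b"DDDD"
    image_b = b"AAAA" + b"BBBB" + b"XXXX"
    print(similarity(image_a, image_b, chunk_size=4))  # 2 of A's 4 chunks are in B
    print(similarity(image_b, image_a, chunk_size=4))  # 2 of B's 3 chunks are in A
```

Fixed-size chunking is sensitive to insertions that shift later content, which is one reason a content-defined scheme such as Rabin fingerprinting is often considered alongside it.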

Enable Chunk-based Sharing
Decompose VM images into VM chunks.
Exploit similarity across VM images.
Provide higher source diversity and more sharing opportunities.
Figure: hosts holding RH5.5, RH5.6 and RH6.0 images exchange common chunks.
Questions:
  How to maintain chunk location information (metadata)?
  How to scale while still enabling fast data transmission?

How to Manage Location Information?
Solution I: centralized metadata server
  Pros: simple.
  Cons: the metadata server becomes a bottleneck.
Solution II: P2P overlay network, e.g., DHT
  Pros: distributed operations.
  Cons: unaware of the data center topology, so it may introduce high network overhead.

Issues in Conventional P2P Practice
One logical operation (lookup/publish) takes multiple physical hops.
Hop costs (e.g. time) can be high!
Solution:
  Reduce the number of hops.
  Reduce the cost of each physical hop.
  Keep operations local or among close neighbors.

Topology-aware Metadata Management
Divide all hosts into a multi-level hierarchy and manage chunks within each level.
Utilize the static/quasi-static (controlled) topology.
Exploit the high-bandwidth local links in the hierarchical structure.
Figure: hosts (H) grouped under index levels L1, L2 and L3, with the image server (I-S) at the top.

VDN: Encourage Local Communication
Local chunk metadata storage
  Index nodes maintain only the metadata within their own hierarchy.
  There is no need to maintain a global view at every index node.
Local chunk metadata operations (e.g., lookup/publish)
  Ask close index nodes first.
  Lower operation overhead.
Local chunk data delivery
  Enable high-bandwidth transmission between close hosts (e.g. within the rack).

VDN Operation Flows
Operations proceed recursively from the lower hierarchy to the higher hierarchy.
Operation types: A. metadata update, B. metadata lookup, C. data transmission.
Figure: numbered operation flow (steps 1 to 5) across the local cache, index levels L1, L2 and L3, and the image server.
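The recursive, lowest-level-first behaviour can be sketched with a small in-memory model. This is an illustration under stated assumptions rather than the VDN implementation: the class and method names are hypothetical, each index node is assumed to record chunk locations for the hosts beneath it, a publish is assumed to propagate to every ancestor, and a miss at all levels is assumed to fall back to the image server.

```python
from __future__ import annotations

from dataclasses import dataclass, field

IMAGE_SERVER = "image-server"  # fallback source when no peer holds the chunk


@dataclass
class IndexNode:
    """Hypothetical index node for one level of the hierarchy (e.g. L1 = rack)."""
    name: str
    parent: IndexNode | None = None
    locations: dict[str, set[str]] = field(default_factory=dict)

    def publish(self, chunk_id: str, host: str) -> None:
        """Metadata update: record the holder here and at every ancestor."""
        node: IndexNode | None = self
        while node is not None:
            node.locations.setdefault(chunk_id, set()).add(host)
            node = node.parent

    def lookup(self, chunk_id: str, requester: str) -> tuple[str, str | None]:
        """Metadata lookup: ask the closest index node first, then go upward.

        Returns (source host, name of the index node that answered), or
        (IMAGE_SERVER, None) if no peer in the hierarchy holds the chunk.
        """
        node: IndexNode | None = self
        while node is not None:
            holders = node.locations.get(chunk_id, set()) - {requester}
            if holders:
                return sorted(holders)[0], node.name
            node = node.parent
        return IMAGE_SERVER, None


if __name__ == "__main__":
    l3 = IndexNode("L3")
    l2 = IndexNode("L2", parent=l3)
    rack_a = IndexNode("L1-a", parent=l2)
    rack_b = IndexNode("L1-b", parent=l2)

    rack_a.publish("chunk-1", "host-1")
    print(rack_a.lookup("chunk-1", "host-2"))  # answered within the same rack
    print(rack_b.lookup("chunk-1", "host-9"))  # answered one level up
    print(rack_b.lookup("chunk-2", "host-9"))  # nobody has it: image server
```

Answering at the lowest level that knows a holder keeps both the metadata query and the subsequent data transfer on the closest, highest-bandwidth links.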

Performance Evaluation Setting
One-month real-trace-driven simulation
  VM image size: 128 MB to 8 GB
  Tree topology: 4 x 4 x 8 (128 nodes)
Network bandwidth
  Static throughput for one physical link.
  Queue-based simulation for multiple transmissions on one link.
Schemes
  Baseline: centralized operation.
  Local: fetch VM chunks from the local host if possible.
  VDN: enable collaborative sharing.
Figure labels: I-S disk I/O: 1 Gbps; Net BW: 1 Gbps, 2 Gbps, 500 Mbps, 200 Mbps; (4-) (8-nodes).

Great Speedup on Image Distribution
Figure panels: S1 data center; S6 data center; S6 with VM image size = 4 GB.

Scalable to Heavy Traffic Loads
Request arrival times are adjusted using a factor of 1 to 60.
Figure panels: S6, median; S6, 90th percentile.

Low Metadata Management Overhead
Three metadata management schemes are compared:
  Naïve: on-demand topology-aware broadcast.
  Flat: manage metadata in a ring (e.g. DHT, P2P).
  Topo: topology-aware design (VDN).
The communication cost across levels is assumed to be 1:4:10 (inversely proportional to bandwidth).
Figure panels: (a) number of messages; (b) communication cost.
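The 1:4:10 weighting can be illustrated with a short calculation. Two things here are assumptions made for the sketch. First, the weights are taken to derive from the 2 Gbps, 500 Mbps and 200 Mbps link speeds listed in the evaluation setting, and assigning them to levels L1, L2 and L3 is a guess. Second, the message counts are invented inputs chosen only to show the arithmetic, not results from the talk.

```python
# Assumed per-level link bandwidth in Mbps (the level assignment is a guess).
LEVEL_BANDWIDTH_MBPS = {"L1": 2000, "L2": 500, "L3": 200}


def level_weights(bandwidth_mbps: dict[str, float]) -> dict[str, float]:
    """Cost per message at each level, inversely proportional to bandwidth
    and normalised so the fastest level costs 1."""
    fastest = max(bandwidth_mbps.values())
    return {level: fastest / bw for level, bw in bandwidth_mbps.items()}


def communication_cost(messages: dict[str, int], weights: dict[str, float]) -> float:
    """Total cost = sum over levels of (message count x per-message weight)."""
    return sum(weights[level] * count for level, count in messages.items())


if __name__ == "__main__":
    weights = level_weights(LEVEL_BANDWIDTH_MBPS)
    print(weights)  # the 1 : 4 : 10 ratio

    # Made-up message counts, purely to illustrate the formula.
    mostly_local = {"L1": 100, "L2": 20, "L3": 5}
    mostly_remote = {"L1": 5, "L2": 20, "L3": 100}
    print(communication_cost(mostly_local, weights))
    print(communication_cost(mostly_remote, weights))
```

Both hypothetical workloads send the same number of messages, yet the one that keeps traffic on the fast local links costs far less, which is why panel (b) can differ sharply from panel (a).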

Conclusion
VDN is a network-aware P2P paradigm for VM image distribution.
  It reduces image provisioning time.
  It does so with reasonable overhead.
Chunk-based sharing exploits the inherent cross-image similarity.
Network-aware operations can optimize performance in the context of data centers.

Thanks