1 Fast, Scalable Disk Imaging with Frisbee
Mike Hibler, Leigh Stoller, Jay Lepreau, Robert Ricci, Chad Barb
University of Utah

2 Key Points
- Frisbee clones whole disks from a server to many clients using multicast
- Fast: 34 seconds to load standard FreeBSD onto 1 machine
- Scalable: 34 seconds to load 80 machines!
- Due to careful design and engineering
  - A straightforward implementation took 30 minutes per load

3 Disk Imaging Matters
- Works at disk or partition granularity, rather than file granularity
- Uses
  - OS installation
  - Catastrophe recovery
- Environments
  - Enterprise
  - Clusters
  - Utility computing
  - Research/education environments

4 Emulab

5 The Emulab Environment
- Network testbed for emulation
  - Cluster of 168 PCs, 100 Mbps Ethernet LAN
- Users have full root access to nodes
- Configuration stored in a central database
  - Fast reloading encourages aggressive experiments
  - Swapping to free idle resources
- Custom disk images
- Frisbee in use 18 months, loaded > 60,000 disks

6 Disk Imaging Unique Features
- General and versatile
  - Does not require knowledge of the filesystem
  - Can replace one filesystem type with another
- Robust
  - Old disk contents irrelevant
- Fast

7 Disk Imaging Tasks
(diagram: image creation from a source, distribution via a server, installation onto targets)

8 Key Design Aspects
- Domain-specific data compression
- Two-level data segmentation
- LAN-optimized custom multicast protocol
- High levels of concurrency in the client

9 Image Creation
- Segments images into self-describing "chunks"
- Compresses with zlib
- Can create "raw" images with opaque contents
- Optimizes some common filesystems
  - ext2, FFS, NTFS
  - Skips free blocks
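As a rough illustration of the creation step, the C sketch below compresses only the allocated blocks of a source partition with zlib. This is not the actual Frisbee image creator; fs_block_is_allocated() and emit_chunk() are hypothetical placeholders standing in for the filesystem-specific free-block scan and the chunk writer.

```c
/* Sketch of filesystem-aware image creation: walk the partition,
 * skip free blocks, and zlib-compress only allocated data.
 * fs_block_is_allocated() and emit_chunk() are hypothetical helpers. */
#include <zlib.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCKSIZE 1024            /* logical block size within a chunk */

extern int  fs_block_is_allocated(uint64_t blockno);   /* from FS metadata */
extern void emit_chunk(const unsigned char *zdata, size_t zlen,
                       uint64_t firstblock, uint64_t nblocks);

void create_image(FILE *disk, uint64_t totalblocks)
{
    unsigned char raw[BLOCKSIZE], zbuf[BLOCKSIZE * 2];

    for (uint64_t b = 0; b < totalblocks; b++) {
        if (!fs_block_is_allocated(b))
            continue;             /* free block: never read or stored */

        fseeko(disk, (off_t)b * BLOCKSIZE, SEEK_SET);
        if (fread(raw, 1, BLOCKSIZE, disk) != BLOCKSIZE)
            break;

        uLongf zlen = sizeof(zbuf);
        if (compress2(zbuf, &zlen, raw, BLOCKSIZE, Z_DEFAULT_COMPRESSION) != Z_OK)
            exit(1);

        /* A real imager batches many blocks into one self-describing chunk;
         * here each allocated block is emitted on its own for brevity. */
        emit_chunk(zbuf, zlen, b, 1);
    }
}
```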

10 Image Layout
- Chunk logically divided into 1024 blocks
- Medium-sized chunks good for
  - Fast I/O
  - Compression
  - Pipelining
- Small blocks good for
  - Retransmits
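To make the two-level segmentation concrete, here is a guessed-at header layout for a self-describing 1 MB chunk of 1024 one-kilobyte blocks. The field names and sizes are illustrative assumptions, not Frisbee's actual on-disk format.

```c
/* Illustrative chunk layout: a 1 MB chunk is logically 1024 blocks of 1 KB.
 * Field names and sizes are assumptions, not Frisbee's real format. */
#include <stdint.h>

#define CHUNK_BLOCKS 1024
#define BLOCK_SIZE   1024              /* 1024 * 1024 bytes = 1 MB per chunk */

struct region {
    uint64_t start_sector;    /* where on the target disk this data belongs */
    uint64_t sector_count;
};

struct chunk_header {
    uint32_t magic;           /* identifies a chunk boundary */
    uint32_t chunk_index;     /* position of this chunk within the image */
    uint32_t compressed_len;  /* bytes of zlib data that follow the header */
    uint32_t region_count;    /* disk regions the decompressed data maps to */
    struct region regions[];  /* flexible array: makes the chunk self-describing */
};
```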

11 Image Distribution Environment
- LAN environment
  - Low latency, high bandwidth
  - IP multicast
  - Low packet loss
- Dedicated clients
  - Consuming all bandwidth and CPU OK

12 Custom Multicast Protocol
- Receiver-driven
  - Server is stateless
  - Server consumes no bandwidth when idle
- Reliable, unordered delivery
- "Application-level framing"
- Requests block ranges within a 1 MB chunk
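The two message types implied above might look like the structs below; the layout is an assumption made for illustration only, not the protocol's real wire format.

```c
/* Sketch of the messages implied by the slides: a client REQUEST names a
 * chunk and a block range, a server BLOCK carries one 1 KB block.
 * The layout is illustrative, not Frisbee's actual wire format. */
#include <stdint.h>

enum msg_type { MSG_REQUEST = 1, MSG_BLOCK = 2 };

struct request_msg {          /* multicast by a client */
    uint32_t type;            /* MSG_REQUEST */
    uint32_t chunk;           /* which 1 MB chunk */
    uint32_t first_block;     /* first block wanted within the chunk */
    uint32_t block_count;     /* how many consecutive blocks */
};

struct block_msg {            /* multicast by the server */
    uint32_t type;            /* MSG_BLOCK */
    uint32_t chunk;
    uint32_t block;           /* 0..1023 within the chunk */
    uint8_t  data[1024];      /* one block of compressed image data */
};
```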

13 Client Operation
- Joins multicast channel
  - One per image
- Asks server for image size
- Starts requesting blocks
  - Requests are multicast
- Client starts are not synchronized
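A minimal sketch of the join-and-request step using standard IPv4 multicast sockets. The group address, port, and request layout are assumptions (the request_msg here is the illustrative one from the previous sketch), and error handling is omitted.

```c
/* Sketch: join the per-image multicast group and multicast the first REQUEST.
 * Addresses, the port, and request_msg are illustrative assumptions. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

struct request_msg { uint32_t type, chunk, first_block, block_count; };
enum { MSG_REQUEST = 1 };

int join_and_request(const char *group, uint16_t port)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    /* Bind to the image's port so BLOCK traffic is received. */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    bind(sock, (struct sockaddr *)&addr, sizeof(addr));

    /* One multicast group per image. */
    struct ip_mreq mreq;
    mreq.imr_multiaddr.s_addr = inet_addr(group);
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

    /* Requests are multicast too, so other clients can see what is coming. */
    struct request_msg req = { MSG_REQUEST, 0, 0, 1024 };  /* all of chunk 0 */
    struct sockaddr_in dst = addr;
    dst.sin_addr.s_addr = inet_addr(group);
    sendto(sock, &req, sizeof(req), 0, (struct sockaddr *)&dst, sizeof(dst));

    return sock;
}
```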

14 Client Requests (diagram: a client multicasts a REQUEST)

15 Client Requests (diagram: the requested BLOCKs arrive via multicast)

16 Tuning is Crucial
- Client side
  - Timeouts
  - Read-ahead amount
- Server side
  - Burst size
  - Inter-burst gap
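One way to picture these knobs is as two small parameter structs. The names, comments, and any defaults below are illustrative guesses at what each knob controls, not Frisbee's actual settings.

```c
/* The tunables named on the slide, gathered into structs.
 * Names and semantics are illustrative, not Frisbee's real configuration. */
#include <stdint.h>

struct client_tunables {
    uint32_t request_timeout_ms;  /* how long to wait for BLOCKs before re-requesting */
    uint32_t readahead_chunks;    /* chunks requested ahead of the one being written */
};

struct server_tunables {
    uint32_t burst_size;          /* packets sent back-to-back per burst */
    uint32_t interburst_gap_us;   /* idle time between bursts; paces the network */
};
```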

17 Image Installation
- Pipelined with distribution
  - Can install chunks in any order
  - Segmented data makes this possible
- Three threads for overlapping tasks
- Disk write speed is the bottleneck
- Can skip or zero free blocks
(diagram: pipeline from distribution, delivering chunks to decompression, whose decompressed data feeds the disk writer)
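Below is a skeleton, under assumed helper names, of the three-thread overlap described above: one thread receives chunks from the network, one decompresses them, and one writes the disk, connected by small bounded queues so each stage can run ahead of the next. The queue_t type and the per-stage helpers are hypothetical, and error handling is omitted.

```c
/* Skeleton of the three-way overlap: network receive, decompress, disk write.
 * queue_t and the stage helpers are hypothetical placeholders. */
#include <pthread.h>

typedef struct queue queue_t;                /* bounded FIFO, hypothetical */
extern queue_t *queue_create(int depth);
extern void     queue_put(queue_t *q, void *item);
extern void    *queue_get(queue_t *q);

extern void *recv_chunk(void);               /* blocks until a full chunk arrives */
extern void *decompress_chunk(void *chunk);  /* zlib-inflates one chunk */
extern void  write_regions(void *data);      /* seeks and writes to the raw disk */

static queue_t *chunkq, *dataq;

static void *net_thread(void *arg)
{
    (void)arg;
    for (;;)
        queue_put(chunkq, recv_chunk());               /* stage 1 */
}

static void *unzip_thread(void *arg)
{
    (void)arg;
    for (;;)
        queue_put(dataq, decompress_chunk(queue_get(chunkq)));  /* stage 2 */
}

static void *disk_thread(void *arg)
{
    (void)arg;
    for (;;)
        write_regions(queue_get(dataq));               /* stage 3 */
}

int main(void)
{
    pthread_t t[3];
    chunkq = queue_create(4);       /* a little buffering between stages */
    dataq  = queue_create(4);
    pthread_create(&t[0], NULL, net_thread, NULL);
    pthread_create(&t[1], NULL, unzip_thread, NULL);
    pthread_create(&t[2], NULL, disk_thread, NULL);
    pthread_join(t[2], NULL);       /* the disk writer is the bottleneck */
    return 0;
}
```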

18 Evaluation

19 Performance
- Disk image
  - FreeBSD installation used on Emulab
  - 3 GB filesystem, 642 MB of data
  - 80% free space
  - Compressed image size is 180 MB
- Client PCs
  - 850 MHz CPU, 100 MHz memory bus
  - UDMA 33 IDE disks, 21.4 MB/sec write speed
  - 100 Mbps Ethernet, server has Gigabit

20 Speed and Scaling

21 FS-Aware Compression

22 Packet Loss

23 Related Work
- Disk imagers without multicast
  - Partition Image
- Disk imagers with multicast
  - PowerQuest Drive Image Pro
  - Symantec Ghost
- Differential update
  - rsync: 5x slower with secure checksums
- Reliable multicast
  - SRM [Floyd ’97]
  - RMTP [Lin ’96]

24 Comparison to Symantec Ghost

25 Ghost with Packet Loss

26 How Frisbee Changed our Lives (on Emulab, at least)
- Made disk loading between experiments practical
- Made large experiments possible
  - Unicast loader maxed out at 12
- Made swapping possible
  - Much more efficient resource usage

27 The Real Bottom Line “I used to be able to go to lunch while I loaded a disk, now I can’t even go to the bathroom!” - Mike Hibler (first author)

28 Conclusion
- Frisbee is
  - Fast
  - Scalable
  - Proven
- Careful domain-specific design from top to bottom is key
- Source available at

29

30 Comparison to rsync
- Timestamps not robust
- Checksums slow
- Conclusion: bulk writes beat data comparison

31 How to Synchronize Disks
- Differential update (rsync)
  - Operates through the filesystem
  - + Only transfers/writes changes
  - + Saves bandwidth
- Whole-disk imaging
  - Operates below the filesystem
  - + General
  - + Robust
  - + Versatile
- Whole-disk imaging essential for our task

32 Image Distribution Performance: Skewed Starts

33 Future
- Server pacing
- Self tuning

34 The Frisbee Protocol
(flowchart of the client loop: start; if there are no outstanding requests, send a REQUEST; wait for BLOCKs; on a timeout, loop back; when a BLOCK is received and the chunk is finished, continue if more chunks are left, otherwise finished)
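Read as code, the flowchart corresponds roughly to the client loop below; every helper is a hypothetical placeholder named after a box in the diagram.

```c
/* Client loop mirroring the flowchart: request, wait for BLOCKs, handle
 * timeouts, finish when no chunks remain. All helpers are hypothetical. */
#include <stdbool.h>

extern bool outstanding_requests(void);
extern void send_request(void);                  /* multicast a REQUEST */
extern bool wait_for_blocks(int timeout_ms);     /* returns false on timeout */
extern void drop_outstanding_requests(void);     /* treat pending requests as lost */
extern bool chunk_finished(void);
extern bool more_chunks_left(void);

void frisbee_client(void)
{
    bool finished = false;

    while (!finished) {
        if (!outstanding_requests())
            send_request();                      /* "Send REQUEST" */

        if (!wait_for_blocks(2000)) {            /* "Timeout" */
            drop_outstanding_requests();
            continue;                            /* loop back and re-request */
        }

        /* "BLOCK Received" */
        if (chunk_finished())                    /* "Chunk Finished?" */
            finished = !more_chunks_left();      /* "More Chunks Left?" */
    }
}
```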

35 The Evolution of Frisbee
- First disk imager: Feb 1999
- Started with NFS distribution
- Added compression
  - Naive
  - FS-aware
- Overlapping I/O
- Multicast
- 30 minutes down to 34 seconds!