Ji-Yong Shin Cornell University In collaboration with Mahesh Balakrishnan (MSR SVC), Tudor Marian (Google), Lakshmi Ganesh (UT Austin), and Hakim Weatherspoon.

Presentation transcript:

Gecko: A Contention-Oblivious Design for Cloud Storage
Ji-Yong Shin, Cornell University
In collaboration with Mahesh Balakrishnan (MSR SVC), Tudor Marian (Google), Lakshmi Ganesh (UT Austin), and Hakim Weatherspoon (Cornell)
HotStorage talk, June 13, 2012

What happens to storage?
Cloud and virtualization
[Figure: several guest VMs on a VMM sharing disks; labels: SEQUENTIAL, RANDOM]

Sequential streams are no longer sequential
– 1 to 8 VMs + EXT4 FS
– 4-disk RAID-0 setting
– Sequential writer (256KB)
– Random writer (4KB)
[Figure panel: Sequential Writers Only]

Existing Solutions for IO Contention?
IO scheduling
– Entails increased latency for certain workloads
– May still require moving the disk head
Workload placement
– Requires prior knowledge or dynamic prediction
– Limits freedom of placing VMs in the cloud

Log-structured File System to the Rescue?
– Write everything as a log to the tail
– Perfect prediction for writes
– Assume reads are handled by cache
[Figure: RAID0 + LFS; writes to scattered addresses are appended at the log tail on the shared disks]
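The "write everything to the tail" idea can be sketched as a toy block store. This is only an illustration under assumed names (`LogStructuredStore`, `l2p`), not code from Gecko: every write lands at the next physical address regardless of its logical address, and an in-memory map remembers where the latest copy lives.

```python
class LogStructuredStore:
    """Toy log-structured block store: every write is appended at the log tail."""

    def __init__(self, capacity_blocks):
        self.log = [None] * capacity_blocks  # physical blocks, in log order
        self.tail = 0                        # next physical address to write
        self.l2p = {}                        # logical address -> physical address

    def write(self, logical_addr, data):
        if self.tail >= len(self.log):
            raise RuntimeError("log full: garbage collection needed")
        phys = self.tail
        self.log[phys] = (logical_addr, data)
        self.l2p[logical_addr] = phys        # any older copy becomes garbage
        self.tail += 1
        return phys

    def read(self, logical_addr):
        return self.log[self.l2p[logical_addr]][1]


store = LogStructuredStore(capacity_blocks=8)
# Writes to scattered logical addresses land at consecutive physical addresses.
writes = [(70, "a"), (3, "old"), (42, "b"), (3, "new")]
physical = [store.write(addr, data) for addr, data in writes]
latest = store.read(3)
```

The overwritten copy of logical block 3 stays in the log as garbage, which is why a log-structured design eventually needs garbage collection.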

Challenges of Log-Structured File System
Garbage collection is the Achilles’ heel of LFS
[Figure: RAID0 + LFS; garbage collection (GC) from the log head while writes go to the log tail on the shared disks]

Challenges of Log-Structured File System
Garbage collection is the Achilles’ heel of LFS
– 2-disk RAID-0 setting of LFS
– GC under write-only workload
[Figure panel: RAID 0 + LFS]

Summary of Challenges in the Cloud
Server consolidation through cloud and virtualization
– The number of cores and VMs per server increases
– Storage is not yet maturely virtualized
RAID cannot preserve high throughput
– IO performance varies depending on coexisting VMs
LFS only solves write-write contention
– GC operation interferes with logging
– First-class reads can interfere with logging

Rest of the Talk
Gecko, a chain logging design
– Overview
– Caching reads
– Garbage collection strategies
– Metadata management
Evaluation
Summary

Gecko: Chain Logging Design
Cutting the log tail from the body
– GC reads do not interrupt the sequential write
– 1 uncontended drive is much faster than N contended drives
[Figure: physical address space laid out across Disk 0, Disk 1, Disk 2; log tail on the last disk; garbage collection from the log head]
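One way to see why cutting the tail from the body helps is to compare which disks a burst of tail writes and a burst of GC reads touch under each layout. The sketch below is an assumption-laden illustration (the disk count, per-disk capacity, stripe size, and function names are all hypothetical): RAID-0 stripes consecutive addresses across all disks, while a chain fills one disk before moving to the next.

```python
NUM_DISKS = 3
DISK_BLOCKS = 100  # hypothetical per-disk capacity, in blocks


def raid0_disk(phys_addr, stripe_blocks=1):
    """RAID-0: consecutive stripes rotate across all disks."""
    return (phys_addr // stripe_blocks) % NUM_DISKS


def chain_disk(phys_addr):
    """Chain: the address space is the disks laid end to end."""
    return phys_addr // DISK_BLOCKS


def disks_touched(layout, start, count):
    """Set of disks hit by `count` consecutive physical addresses."""
    return {layout(addr) for addr in range(start, start + count)}


tail_start, head_start, burst = 250, 0, 10  # tail writes vs. GC reads at the head

chain_overlap = (disks_touched(chain_disk, tail_start, burst)
                 & disks_touched(chain_disk, head_start, burst))
raid0_overlap = (disks_touched(raid0_disk, tail_start, burst)
                 & disks_touched(raid0_disk, head_start, burst))
```

In this toy setup `chain_overlap` is empty, so GC reads and tail writes never share a disk head, whereas `raid0_overlap` contains every disk.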

Gecko Overview and Properties
– No write-write contention, no GC-write contention
– Fault tolerance + read performance: primary chain (Disk 0, Disk 1, Disk 2) and mirror chain (Disk 0’, Disk 1’, Disk 2’)
– Power saving without consistency concerns: mirror disks Disk 0’ and Disk 1’ off
[Figure: reads served from the chains; writes go to the log tail on Disk 2 and Disk 2’]

Gecko Caching
What happens to reads going to tail drives?
Tail cache (flash)
– Blocks at least 86% of reads from a real workload (500GB disk, 34GB cache)
– Prevents first-class read-write contention
Revival of LFS using flash
[Figure: writes go to the tail on Disk 2 and into the tail cache; reads of recently written data are served from the cache]
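The tail cache can be pictured as a small store of recently written blocks that is consulted before the tail drive. The sketch below is a hypothetical illustration, not Gecko's implementation: the class name, the "keep the most recently written blocks" eviction rule, and the `read_from_disk` callback are all assumptions.

```python
from collections import OrderedDict


class TailCache:
    """Toy cache of the most recently written blocks (assumed eviction policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # logical address -> data, oldest first
        self.hits = 0
        self.disk_reads = 0

    def on_write(self, logical_addr, data):
        # Every write to the log tail is also placed in the cache.
        self.blocks[logical_addr] = data
        self.blocks.move_to_end(logical_addr)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the oldest written block

    def read(self, logical_addr, read_from_disk):
        if logical_addr in self.blocks:
            self.hits += 1                   # served from flash, tail disk untouched
            return self.blocks[logical_addr]
        self.disk_reads += 1                 # falls through to a disk
        return read_from_disk(logical_addr)


cache = TailCache(capacity=2)
for addr in (1, 2, 3):
    cache.on_write(addr, f"data-{addr}")

results = [cache.read(addr, lambda a: f"disk-{a}") for addr in (3, 2, 1)]
```

Here the two most recent writes are served from the cache and only the oldest block falls through to disk, which is the effect the slide relies on to keep reads away from the tail drive.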

Gecko Garbage Collection (GC)
Move-to-tail GC
+ Simple
- GC shares write bandwidth
Compact-in-body GC
+ GC is independent from writes
- Complicates metadata management
[Figure: Disk 0, Disk 1, Disk 2]
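Move-to-tail GC can be sketched as follows: scan blocks from the log head, re-append the ones that are still live at the tail, and free the rest. This is a minimal sketch under assumptions (a linear, non-wrapping log held in a Python list, and hypothetical names such as `move_to_tail_gc` and `l2p`); it is not the authors' code.

```python
def move_to_tail_gc(log, l2p, head, tail, count):
    """Clean `count` blocks starting at `head`.

    Returns (new_head, new_tail, moved), where `moved` is the number of live
    blocks that had to be rewritten at the tail.
    """
    moved = 0
    for _ in range(count):
        entry = log[head]
        if entry is not None:
            logical_addr, _data = entry
            if l2p.get(logical_addr) == head:  # newest copy: still live
                log[tail] = entry              # re-append at the tail
                l2p[logical_addr] = tail
                tail += 1
                moved += 1
            log[head] = None                   # head block is now free
        head += 1
    return head, tail, moved


# Log after writing A, B, A (overwrite), C into an 8-block log.
log = [("A", "a0"), ("B", "b0"), ("A", "a1"), ("C", "c0"), None, None, None, None]
l2p = {"A": 2, "B": 1, "C": 3}

new_head, new_tail, moved = move_to_tail_gc(log, l2p, head=0, tail=4, count=2)
```

The stale copy of A is simply dropped, while the live block B consumes a tail slot. That extra tail write is the "GC shares write bandwidth" cost listed on the slide.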

Gecko Metadata and Persistence
Logical-to-physical map (in memory), 4-byte entries
– Primary map: less than 8 GB RAM for an 8 TB storage
Physical-to-logical map (in flash)
– Inverse map: 8 GB flash for an 8 TB storage (written every 1024 writes)
Data (in disk), 4KB pages
[Figure: data pages on Disk 0, Disk 1, Disk 2 marked empty or filled, with head and tail pointers]
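The map sizes on this slide follow from the page and entry sizes. The arithmetic below assumes binary units (1 TB = 2^40 bytes) and one 4-byte entry per 4KB page; under those assumptions the map comes to 8 GB, which is consistent with the slide's figures as an upper bound.

```python
KB, GB, TB = 2**10, 2**30, 2**40

STORAGE_BYTES = 8 * TB
PAGE_BYTES = 4 * KB
ENTRY_BYTES = 4

pages = STORAGE_BYTES // PAGE_BYTES   # one map entry per 4KB page
map_bytes = pages * ENTRY_BYTES       # size of one full map
map_gb = map_bytes / GB

# Inverse-map entries accumulated between flushes ("written every 1024 writes").
batch_bytes = 1024 * ENTRY_BYTES
```

With these numbers there are 2^31 pages, each map is 8 GB, and a batch of 1024 inverse-map entries is 4096 bytes. That batch size matches one 4KB page, which may be why the slide uses a 1024-write interval, though the slide itself does not say so.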

Evaluation Setup
In-kernel version
– Implemented as a block device for portability
– Similar to software RAID
– Move-to-tail GC
User-level emulator
– For fast prototyping
– Runs block traces
– Compact-in-body GC

Evaluation
Performance under move-to-tail GC
– 2-disk Gecko chain, write-only workload
– GC does not affect aggregate throughput
[Figure panels: RAID 0 + LFS, Gecko]

Evaluation
Performance under compact-in-body GC (CIB GC)
– Write-only workload is used
– Application throughput is not affected

Summary
Log-structured designs
– Oblivious to write-write contention
– Sensitive to GC/read-write contention
Gecko fixes the GC-write and read-write contention
– Separates the tail of the log from its body
– Flash re-enables log-structured designs
  – Tail flash cache for read-write contention
  – Small flash memory for persistence

Future work
– Experiments with real workloads
– Exploration to minimize read-read contention
– IO handling policy inside Gecko