X-RAY: A Non-Invasive Exclusive Caching Mechanism for RAIDs
Lakshmi N. Bairavasundaram, Muthian Sivathanu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau


X-RAY: A Non-Invasive Exclusive Caching Mechanism for RAIDs
Lakshmi N. Bairavasundaram, Muthian Sivathanu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
ADvanced Systems Laboratory, Computer Sciences Department, University of Wisconsin – Madison

Introduction
 Caching in modern systems
   Multiple levels
   Storage: 2-level hierarchy
 Level 1: File system (FS) cache
   Software-managed
   Main memory of host/client
   LRU-like cache replacement
 Level 2: RAID cache
   Firmware-managed
   Memory inside RAID system
   Usually LRU replacement
[diagram: application and file system cache on the host; RAID cache inside the RAID]

Introduction – contd.
 LRU
   Replace LRU block
   Cache placement on read
[diagram: read of block 10 places it at the MRU end of an LRU queue]

Introduction – contd.
 LRU
   Replace LRU block
   Cache placement on read
 2 levels of LRU
   Redundant contents
[diagram: block 10 at the MRU end of both the FS cache and the RAID cache]

Introduction – contd.
 LRU
   Cache placement on read
   Replace LRU block
 2 levels of LRU
   Redundant contents
 Goal: Exclusive caching
[diagram: block 10 duplicated at the MRU end of both the FS cache and the RAID cache]
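
The redundancy can be seen in a small Python sketch (a toy LRU model added here for illustration, not code from the paper): every read populates both levels on its way up, so after a scan the two caches hold identical blocks.

```python
from collections import OrderedDict

class LRUCache:
    """Simple LRU cache: recently used entries kept, LRU entry evicted."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest (LRU) first, newest (MRU) last

    def access(self, block):
        """Place block at the MRU end; evict and return the LRU block on overflow."""
        if block in self.entries:
            self.entries.move_to_end(block)
            return None
        self.entries[block] = True
        if len(self.entries) > self.capacity:
            victim, _ = self.entries.popitem(last=False)
            return victim
        return None

fs_cache = LRUCache(4)    # level 1: file system cache
raid_cache = LRUCache(4)  # level 2: RAID cache

for block in [1, 2, 3, 4]:
    fs_cache.access(block)    # every FS read miss also passes through...
    raid_cache.access(block)  # ...the RAID cache on its way from disk

# Both LRU caches end up holding the same blocks: fully redundant.
print(set(fs_cache.entries) & set(raid_cache.entries))  # -> {1, 2, 3, 4}
```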

Improved RAID Caching
 Multi-Queue (Zhou et al. 2001)
   Add frequency component to cache policy
   Not strictly exclusive!
 DEMOTE (Wong and Wilkes 2002)
   Change interface to disk
   File system issues “cache place” command
   Has perfect information and hence perfectly exclusive caches
   Interface changes – difficult to deploy

Ideal RAID Cache
 Exclusive caching
   File system and RAID caches should have different contents
 Global LRU
   Known to work well
   RAID cache should be a victim cache
 No interface changes
[diagram: blocks read from the RAID cache move up to the FS cache; FS victim blocks flow down into the RAID cache]
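
The ideal above can be sketched as a toy simulation (function name and cache sizes are illustrative assumptions): the RAID cache admits only blocks demoted from the FS cache, never blocks on the read path, so the two tiers stay disjoint.

```python
from collections import OrderedDict

def victim_cache_sim(accesses, fs_size, raid_size):
    """Model the RAID cache as a victim cache of the FS cache (global LRU).

    On a read, the block moves into the FS cache (and out of the RAID
    cache if it was there); the FS cache's LRU victim is demoted into
    the RAID cache instead of the read being cached at both levels.
    """
    fs, raid = OrderedDict(), OrderedDict()
    for block in accesses:
        raid.pop(block, None)          # hit in RAID cache: block moves up
        fs[block] = True
        fs.move_to_end(block)          # MRU position in the FS cache
        if len(fs) > fs_size:
            victim, _ = fs.popitem(last=False)
            raid[victim] = True        # demote the FS victim, not the read
            if len(raid) > raid_size:
                raid.popitem(last=False)
    return set(fs), set(raid)

fs, raid = victim_cache_sim([1, 2, 3, 4, 5, 6], fs_size=4, raid_size=4)
print(fs, raid)          # FS holds blocks 3-6; RAID holds the victims 1 and 2
assert fs & raid == set()  # exclusive: no block is cached twice
```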

X-RAY
 Observes disk traffic
   Reads and writes to data and metadata
 Builds a model of the FS cache
   Uses semantic knowledge
   Predicts size and contents of FS cache
 Identifies set of exclusive blocks
   Recent victims of the FS cache
 Reads blocks from disk into cache
 Result
   A nearly exclusive cache without interface changes
[diagram: X-RAY inside the RAID maintains a model of the host’s FS cache]

Talk Outline
 Introduction
 File Systems
 Information and Inferences
 X-RAY Cache Design
 Results
 Conclusion

File System Operation
 Applications perform file reads and writes
 File system (Unix)
   Translates file accesses to disk block requests
 Metadata
   To maintain application data on disk and manage disk blocks
   Periodically written to disk
   Examples: inodes, bitmap blocks

File System Operation
 Inode
   Pointers to data blocks
   File access information (latest access time)
[diagram: an inode holds the latest access time and pointers to the file’s data blocks]

File System Operation
 File access
   Use inode to obtain pointers to disk data blocks
   Read corresponding blocks from disk if they are not in FS cache
   Update the access time information in inode
 Metadata updates
   Periodically check for “dirty” inodes and write to disk
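
This behaviour can be sketched with a toy model (class names, inode number 23, block names, and the logical clock are illustrative): a warm read generates no disk traffic, but the periodic inode write still carries the new access time to disk.

```python
class Inode:
    """Toy inode: pointers to data blocks plus the latest access time."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.atime = 0
        self.dirty = False

class ToyFS:
    """Minimal cache-plus-metadata model of the file system operation above."""
    def __init__(self, inodes):
        self.inodes = inodes      # inode number -> Inode
        self.cache = set()        # data blocks currently in the FS cache
        self.clock = 0            # logical time
        self.disk_reads = []      # block requests that actually reach the disk
        self.flushed = []         # (inode number, atime) pairs written back

    def read_file(self, ino):
        """Whole-file read: fetch missing blocks, then update the inode atime."""
        self.clock += 1
        inode = self.inodes[ino]
        for b in inode.blocks:
            if b not in self.cache:   # only misses generate disk traffic
                self.disk_reads.append(b)
                self.cache.add(b)
        inode.atime = self.clock
        inode.dirty = True

    def flush_metadata(self):
        """Periodic writeback: dirty inodes (with fresh atimes) go to disk."""
        for ino, inode in self.inodes.items():
            if inode.dirty:
                self.flushed.append((ino, inode.atime))
                inode.dirty = False

fs = ToyFS({23: Inode(["D", "E"])})
fs.read_file(23)     # cold read: blocks D and E are fetched from disk
fs.read_file(23)     # warm read: hits in the FS cache, invisible to the disk
fs.flush_metadata()  # ...until inode 23 is written with its new access time
print(fs.disk_reads, fs.flushed)  # ['D', 'E'] [(23, 2)]
```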

The Problem
 To observe disk traffic and infer the contents of FS cache
 Why difficult?
   FS cache size changes over time
     Shares main memory with virtual memory system

The Problem
 To observe disk traffic and infer the contents of FS cache
 Why difficult?
   FS cache size changes over time
   Disk cannot observe all FS-level accesses
 Key observation
   We need information about accesses that hit in FS cache
   File system maintains access information in inodes
[diagram: a read of block 10 hits in the FS cache, so the RAID never sees it and its model of the FS cache goes stale]

Talk Outline
 Introduction
 File Systems
 Information and Inferences
 X-RAY Cache Design
 Results
 Conclusion

Information
 Obtain information from observing disk traffic
 Knowledge of file system structures and operations
   File system maintains time of last access in inodes
   Periodic inode writes
   Assuming whole-file access, all blocks of an accessed file are in FS cache
 Assume file system cache policy is LRU

Inferences
 Read for a data block
   Block will be placed in file system cache (MRU block)
 Read for a previously read data block
   Block became a victim in the file system cache
   Blocks with an earlier access time should also be victims
 Inode write with a new access time, but no disk read observed
   All blocks belonging to the file are in FS cache
   Other blocks with later access times should also be present
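
These rules can be sketched over a simple block-to-time model (the event shapes and block names are hypothetical; the third rule is simplified here to updating the model and cancelling victim status, rather than marking later-accessed blocks present):

```python
def apply_inferences(events, model):
    """Update a model of the FS cache from observed disk traffic.

    model: block -> last inferred access time. Returns the set of blocks
    inferred to have been evicted from the FS cache. Event shapes:
    ("read", block, time) and ("inode_write", [blocks], time).
    """
    victims = set()
    for kind, arg, t in events:
        if kind == "read":
            if arg in model:                   # re-read: arg was evicted, and
                old = model[arg]               # so was every block with an
                for b, bt in model.items():    # earlier (or equal) access time
                    if bt <= old:
                        victims.add(b)
            model[arg] = t                     # a read places arg at the MRU end
            victims.discard(arg)               # ...so it is back in the cache
        elif kind == "inode_write":
            for b in arg:                      # atime advanced with no read seen:
                model[b] = t                   # these blocks hit in the FS cache
                victims.discard(b)
    return victims

model = {}
events = [("read", "A", 1), ("read", "B", 2), ("read", "C", 3),
          ("inode_write", ["C"], 4),  # C was re-accessed inside the FS cache
          ("read", "B", 5)]           # B had to be re-fetched: it was evicted
victims = apply_inferences(events, model)
print(victims)  # A is inferred evicted: at least as old as B's prior access
```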

Talk Outline
 Introduction
 File Systems
 Information and Inferences
 X-RAY Cache Design
 Results
 Conclusion

Design
 Recency list (R-list)
   List of data blocks ordered by access time; each entry is (block number, access time)
 Cache Begin (CB) pointer
   Divides R-list into inclusive and exclusive regions
   Inclusive region: blocks expected to be in FS cache
   Exclusive region: blocks the RAID should cache
 RAID cache contents
   Subset of blocks in exclusive region
[diagram: R-list A,1  B,1  C,2  D,3  E,3  F,5 with the CB pointer splitting it into the two regions]
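
A minimal sketch of the R-list split, using the block/time pairs from the slide (the CB index value chosen here is an assumption for illustration):

```python
# R-list entries are (block, access time), ordered LRU -> MRU.
r_list = [("A", 1), ("B", 1), ("C", 2), ("D", 3), ("E", 3), ("F", 5)]
cb = 4  # hypothetical CB index: entries before it form the exclusive region

exclusive = [b for b, _ in r_list[:cb]]  # recent FS-cache victims: RAID caches a subset
inclusive = [b for b, _ in r_list[cb:]]  # blocks predicted to be in the FS cache
print(exclusive, inclusive)  # ['A', 'B', 'C', 'D'] ['E', 'F']
```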

Disk Read
Read Block ‘D’ ; time = 6
[diagram: R-list A,1  B,1  C,2  D,3  E,3  F,4 with CB between the inclusive and exclusive regions]

Disk Read
Read Block ‘D’ ; time = 6
[diagram: D’s access time is updated to 6, moving it to the MRU end of the R-list]

Inode Write – Access time change
 Inode “23” : access time = 6
 Semantic knowledge: inode “23” == blocks D & E
 Therefore, blocks D, E : access time = 6
[diagram: R-list A,1  B,1  C,2  D,3  E,4  F,5  G,7 before the update]

Inode Write – Access time change
 Blocks D, E : access time = 6
[diagram: R-list becomes A,1  B,1  C,2  F,5  D,6  E,6  G,7; blocks D and E move past F]
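
The access-time propagation above can be sketched as follows (the inode-to-block mapping is hard-coded for illustration); a stable sort by access time reproduces the reordered R-list:

```python
# When inode 23 is written with access time 6, semantic knowledge maps the
# inode to blocks D and E; their R-list entries get the new time and the
# list is re-sorted by access time (stable sort, so ties keep their order).
inode_blocks = {23: ["D", "E"]}   # recovered from on-disk metadata
r_list = [("A", 1), ("B", 1), ("C", 2), ("D", 3), ("E", 4), ("F", 5), ("G", 7)]

new_time = 6
moved = set(inode_blocks[23])
r_list = sorted(((b, new_time if b in moved else t) for b, t in r_list),
                key=lambda entry: entry[1])
print(r_list)  # D and E now sit between F (time 5) and G (time 7)
```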

X-RAY Cache
 Keep track of additions to the window in the exclusive region
[diagram: a RAID cache of size 2 blocks forms a window at the MRU end of the exclusive region of the R-list]

X-RAY Cache
 Read newly-added blocks from disk
   Replace blocks no longer in the window
 Additional disk bandwidth
   Idle time, extra internal bandwidth, freeblock scheduling
[diagram: the 2-block window over the exclusive region shifts as the R-list changes]
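
The window maintenance can be sketched as (function name, block names, and window size are illustrative):

```python
# The RAID cache holds a fixed-size window at the MRU end of the exclusive
# region; when the window shifts, newly admitted blocks are read from disk
# (in the background) and displaced blocks are dropped.
def refresh(raid_cache, exclusive_region, size):
    """Return (blocks to fetch from disk, blocks to drop from the RAID cache)."""
    window = set(exclusive_region[-size:])  # MRU end of the exclusive region
    fetch = window - raid_cache
    drop = raid_cache - window
    return fetch, drop

raid_cache = {"C", "F"}
fetch, drop = refresh(raid_cache, ["A", "B", "C", "F", "D", "E"], size=2)
print(fetch, drop)  # newly added D, E are fetched during idle time; C, F dropped
```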

Talk Outline
 Introduction
 File Systems
 Information and Inferences
 X-RAY Cache Design
 Results
   Tracking FS Cache Contents
   RAID Cache Performance
 Conclusion

Results – Tracking
 Accurate size and content prediction
 Highly responsive to FS cache size changes
 Tolerates changes in inode write interval
 Partial file reads
   X-RAY performs well if the percentage of partially accessed files is < 40% (typical traces have less than 30%)

Results – Cache Performance
 Performs better than LRU and Multi-Queue
 Close to DEMOTE, in spite of imperfect information
 Hit-rate advantage translates to lower read latency

Additional Results
 File system cache policy is not LRU
   Clock, 2Q
   X-RAY performs nearly as well as before
   It still performs better than both LRU and Multi-Queue
 Idle time requirements
   X-RAY reads blocks into its cache only during idle time
   It performs well even when given only one-third of the idle time observed in the trace
 More in the paper …

Conclusion
 Easy deployment is an important goal in developing technology
   Avoid interface changes – use non-invasive mechanisms
 Higher-level systems maintain various pieces of information about the data they manage
   Provide low-level systems with basic semantic knowledge
 Semantic intelligence for managing RAID caches
   Use access information in metadata to track file system cache contents and cache exclusive blocks
   In spite of imperfect information, X-RAY performs nearly as well as changing the interface
 Semantically-smart disk systems
   Availability, security, and performance improvements

Questions?
ADvanced Systems Laboratory (ADSL)
Computer Sciences, University of Wisconsin-Madison