Memorage: Emerging Persistent RAM-based Malleable Main Memory and Storage Architecture
Juyoung Jung and Sangyeun Cho, Computer Science Department, University of Pittsburgh

Presentation transcript:

Memorage: Emerging Persistent RAM-based Malleable Main Memory and Storage Architecture
Juyoung Jung and Sangyeun Cho
Computer Science Department, University of Pittsburgh

Introduction
- Conventional memory hierarchy: main memory (DRAM) on top of secondary storage (HDD)

Introduction
- Conventional memory hierarchy
  - DRAM: high performance, but scaling and power consumption are growing concerns
  - HDD: low cost per bit, but slow performance improvement

Emerging Persistent RAMs (PRAM)
Properties and their design implications:
- Scalable: higher density
- Energy efficient: energy saving
- Byte-addressable: usable as main memory
- Persistent: usable as secondary storage (PRAM Storage Device, PSD)
- Slower, imbalanced read/write, limited cell endurance: architectural support needed!

Outline
- Introduction
- PRAM-based System Model
- Memorage Architecture
- Experimental Results
- Conclusion

Future PRAM-based System Model
[Diagram: the CPU accesses PRAM main memory (SLC) over the memory bus; the PSD (MLC) provides secondary storage behind the legacy I/O interface via the Platform Controller Hub.]
- Easy adoption without big changes
- Familiar dichotomized memory hierarchy concept

Problem of Memory Pressure
[Diagram: the same system model under ever-growing memory demands, which create memory pressure on the PRAM main memory.]
- Severe performance degradation during page swapping

Outline
- Introduction
- PRAM-based System Model
- Memorage Architecture
  - Objective and observations
  - Memorage approaches
  - Design and implementation
- Experimental Results
- Conclusion

Memorage Architecture
- Objectives
  - Effective handling of memory pressure
  - Extending system lifetime
- System-level observations
  - Little characteristic distinction between main memory and storage resources (both are PRAM)
  - Reducing I/O software overhead matters more than in the past
  - Storage density grows exponentially, yet the available storage capacity is underutilized

Outline
- Introduction
- PRAM-based System Model
- Memorage Architecture
  - Objective and observations
  - Memorage approaches
  - Design and implementation
- Experimental Results
- Conclusion

Memorage Architecture
- Flexible resource sharing: crosses the traditional memory hierarchy boundary
- Memorage approaches
  - Don't swap, give more memory: under high memory pressure, borrow PRAM resources from the PSD to cope with the memory deficit
  - Don't pay for physical over-provisioning: excess PSD resources provide the system with "virtual" over-provisioning

Memorage in action: handling memory pressure, case (a)
[Diagram: under memory pressure, the OS VM manager notifies the Memorage resource controller (Step 1-1, Step 1-2); the controller consults the file system and PSD device driver, and the storage subsystem has spare capacity (Step 2, case (a)): "Okay! I have some to lend."]

Memorage in action: handling memory pressure, case (b)
[Diagram: the same flow, but the storage subsystem has no spare capacity (Step 2, case (b)): "Sorry! I have a tight budget." The OS falls back to swapping and page reclamation.]
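The two cases above amount to a simple decision in the Memorage resource controller: lend PSD pages if the storage side can spare them, otherwise fall back to reclamation and swap. A minimal userspace sketch of that logic (the names and thresholds are our illustration, not the authors' kernel code):

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative view of the PSD as seen by the resource controller. */
struct psd_state {
    unsigned long free_pages;     /* unused PSD capacity, in pages */
    unsigned long reserve_pages;  /* capacity the storage side keeps back */
};

/* Step 1: the VM manager hits memory pressure and asks the Memorage
 * resource controller for `want` pages.  Step 2, case (a): the PSD has
 * spare capacity and lends it.  Step 2, case (b): the budget is tight,
 * so the OS falls back to page reclamation and swapping. */
static bool memorage_request_pages(struct psd_state *psd, unsigned long want)
{
    if (psd->free_pages >= psd->reserve_pages + want) {
        psd->free_pages -= want;                 /* case (a): lend pages */
        printf("PSD: okay, lending %lu pages\n", want);
        return true;
    }
    printf("PSD: tight budget, do swap and page reclamation\n");
    return false;                                /* case (b) */
}

int main(void)
{
    struct psd_state psd = { .free_pages = 1ul << 19, .reserve_pages = 1ul << 16 };
    memorage_request_pages(&psd, 1ul << 12);  /* small request: case (a) */
    memorage_request_pages(&psd, 1ul << 20);  /* huge request: case (b) */
    return 0;
}
```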

Outline
- Introduction
- PRAM-based System Model
- Memorage Architecture
  - Objective and observations
  - Memorage approaches
  - Design and implementation
- Experimental Results
- Conclusion

Key Design Goals
- Transparency to existing applications: avoid re-compiling applications
- Manageable system changes: enable fast adoption of the Memorage architecture through extensive reuse of existing VMM infrastructure
- Low system overhead: keep users oblivious to Memorage support

Managing Resource Information
[Diagram: 1. PSD resource detection: physical PSD PRAM chunks (chunk0, chunk1, ..., chunkN) are detected during the boot process. 2. Building PSD resource data structures: in the node's zone list, the DMA, DMA32, and NORMAL zones come from main memory, while a new MEMORAGE zone is built from the PSD.]
- Reuses the memory hot-plug feature!
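Because the donated capacity arrives through the hot-plug path, it needs a place in the kernel's zone enumeration. A conceptual sketch of that extension (modeled on `enum zone_type` from Linux's include/linux/mmzone.h; `ZONE_MEMORAGE` is a hypothetical name used for illustration):

```c
#include <stdio.h>

/* Mirrors the x86-64 zone list of Linux 3.x (include/linux/mmzone.h),
 * extended with a hypothetical zone for PSD-donated PRAM pages. */
enum zone_type {
    ZONE_DMA,       /* pages for legacy <16MB DMA devices */
    ZONE_DMA32,     /* pages addressable by 32-bit DMA */
    ZONE_NORMAL,    /* regular pages from main memory */
    ZONE_MEMORAGE,  /* hypothetical: PRAM pages donated by the PSD */
    MAX_NR_ZONES
};

static const char *zone_names[MAX_NR_ZONES] = {
    "DMA", "DMA32", "Normal", "Memorage"
};

int main(void)
{
    /* List the zones a node carries once PSD detection has run. */
    for (int z = 0; z < MAX_NR_ZONES; z++)
        printf("node 0, zone %s\n", zone_names[z]);
    return 0;
}
```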

Managing Resource Information: PSD resource transfer
[Diagram: PRAM pages from the physical PSD chunks are moved into the MEMORAGE zone; the donation is hidden from the storage side by file system manipulation.]

File System Metadata Exposure
[Diagram: on-disk layout of the ext3 file system: a boot block followed by block groups 0 through n, each containing a super block, group descriptors, a data bitmap, an inode bitmap, an inode table, and data blocks. The data bitmap is exposed to the Memorage manager, and the buddy allocator includes the new MEMORAGE zone. Example: the data bitmap changes on a storage capacity donation of 4MB, assuming a 4KB data block size.]
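The 4MB example works out to 4MB / 4KB = 1024 data blocks whose bits must be set in the group's data bitmap, so the file system never allocates them while they are lent out as memory. A self-contained sketch of that bitmap update (our illustration of the idea, not the actual ext3 modification):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 4096u               /* 4KB data blocks, as on the slide */
#define DONATION   (4u * 1024 * 1024)  /* 4MB donated to main memory */

int main(void)
{
    /* One ext3 data bitmap block tracks BLOCK_SIZE * 8 = 32768 blocks. */
    uint8_t bitmap[BLOCK_SIZE];
    memset(bitmap, 0, sizeof bitmap);  /* all blocks initially free */

    /* Mark the donated blocks as in-use: 4MB / 4KB = 1024 bits. */
    unsigned nblocks = DONATION / BLOCK_SIZE;
    for (unsigned b = 0; b < nblocks; b++)
        bitmap[b / 8] |= (uint8_t)(1u << (b % 8));

    printf("donated %u blocks (%u bytes) marked allocated\n",
           nblocks, nblocks * BLOCK_SIZE);
    return 0;
}
```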

Memory Expansion and Shrinkage
[Plot: available main-memory pages over time against the pages_min, pages_low, and pages_high watermarks. Normally kswapd is woken when free pages fall to the low watermark and sleeps once the zone is balanced. Additional Memorage pages increase the total size and establish new, higher watermarks, expanding the margin; with Memorage, the watermark that invokes kswapd is never reached in this case.]
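The watermark behavior can be made concrete. In Linux, pages_low and pages_high are derived from pages_min (low = min + min/4, high = min + min/2 in __setup_per_zone_wmarks()), and the minimum free target grows only with the square root of memory size, so donated pages enlarge the free pool far faster than they raise the watermarks. A small sketch of that effect (simplified from the kernel's actual computation):

```c
#include <math.h>
#include <stdio.h>

/* Single-zone watermarks, simplified from Linux's
 * init_per_zone_wmark_min() and __setup_per_zone_wmarks():
 * the minimum free target scales as sqrt(16 * zone_kb), and
 * pages_low = pages_min + pages_min/4 is the kswapd wake point. */
static void report(const char *label, unsigned long zone_kb, unsigned long free_kb)
{
    unsigned long min_kb = (unsigned long)sqrt(16.0 * (double)zone_kb);
    unsigned long low_kb = min_kb + min_kb / 4;
    printf("%s: free=%lu kB, pages_low=%lu kB -> %s\n", label, free_kb,
           low_kb, free_kb > low_kb ? "zone balanced" : "kswapd woken up");
}

int main(void)  /* compile with -lm */
{
    unsigned long zone_kb = 4ul << 20;   /* 4GB zone */
    unsigned long free_kb = 6000;        /* heavy pressure: little free */

    report("baseline     ", zone_kb, free_kb);
    /* Memorage donates 2GB of PSD pages: the free pool grows far more
     * than the watermarks do, so kswapd need never be invoked. */
    report("with Memorage", zone_kb + (2ul << 20), free_kb + (2ul << 20));
    return 0;
}
```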

Outline
- Introduction
- PRAM-based System Model
- Memorage Architecture
- Experimental Results
- Conclusion

Evaluation Methodology
- Performance evaluation
  - Measure the performance improvement with a prototype system implemented in Linux
  - Emulate the future platform with a NUMA system

Emulation Methodology
[Diagram: a two-socket NUMA machine; CPU #0 (4 cores) with 4GB of socket 0 memory serves as main memory, while CPU #1 (4 cores) with 96GB of socket 1 memory emulates the PSD. Under memory pressure, Memorage hot-plugs PSD resources to offload the deficit.]
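For intuition, the same split can be approximated in userspace with libnuma: ordinary allocations come from node 0 (the 4GB "main memory") while emulated-PSD buffers are placed explicitly on node 1 (the 96GB pool). A minimal sketch of the idea; note that the paper's prototype instead performs the hot-plug inside the kernel:

```c
#include <numa.h>    /* link with -lnuma */
#include <stdio.h>
#include <stdlib.h>

#define PSD_CHUNK (64ul << 20)   /* 64MB of emulated PSD capacity */

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return EXIT_FAILURE;
    }
    /* Socket 1 memory plays the PSD: allocate a chunk explicitly on
     * node 1, leaving node 0 to act as main memory. */
    void *psd = numa_alloc_onnode(PSD_CHUNK, 1);
    if (!psd) {
        fprintf(stderr, "node 1 allocation failed\n");
        return EXIT_FAILURE;
    }
    printf("emulated PSD chunk of %lu MB on node 1 at %p\n",
           PSD_CHUNK >> 20, psd);
    numa_free(psd, PSD_CHUNK);
    return 0;
}
```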

OS Latency for Page Fault Handling
[Plot: measured OS latency of page fault handling in the emulated system, in microseconds; the reported latency is 21.6 us.]

Evaluated Memory Configuration
- Workload
  - 8 memory-intensive benchmarks from SPEC CPU2006 (bwaves, mcf, milc, zeusmp, cactusADM, leslie3d, lbm, GemsFDTD)
  - Aggregate memory footprint is 5.6GB
- Memory configurations
  - Baseline: 4.4GB effective memory capacity available
  - Memorage: an additional 2GB of capacity from the PSD, for 6.4GB of effective memory capacity

Execution Time Breakdown (Baseline)
[Plot: execution time breakdown of each benchmark under the baseline configuration.]

Execution Time Breakdown (Memorage)
[Plot: execution time breakdown of each benchmark under the Memorage configuration.]

Relative Performance of Benchmarks
[Plot: performance of each benchmark with Memorage relative to the baseline.]

Lifetime Evaluation
- Analytical model for lifetime analysis
Variable definitions:
- L_m, L_s: lifetime of main memory (MM) and of the PSD
- C_m, C_s: capacity of MM and PSD
- E_m, E_s: write endurance of MM and PSD
- D_m, D_s: total data volume written to MM and PSD before the first failure; D_m = E_m * C_m, D_s = E_s * C_s
- B_m, B_s: average data update rate (write bandwidth) of MM and PSD
- α, β, γ: ratios defined by C_m = α * C_s, B_s = β * B_m, E_m = γ * E_s
- h = η / C_m, where η is the transfer size
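From these definitions the device lifetimes follow in one step; a restatement consistent with the slide's variables (our own derivation, not copied from the paper):

```latex
L_m = \frac{D_m}{B_m} = \frac{E_m\,C_m}{B_m},
\qquad
L_s = \frac{D_s}{B_s} = \frac{E_s\,C_s}{B_s} = \frac{E_s\,C_s}{\beta\,B_m}.
```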

Main Memory Lifetime Improvement
[Plot: main memory lifetime improvement versus the bandwidth ratio, with the endurance ratio fixed, for 8GB MM + 240GB PSD and 8GB MM + 480GB PSD; the improvement rapidly reaches its maximum even at a small bandwidth ratio.]
- Since the realistic write bandwidth of main memory is much larger than that of the PSD, Memorage achieves a large main memory lifetime improvement!

System Lifetime
[Plot: lifetimes versus the ratio of donated PSD capacity to memory capacity, with E_s = 10^5, E_m = 10^6, and B_m = 100MB/s. A 2x memory lifetime improvement (from 2.5 to 5 years) costs a 1000x PSD lifetime degradation (from 10,000 years down to 10 years).]
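As a sanity check on the 2.5-year baseline figure (our arithmetic, assuming the 8GB main memory of the previous slide):

```latex
L_m = \frac{E_m\,C_m}{B_m}
    = \frac{10^{6} \times 8\,\mathrm{GB}}{100\,\mathrm{MB/s}}
    = 8 \times 10^{7}\,\mathrm{s}
    \approx 2.5\ \text{years}.
```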

Outline
- Introduction
- PRAM-based System Model
- Memorage Architecture
- Experimental Results
- Conclusion

Conclusion
- Memorage architecture
  - Capacity sharing across the conventional memory and storage boundary
  - Better handles memory pressure by exploiting excess PRAM resources from the PSD: system performance improvement of up to 40.5%
  - Better utilizes available system PRAM resources to improve main memory lifetime: system lifetime enhancement of up to 6.9x

Thank you for listening!