“NVM Duet: Unified Working Memory and Persistent Store Architecture”

“NVM Duet: Unified Working Memory and Persistent Store Architecture”
Ren-Shuo Liu, De-Yu Shen, Chia-Lin Yang, Shun-Chih Yu, Cheng-Yuan Michael Wang
Sungmin Koo, sm.koo1989@gmail.com

Index
- Background
- Introduction
- Data Consistency vs. Bank-Level Parallelism
- Data Durability vs. Write Speed
- NVM Duet
- Evaluation

Background
Structure of a PCM (Phase Change Memory) cell
- Two states
  - Amorphous state (high resistance, 0)
  - Polycrystalline state (low resistance, 1)
Read and write mechanisms of PCM
- RESET (writing bit “0”)
  - Heat the phase change material
  - Short latency
  - High power consumption
- SET (writing bit “1”)
  - Sustained low-voltage pulse
  - Long latency
  - Low power consumption
- To read the state of the phase change material, a sufficiently low voltage pulse is applied to it.
Notes
- PCM (Phase Change Memory) distinguishes 0 and 1 by the phase of the material: the amorphous and polycrystalline phases differ in resistance, and that difference encodes the bit.
- SET takes a long time because the material has to be brought from a high temperature down to a low one, so a low voltage must be sustained throughout.

Background
MLC PCM
- The large resistance difference between the amorphous state and the polycrystalline state makes it possible to store multiple bits per PCM cell.
- ‘Iterative programming’ technique
Notes
- Because of iterative programming, MLC writes have long latency and need a lot of energy.
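To make the cost of iterative programming concrete, here is a minimal Python sketch of a program-and-verify loop. It is a toy model, not the circuit-level procedure: `program_cell`, the integer resistance units, and the fixed step per pulse are all assumptions made for illustration. The number of pulses stands in for write latency.

```python
def program_cell(target_low, target_high, step, start=0, max_pulses=10_000):
    """Toy program-and-verify loop (hypothetical model, arbitrary units).

    Each iteration applies one write pulse that raises the cell level by
    `step`, then "reads back" the level and stops once it has reached the
    target band [target_low, target_high].
    """
    if step <= 0:
        raise ValueError("step must be positive")
    level = start
    pulses = 0
    while level < target_low:
        if pulses == max_pulses:
            raise RuntimeError("gave up before reaching the target band")
        level += step      # one partial write pulse
        pulses += 1        # each pulse adds to the write latency
    if level > target_high:
        raise RuntimeError("overshot the target band")
    return pulses, level


# A wide band tolerates a coarse step, so few pulses are needed.
print(program_cell(400, 600, step=200))   # (2, 400)
# A narrow band (more levels per cell) needs a fine step and many pulses.
print(program_cell(480, 520, step=20))    # (24, 480)
```

With the coarse step of 200, the narrow band [480, 520] would be skipped entirely (the level jumps from 400 to 600), which is why the sketch raises an error in that case.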

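Background
Characteristics comparison
- Speed: DRAM > PCM > NAND flash
- Endurance: PCM is better than NAND flash
- Read/write latency: write latency in particular is longer than DRAM's, so performance must improve before PCM can serve as main memory.
- Compared with DRAM, PCM has advantages in density (and therefore scalability) and non-volatility, but its write latency and energy consumption are higher and its endurance is lower. Research is concentrating on improving these three.
PCM memory system architectures
- PCM replaces DRAM.
- DRAM and PCM are used in parallel (the CPU can access both).
- DRAM is used as a cache or buffer.
- A hybrid design can take the advantages of both DRAM and PCM.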

Introduction
- NVM technologies have gained a lot of attention recently.
  - Non-volatility, byte-addressability
- SCM blurs the line between working memory and persistent store.
  - Enables the construction of large-scale working memory
    - High density, scalability, MLC technique
  - An alternative to conventional persistent store
    - Can be connected to CPUs via a direct memory access path
    - Ordinary load and store instructions (previous studies)
- SCM will play the role of both working memory and persistent store at the same time.
- NVM Duet
  - Guarantees consistency and durability
  - Does not require advance partitioning of PCM resources between persistent store and working memory
  - All the management is transparent to applications.
Notes
- Because it is non-volatile and byte-addressable, this class of memory is called SCM (storage class memory).
- NVM Duet provides working memory and persistent store at the same time.

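Data Consistency vs. Bank-Level Parallelism
Mechanisms for achieving consistency
- Persistent update mechanisms at the software level
  - Journaling, shadow update
- Enforcing write ordering at the hardware level
Consistent update
- Issue write requests to create N3’ and N4’
- Issue write requests to create N1’, which points to N2, N3’, and N4’
- Issue a barrier
Notes
- Journaling: when a write request arrives, the data is first recorded in a separate location called the journal area rather than at its original location, and the journaled data is periodically written back to the original location. Even if power fails in the middle of a write, the data survives in either the journal or the HDD, which improves reliability.
- Barrier: inserted in the code by the programmer so that the program can tell the hardware which writes must reach the persistent store before the write request for the root node is allowed to.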
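The role of the barrier in the shadow update can be sketched in Python by enumerating every state a crash could leave behind. This is a toy model under stated assumptions: `WriteQueue`, `crash_states`, and the dictionary standing in for persistent store are hypothetical, writes inside one epoch are assumed to complete in any order, and the barrier is assumed to sit between the writes of the children (N3', N4') and the write of the new root N1'.

```python
from itertools import permutations

INITIAL = {"N1": ("N2", "N3", "N4"), "N2": "leaf", "N3": "leaf", "N4": "leaf"}


class WriteQueue:
    """Toy write queue: barriers split the queued writes into epochs."""

    def __init__(self):
        self.epochs = [[]]

    def write(self, addr, value):
        self.epochs[-1].append((addr, value))

    def barrier(self):
        self.epochs.append([])


def shadow_update(queue, use_barrier=True):
    """Queue the shadow copies N3' and N4', then the new root N1'."""
    queue.write("N3'", "leaf")
    queue.write("N4'", "leaf")
    if use_barrier:
        queue.barrier()  # children must be durable before the new root
    queue.write("N1'", ("N2", "N3'", "N4'"))


def crash_states(initial, epochs):
    """Yield every durable state that a crash could leave behind.

    Writes inside one epoch may complete in any order; an epoch starts
    only after the previous epoch has completed entirely.
    """
    base = dict(initial)
    yield dict(base)
    for epoch in epochs:
        for order in permutations(epoch):
            cells = dict(base)
            for addr, value in order:
                cells[addr] = value
                yield dict(cells)
        base.update(epoch)


def is_consistent(cells):
    """If N1' is durable, every node it points to must be durable too."""
    return "N1'" not in cells or all(child in cells for child in cells["N1'"])


def survives_any_crash(use_barrier):
    queue = WriteQueue()
    shadow_update(queue, use_barrier)
    return all(is_consistent(s) for s in crash_states(INITIAL, queue.epochs))


print(survives_any_crash(use_barrier=True))   # True
print(survives_any_crash(use_barrier=False))  # False
```

Without the barrier, one legal ordering persists N1' before N3' and N4', so a crash at that point leaves a root that points to nodes that do not exist yet.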

Data Consistency vs. Bank-Level Parallelism
- Figure 3(a) displays a schedule that respects the barriers.
- Figure 3(b) shows the schedule if the barriers were not present.
- Figure 3(c) shows a schedule that has knowledge of the use case for each write.
  - A, B, and G belong to working memory.
  - The others belong to persistent store.
Notes
- The scheduler needs to know, for each write, whether it targets working memory or persistent store.

Data Durability vs. Write Speed
Resistance drift
- PCM’s limited non-volatility
  - The resistance of PCM cells drifts upward.
  - Drift causes data loss.
- The write speed can be estimated from the target band allocation.
  - A small ΔR is used for a narrow target band to prevent the iterative write from completely missing the target band.
Notes
- ΔR: the resistance change made by one write pulse.
- The target band is determined by analysis (apparently from a formula).
- The longer the required non-volatile period, the smaller the margin and the smaller ΔR, so write latency increases.
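The trade-off between retention and target band width can be illustrated with the power-law drift model that is commonly used for PCM, R(t) = R0 · (t / t0)^ν. The slides do not give a formula, so the model, the drift exponent `nu`, and all numbers below are assumptions chosen only for illustration.

```python
import math


def drifted_resistance(r0, t, nu, t0=1.0):
    """Resistance at time t of a cell that read r0 at time t0 (power-law drift)."""
    return r0 * (t / t0) ** nu


def retention_time(r0, upper_bound, nu, t0=1.0):
    """Time at which a cell written at r0 drifts up to upper_bound.

    Beyond this time the cell would be read as the next-higher state.
    """
    return t0 * (upper_bound / r0) ** (1.0 / nu)


NU = 0.1            # hypothetical drift exponent
BOUNDARY = 200.0    # hypothetical read boundary to the next state

# Worst case is a cell written at the top of its target band.
narrow_band_top = 100.0   # narrow target band, far below the boundary
wide_band_top = 150.0     # wide target band, closer to the boundary

print(retention_time(narrow_band_top, BOUNDARY, NU))  # about 1024 (in units of t0)
print(retention_time(wide_band_top, BOUNDARY, NU))    # much shorter, under 20
```

Under these assumptions the narrow band retains data far longer, but as noted on the slide a narrow band needs a small ΔR and therefore more write iterations.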

NVM Duet
HW/SW interface
- Built on recently proposed software frameworks (NV-heap, Mnemosyne)
  - Programmers declare persistent data in the PCM main memory (keywords).
  - Persistent data is linked to a reserved virtual address space (PersistSpace).
- AllocMap (one bit per PCM frame)
  - Conveys to the memory controller the OS’s knowledge of the use case of each piece of data
Notes
- The software framework manages persistent data. The OS maps each VPA to a PFN (persistent data is stored in frames), and the metadata for indexing and garbage collection is kept in the PCM main memory.
- AllocMap keeps one bit per PCM frame so that the controller can tell whether the frame is used as working memory or as persistent store. The bit is set when the frame is in use as working memory and reset otherwise. At boot every bit is reset, because nothing is in use as working memory yet, so all frames count as persistent store.
- The chips take two resistance parameters in order to guarantee two different retention times (explained later).
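The AllocMap described above is essentially a bitmap indexed by frame number. A minimal Python sketch follows; the class layout, the method names, and the 4 KiB `FRAME_SIZE` are assumptions for illustration, and only the one-bit-per-frame encoding (set means working memory, reset means persistent store, all reset at boot) comes from the slide.

```python
FRAME_SIZE = 4096  # hypothetical PCM frame size in bytes


class AllocMap:
    """One bit per PCM frame: 1 = working memory, 0 = persistent store."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        # Every bit starts reset, as at boot: all frames are persistent store.
        self.bits = bytearray((num_frames + 7) // 8)

    def _check(self, frame):
        if not 0 <= frame < self.num_frames:
            raise IndexError(f"frame {frame} out of range")

    def mark_working_memory(self, frame):
        self._check(frame)
        self.bits[frame // 8] |= 1 << (frame % 8)

    def mark_persistent_store(self, frame):
        self._check(frame)
        self.bits[frame // 8] &= ~(1 << (frame % 8)) & 0xFF

    def is_working_memory(self, frame):
        self._check(frame)
        return bool((self.bits[frame // 8] >> (frame % 8)) & 1)

    def use_case(self, physical_addr):
        """What the memory controller would look up for a physical address."""
        frame = physical_addr // FRAME_SIZE
        return "working memory" if self.is_working_memory(frame) else "persistent store"


alloc_map = AllocMap(num_frames=16)
alloc_map.mark_working_memory(9)          # the OS hands frame 9 to a process
print(alloc_map.use_case(9 * FRAME_SIZE + 128))   # working memory
print(alloc_map.use_case(3 * FRAME_SIZE))         # persistent store
```

One bit per frame keeps the structure small: with the assumed 4 KiB frames, 1 GiB of PCM needs only 32 KiB of bitmap.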

NVM Duet
Duet Scheduler
- Fully exploits bank-level parallelism
  - Rule 1: Writes to working memory can be scheduled regardless of barriers.
  - Rule 2: Writes to persistent store are prioritized over writes to working memory if a barrier is pending in the memory controller.
Notes
- The scheduling aims to use bank-level parallelism as much as possible.
- The scheduler checks AllocMap to find out whether a write carries working memory data or persistent store data.
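The two rules can be sketched as a small slot-based scheduler in Python. This is a simplified model, not the controller from the paper: every write is assumed to occupy its bank for exactly one time slot, and `Write`, `duet_schedule`, `barrier_only_schedule`, and the request stream are hypothetical. The `working_memory` flag stands for what the AllocMap lookup would return.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Write:
    name: str
    bank: int
    working_memory: bool   # what the AllocMap bit would say


BARRIER = "barrier"


def duet_schedule(requests):
    """Return one list of write names per time slot (at most one write per bank)."""
    epochs = [[]]   # persistent-store writes, split by barriers
    working = []    # working-memory writes, free of barriers (rule 1)
    for seq, req in enumerate(requests):
        if not isinstance(req, Write):
            epochs.append([])
        elif req.working_memory:
            working.append((seq, req))
        else:
            epochs[-1].append((seq, req))

    slots = []
    while working or any(epochs):
        while len(epochs) > 1 and not epochs[0]:
            epochs.pop(0)                     # this epoch is done, barrier retires
        persistent = epochs[0]                # only the oldest epoch may issue
        if len(epochs) > 1:                   # a barrier is pending
            candidates = persistent + working             # rule 2
        else:
            candidates = sorted(persistent + working, key=lambda c: c[0])
        busy = {}
        for item in candidates:
            if item[1].bank not in busy:
                busy[item[1].bank] = item
        for item in busy.values():
            (working if item[1].working_memory else persistent).remove(item)
        slots.append([busy[bank][1].name for bank in sorted(busy)])
    return slots


def barrier_only_schedule(requests):
    """Baseline: every write obeys the barriers, whatever its use case."""
    return duet_schedule(
        [replace(r, working_memory=False) if isinstance(r, Write) else r
         for r in requests]
    )


REQUESTS = [
    Write("P1", 0, False), Write("W1", 0, True), BARRIER,
    Write("P2", 1, False), Write("W2", 1, True), BARRIER,
    Write("P3", 0, False), Write("W3", 1, True),
]

print(duet_schedule(REQUESTS))           # [['P1', 'W2'], ['W1', 'P2'], ['P3', 'W3']]
print(barrier_only_schedule(REQUESTS))   # five slots, mostly one bank busy
```

In this made-up stream the working-memory writes fill banks that the barriers would otherwise leave idle, so the schedule shrinks from five slots to three while persistent-store writes still complete in barrier order.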

NVM Duet
Dual-Retention PCM Architecture
- Dual-retention PCM chips
  - Provide two access modes with different retention guarantees
  - Command interface (mode signal)
- Smart Refresh
  - Removes unnecessary refresh operations
Notes
- Refresh has to be performed to prevent data loss, but its overhead is large.
- Cutting refreshes reduces both the overhead and the power consumption.

Evaluation

References
- Haris Volos, Andres Jaan Tack, and Michael M. Swift. Mnemosyne: Lightweight Persistent Memory.
- Ju-Young Jung and Sangyeun Cho. Memorage: Emerging Persistent RAM based Malleable Main Memory and Storage Architecture.
- 유승훈, 이은지, 반효경. Design and Implementation of a Write Efficient Journaling File System for Phase Change Memory.
- Eunji Lee, Hyokyung Bahn, and Sam H. Noh. Unioning of the Buffer Cache and Journaling Layers with Non-volatile Memory.
- Fei Xia, Jin Xiong, and Ning-Hui Sun. A Survey of Phase Change Memory System.

Q&A