Storage Class Memory Architecture for Energy Efficient Data Centers
Bruce Childers, Sangyeun Cho, Rami Melhem, Daniel Mossé, Jun Yang, Youtao Zhang


Storage Class Memory Architecture for Energy Efficient Data Centers
Bruce Childers, Sangyeun Cho, Rami Melhem, Daniel Mossé, Jun Yang, Youtao Zhang
Computer Science Department, University of Pittsburgh

[Figure: Server power consumption (Watts), broken down into processors and memory, with labels 1,614W and 2,972W (Lefurgy et al., ’03)]

Challenges with DRAM
Power wall
– Large fractions of system power consumed in DRAM
Cost wall
– Memory accounts for a major fraction of overall server cost
Scaling wall
– DRAM scaling becomes harder and harder
Higher speed (bandwidth) means faster clocking
Larger size = increase of loading (on buses) and refresh overheads (power & performance)

New non-volatile memory to the rescue
[Figure: US patents granted for MRAM, FRAM, and PCM (PRAM) (Lam, VLSI-TSA ’08)]
1. Non-volatile
2. Byte-addressable
3. Acceptable performance
4. Good scaling potential
* Subject to write endurance limit

Agenda
Storage class memory architecture
Industry progress
Our vision
Some research questions

Storage class memory architecture
[Diagram: L1 and L2 caches ($$), smart memory controllers, DRAM, PCM-Small, PCM-Large]
DRAM: PCM is slow and write endurance limited; we need DRAM buffering
PCM-Small: This is PCM working memory; a better species (e.g., SLC)?
PCM-Large: This is PCM “storage” space; maybe equivalent to PCM-Small or maybe slower and larger (e.g., MLC)?
Smart mem-ctrl: “Smart mem. controller” to handle diff. technologies; cache mgmt, wear leveling, error handling (ECC, sparing), trim & low-level scheduling
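
The DRAM buffering idea on this slide can be illustrated with a toy model. The sketch below is not the controller from the talk; it assumes a small LRU write-back buffer (the class name `DramBufferedPcm`, the line granularity, and the LRU policy are all hypothetical choices) and simply counts how many writes actually reach PCM.

```python
from collections import OrderedDict


class DramBufferedPcm:
    """Toy model (assumption): an LRU write-back DRAM buffer in front of PCM."""

    def __init__(self, dram_lines):
        self.dram_lines = dram_lines   # DRAM buffer capacity, in lines
        self.dram = OrderedDict()      # addr -> (data, dirty), oldest first
        self.pcm = {}                  # addr -> data, the PCM backing store
        self.pcm_writes = 0            # writes that actually reach PCM

    def _evict_if_full(self):
        # Evict the least recently used line; only dirty lines cost a PCM write.
        if len(self.dram) > self.dram_lines:
            addr, (data, dirty) = self.dram.popitem(last=False)
            if dirty:
                self.pcm[addr] = data
                self.pcm_writes += 1

    def write(self, addr, data):
        # Writes are absorbed by DRAM and marked dirty.
        self.dram[addr] = (data, True)
        self.dram.move_to_end(addr)
        self._evict_if_full()

    def read(self, addr):
        if addr in self.dram:
            self.dram.move_to_end(addr)
            return self.dram[addr][0]
        # Miss: fetch from PCM and keep a clean copy in DRAM.
        data = self.pcm.get(addr)
        self.dram[addr] = (data, False)
        self._evict_if_full()
        return data

    def flush(self):
        # Write every dirty line back to PCM, then empty the buffer.
        for addr, (data, dirty) in self.dram.items():
            if dirty:
                self.pcm[addr] = data
                self.pcm_writes += 1
        self.dram.clear()
```

In this model, repeated writes to a hot line stay in DRAM and cost a single PCM write at eviction or flush time, which is the effect the slide relies on to hide PCM's slow writes and limited endurance.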

Prior work & findings
Memory energy savings
– Sizable savings of 20~90% [Zhou et al., ’09, Park et al., ’11]
– At a manageable performance hit of ~5% or so
Hardware wear leveling feasible [Qureshi et al., ’09, Seong et al., ’10]
Other system implications
– Fast system on and off [Doh et al., ’09]
– Single-level data store [Venkataraman et al., ’11]
– Rapid checkpointing [Dong et al., ’09]
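
To make the "hardware wear leveling" point concrete, here is a minimal sketch of a start-gap style rotation, one well-known low-overhead scheme. Whether this is the exact scheme meant by the citations on this slide is an assumption; the class name `StartGap`, the parameters `n_lines` and `gap_interval`, and the per-slot wear counters are illustrative only.

```python
class StartGap:
    """Sketch (assumption): start-gap style wear leveling over N lines.

    N logical lines live in N + 1 physical slots. One slot (the gap) is
    always empty and is moved by one position every `gap_interval` writes.
    """

    def __init__(self, n_lines, gap_interval):
        self.n = n_lines
        self.psi = gap_interval
        self.start = 0                       # rotation offset
        self.gap = n_lines                   # index of the empty slot
        self.mem = [None] * (n_lines + 1)    # physical slots
        self.wear = [0] * (n_lines + 1)      # writes seen by each slot
        self.writes_since_move = 0

    def _phys(self, la):
        # Logical-to-physical mapping: rotate by start, then skip the gap.
        pa = (la + self.start) % self.n
        if pa >= self.gap:
            pa += 1
        return pa

    def _move_gap(self):
        if self.gap == 0:
            # Wrap around: the last slot moves to slot 0, start advances.
            self.mem[0] = self.mem[self.n]
            self.wear[0] += 1
            self.mem[self.n] = None
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # Shift the neighbor into the gap; the gap moves down by one.
            self.mem[self.gap] = self.mem[self.gap - 1]
            self.wear[self.gap] += 1
            self.mem[self.gap - 1] = None
            self.gap -= 1

    def write(self, la, data):
        pa = self._phys(la)
        self.mem[pa] = data
        self.wear[pa] += 1
        self.writes_since_move += 1
        if self.writes_since_move == self.psi:
            self.writes_since_move = 0
            self._move_gap()

    def read(self, la):
        return self.mem[self._phys(la)]
```

Only two registers (`start` and `gap`) and one spare line are needed, which is why schemes of this kind are considered feasible in hardware: a line that is written repeatedly slowly migrates across all physical slots instead of wearing out one location.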

Industry progress: Samsung
Lee et al. ISSCC ’07; Lee et al. JSSC ’08
– Diode switch design
– 266MB/s read
– 4.64MB/s write (x16)
Chung et al. ISSCC ’11
– LPDDR2-N
– “Write skewing”
– 6.4MB/s write
– “DCWI” (~Flip-N-Write)
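
The slide likens Samsung's “DCWI” to Flip-N-Write. As a rough illustration of the Flip-N-Write idea only (not of Samsung's circuit), the sketch below stores either the new word or its bitwise complement, whichever changes fewer cells, and records the choice in one extra flip bit. The function names and the 16-bit default width are assumptions made for the example.

```python
def flip_n_write(old_stored, old_flip, new_data, width=16):
    """Return (stored_word, flip_bit, cells_changed) for a Flip-N-Write style update.

    old_stored/old_flip are the cell contents before the write; new_data is the
    logical value to store. Only cells whose value changes are counted as written.
    """
    mask = (1 << width) - 1
    new_data &= mask

    # Option A: store the data as-is (flip bit = 0).
    cost_plain = bin((old_stored ^ new_data) & mask).count("1") + int(old_flip != 0)

    # Option B: store the complement (flip bit = 1).
    inverted = ~new_data & mask
    cost_inv = bin((old_stored ^ inverted) & mask).count("1") + int(old_flip != 1)

    if cost_plain <= cost_inv:
        return new_data, 0, cost_plain
    return inverted, 1, cost_inv


def decode(stored, flip, width=16):
    """Recover the logical value from the stored word and its flip bit."""
    mask = (1 << width) - 1
    return (~stored & mask) if flip else (stored & mask)
```

Because the two options always differ in every data cell and in the flip bit, their costs add up to `width + 1`, so the cheaper one never changes more than `(width + 1) // 2` cells. For slow, power-limited PCM writes, halving the worst-case number of programmed cells is what lets more bits be written in parallel.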

Industry progress: Numonyx (Micron)
(Servalli, IEDM ’09)
Early access program (2009)
– “Alverstone” (OMNEO)
– TR switch design
– 40MB/s read (?)
– <1MB/s write (?)
Numerous press releases (slated for MP in 2011)
– “Bonelli”
– 1.8V I/O
(2011~2012?)
– “Imola” and “Mandello”
– 2Gb & 1.2V & 1.8V I/O
– LPDDR2-NVM & DDR3-NVM

Our vision
To drastically reduce the power needed by TB capacities for main memory
Cross-cutting, holistic system design
– With heterogeneous resources, management tasks are best handled by collaboration of layers
– MemVisor

Research questions (infra)
PCM has the potential to beat DRAM in terms of capacity and power…
– But what about performance?
How much performance is “good enough” for key applications?
What cross-layer information is critical for MemVisor?
– What are appropriate interfaces?
Can we predictively allocate different amounts of DRAM and PCM to a virtual machine?
– Hardware and software support?

Research questions (application)
How can we best utilize persistency in memory?
– Extension of storage? How?
– New algorithms and data structures?
PCM provides “storage” that is orders of magnitude faster than HDDs
– Any changes needed in OS? DBMS?
New algorithms that work synergistically with the underlying hardware and system layers for longer lifetime and higher reliability?
