Scaling the Memory Power Wall with DRAM-Aware Data Management

Slides:



Advertisements
Similar presentations
From A to E: Analyzing TPCs OLTP Benchmarks Pınar Tözün Ippokratis Pandis* Cansu Kaynak Djordje Jevdjic Anastasia Ailamaki École Polytechnique Fédérale.
Advertisements

1 Copyright © 2012 Oracle and/or its affiliates. All rights reserved. Convergence of HPC, Databases, and Analytics Tirthankar Lahiri Senior Director, Oracle.
To Share or Not to Share? Ryan Johnson Nikos Hardavellas, Ippokratis Pandis, Naju Mancheril, Stavros Harizopoulos**, Kivanc Sabirli, Anastasia Ailamaki,
Memory Modules Overview Spring, 2004 Bill Gervasi Senior Technologist, Netlist Chairman, JEDEC Small Modules & DRAM Packaging Committees.
1 Database Servers on Chip Multiprocessors: Limitations and Opportunities Nikos Hardavellas With Ippokratis Pandis, Ryan Johnson, Naju Mancheril, Anastassia.
DBMSs on a Modern Processor: Where Does Time Go? Anastassia Ailamaki Joint work with David DeWitt, Mark Hill, and David Wood at the University of Wisconsin-Madison.
Matching Memory Access Patterns and Data Placement for NUMA Systems Zoltán Majó Thomas R. Gross Computer Science Department ETH Zurich, Switzerland.
Rethinking DRAM Design and Organization for Energy-Constrained Multi-Cores Aniruddha N. Udipi, Naveen Muralimanohar*, Niladrish Chatterjee, Rajeev Balasubramonian,
An evaluation of the Intel Xeon E5 Processor Series Zurich Launch Event 8 March 2012 Sverre Jarp, CERN openlab CTO Technical team: A.Lazzaro, J.Leduc,
1 MemScale: Active Low-Power Modes for Main Memory Qingyuan Deng, David Meisner*, Luiz Ramos, Thomas F. Wenisch*, and Ricardo Bianchini Rutgers University.
CSE 691: Energy-Efficient Computing Lecture 4 SCALING: stateless vs. stateful Anshul Gandhi 1307, CS building
Analysis of Database Workloads on Modern Processors Advisor: Prof. Shan Wang P.h.D student: Dawei Liu Key Laboratory of Data Engineering and Knowledge.
The Energy Case for Graph Processing on Hybrid Platforms Abdullah Gharaibeh, Lauro Beltrão Costa, Elizeu Santos-Neto and Matei Ripeanu NetSysLab The University.
Memory System Characterization of Big Data Workloads
What will my performance be? Resource Advisor for DB admins Dushyanth Narayanan, Paul Barham Microsoft Research, Cambridge Eno Thereska, Anastassia Ailamaki.
Shimin Chen Big Data Reading Group.  Energy efficiency of: ◦ Single-machine instance of DBMS ◦ Standard server-grade hardware components ◦ A wide spectrum.
Energy-efficient Cluster Computing with FAWN: Workloads and Implications Vijay Vasudevan, David Andersen, Michael Kaminsky*, Lawrence Tan, Jason Franklin,
DB system design for new hardware and sciences Anastasia Ailamaki École Polytechnique Fédérale de Lausanne and Carnegie Mellon University.
1 The Problem of Power Consumption in Servers L. Minas and B. Ellison Intel-Lab In Dr. Dobb’s Journal, May 2009 Prepared and presented by Yan Cai Fall.
Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive Applications A. Caulfield, L. Grupp, S. Swanson, UCSD, ASPLOS’09.
Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU Presented by: Ahmad Lashgar ECE Department, University of Tehran.
Capacity Planning in SharePoint Capacity Planning Process of evaluating a technology … Deciding … Hardware … Variety of Ways Different Services.
Authors: Tong Li, Dan Baumberger, David A. Koufaty, and Scott Hahn [Systems Technology Lab, Intel Corporation] Source: 2007 ACM/IEEE conference on Supercomputing.
AUTHORS: STIJN POLFLIET ET. AL. BY: ALI NIKRAVESH Studying Hardware and Software Trade-Offs for a Real-Life Web 2.0 Workload.
Performance Issues in Parallelizing Data-Intensive applications on a Multi-core Cluster Vignesh Ravi and Gagan Agrawal
CSE 691: Energy-Efficient Computing Lecture 6 SHARING: distributed vs. local Anshul Gandhi 1307, CS building
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.
[Tim Shattuck, 2006][1] Performance / Watt: The New Server Focus Improving Performance / Watt For Modern Processors Tim Shattuck April 19, 2006 From the.
Eneryg Efficiency for MapReduce Workloads: An Indepth Study Boliang Feng Renmin University of China Dec 19.
Data Replication and Power Consumption in Data Grids Susan V. Vrbsky, Ming Lei, Karl Smith and Jeff Byrd Department of Computer Science The University.
Towards Dynamic Green-Sizing for Database Servers Mustafa Korkmaz, Alexey Karyakin, Martin Karsten, Kenneth Salem University of Waterloo.
1 CS : Technology Trends Ion Stoica and Ali Ghodsi ( August 31, 2015.
Scaling up analytical queries with column-stores Ioannis Alagiannis Manos Athanassoulis Anastasia Ailamaki École Polytechnique Fédérale de Lausanne.
BEAR: Mitigating Bandwidth Bloat in Gigascale DRAM caches
THOMSON REUTERS Infrastructure Growth Mitigation x86 Server Virtualization, Storage Optimization CHRISTOPHER CROWHURST VP Strategic Technology Professional.
Performance and Energy Efficiency Evaluation of Big Data Systems Presented by Yingjie Shi Institute of Computing Technology, CAS
1 Adaptive Parallelism for Web Search Myeongjae Jeon Rice University In collaboration with Yuxiong He (MSR), Sameh Elnikety (MSR), Alan L. Cox (Rice),
E-MOS: Efficient Energy Management Policies in Operating Systems
© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice ProLiant G5 to G6 Processor Positioning.
Multi-mode Energy Management for Multi-tier Server Clusters Tibor Horvath Kevin Skadron University of Virginia PACT 2008.
Intel and AMD processors
Lecture 2: Performance Today’s topics:
Reducing OLTP Instruction Misses with Thread Migration
Anshul Gandhi 347, CS building
Seth Pugsley, Jeffrey Jestes,
Green cloud computing 2 Cs 595 Lecture 15.
Windows Server* 2016 & Intel® Technologies
HPE Persistent Memory Microsoft Ignite 2017
Morgan Kaufmann Publishers
Chapter 6 Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism Topic 13 Using Energy Efficiently Inside the Server Prof. Zhang.
Frequency Governors for Cloud Database OLTP Workloads
BitWarp Energy Efficient Analytic Data Processing on Next Generation General Purpose GPUs Jason Power || Yinan Li || Mark D. Hill || Jignesh M. Patel.
CS : Technology Trends August 31, 2015 Ion Stoica and Ali Ghodsi (
Scalable High Performance Main Memory System Using PCM Technology
Repairing Write Performance on Flash Devices
DIMMer: A case for turning off DIMMs in clouds
Row Buffer Locality Aware Caching Policies for Hybrid Memories
Lecture 2: Performance Today’s topics: Technology wrap-up
Haishan Zhu, Mattan Erez
KISS-Tree: Smart Latch-Free In-Memory Indexing on Modern Architectures
Support for ”interactive batch”
Database Servers on Chip Multiprocessors: Limitations and Opportunities Nikos Hardavellas With Ippokratis Pandis, Ryan Johnson, Naju Mancheril, Anastassia.
Aniruddha N. Udipi, Naveen Muralimanohar*, Niladrish Chatterjee,
CANDY: Enabling Coherent DRAM Caches for Multi-node Systems
The University of Adelaide, School of Computer Science
Chih-Hsun Chou Daniel Wong Laxmi N. Bhuyan
The University of Adelaide, School of Computer Science
Rajeev Balasubramonian
Run time performance for all benchmarked software.
Presentation transcript:

Scaling the Memory Power Wall with DRAM-Aware Data Management Raja Appuswamy Matthaios Olma Anastasia Ailamaki Ecole Polytechnique Fédérale de Lausanne (EPFL)

Power-Hungry MMDBs? Energy efficiency - The New Holy Grail Year End-use Energy B kwh Elec. Bills ($B) Power plants (500MW) CO2 (US) Million MT 2013 91 $9.0 37 97 2020 139 $13.7 51 147 2013-2020 incr. 47 $4.7 17 50 Energy efficiency - The New Holy Grail What is the energy behavior of MMDBs?

DRAM will emerge as a major contributor Energy in MMDB *HP Power advisor, six-core Intel Xeon E7-4809v2 4-socket - 24 cores, 96 DIMM slots 2.5x loaded CPU Loaded (Watt) 1.25x idle 16x capacity 3x power 4x capacity 10x power DRAM will emerge as a major contributor

Server Memory Trends Capacity Limitation Processor: Intel Xeon E5 2 sockets, 4 channels/socket Memory Type Max Capacity Latency (nsec) Bandwidth (GB/s) Loaded W/GB Idle W/GB UDIMM 128 153.7 72 0.2 0.02 LRDIMM 768 235 40.4 0.15 0.07 HCDIMM 161.9 63.9 0.74 0.37 Capacity Limitation Capacity – Performance Tradeoff Capacity – Power Tradeoff Big data => Big memory => Big power bills MMDBs should focus on reducing DRAM power

Reducing DRAM Power Draw Hardware features Frequency scaling Power-down modes Limitations Enabled/disabled statically at boot time Power-down state transition controlled by MC What is the effectiveness under MMDB workloads? What is the impact on MMDB performance?

Experimental Setup Hardware Micro-benchmarks 2 GHz Intel Xeon E5-2640 v2 CPU (2 socket, 8 cores/socket) 256 GB of memory (16x 16 GB RDIMMs) Micro-benchmarks Concurrent Scans 128 MB of int64s Parallel Aggregation (a = ∑(b(i)+c(i))) over 8-GB double arrays 1-8 threads affinitized to a single socket to avoid NUMA Macro-benchmarks (in paper) TPC-C, TPC-H

Power Breakdown (Scans) DRAM contribution is high even at 100% CPU Even with 256 GB RDIMMs, DRAM contributes 30%

Impact of Frequency Scaling 8 threads Most energy-efficient is not always the fastest

Memory Bandwidth Utilization (aggregation) Bandwidth-intensive workloads suffer

Impact of Power-Down Modes (aggregation) No impact on performance or power 13% reduction in power

Exploiting DRAM Idleness >40% residency, yet negligible power savings Memory Controller is conservative MC good at predicting idleness

Can We Do Better? DRAM-aware memory layout DB-driven gear shifting Separate hot/cold data to enable longer idle time DB-driven gear shifting Pairing data and power modes Multitier memory and storage tiering. Rethinking anti-caching, 5-minute rule for two-level DRAM Tiering-aware query optimization Making optimizer aware of access latencies

“It’s the memory, stupid!” Conclusion Perfect storm MMDBs need more DRAM to meet application demands DRAM technology poses power-performance tradeoffs Promise Frequency scaling/low-power modes can limit power draw Hurdles Need software-driven DRAM DVFS and state transitions Need to redesigns MMDBs to exploit hardware features “It’s the memory, stupid!” - Richard L. Sites