Shimin Chen Big Data Reading Group Presented and modified by Randall Parabicoli.


• Assess and explore ways to improve energy efficiency
• Energy efficiency of:
  ◦ Single-machine instance of a DBMS
  ◦ Standard server-grade hardware components
  ◦ A wide spectrum of database tasks

• HP xw8600 workstation
• 64-bit Fedora 4 Linux (kernel )
• Two Intel Xeon E GHz quad core CPUs (32K L1, 6MB L2)
• 16GB RAM
• 4 HDDs (Seagate Savvio 10K.3)
• 4 SSDs (Intel X-25E)

• Total system power:
  ◦ Power meter
• Individual components:
  ◦ SSDs, HDDs, and CPUs
  ◦ Clamp meter to measure the current on the 5V and 12V lines from the power supply
• Multiplying the current by the line voltage (5V / 12V) gives the power measurement.
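The per-component measurement above is simply P = I × V on each supply line. A minimal sketch of that arithmetic follows. The function names and the sample currents are hypothetical, and summing the 5V and 12V lines for one component is an assumption about how the readings are combined, not something the slides state.

```python
# Sketch of the clamp-meter arithmetic: power (W) = current (A) * line voltage (V).
# All names and sample readings below are hypothetical, chosen only for illustration.

def line_power(current_amps: float, line_volts: float) -> float:
    """Power in watts drawn on a single supply line."""
    return current_amps * line_volts


def component_power(current_5v: float, current_12v: float) -> float:
    """Power of one component, assuming it draws from both the 5V and 12V lines."""
    return line_power(current_5v, 5.0) + line_power(current_12v, 12.0)


if __name__ == "__main__":
    # Hypothetical readings: 0.4 A on the 5V line, 0.5 A on the 12V line.
    watts = component_power(current_5v=0.4, current_12v=0.5)
    print(f"component power: {watts:.1f} W")
```

A component fed by only one line can be handled by passing 0.0 for the other current.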

• Configure 4 disks (SSDs) as RAID-0.
• Read a 100GB file sequentially, varying disk utilization by increasing CPU computation overhead.

• The CPU consumes 85% of dynamic power
• Use four micro-benchmarks to study CPU power
  ◦ Hashjoin, Sort, RowScan, ComprColScan
• Two scheduling policies:
  ◦ Performance Oriented vs Energy-Saving
• Each core fully utilized
• Frequency adjusted by the OS

Big jump in power when a CPU becomes active.
Hash join and row scan consume more power.

These operators put more stress on the memory subsystem, which leads to higher power consumption.

CPU power is not a linear function of the number of cores used.
For a fixed configuration, different operators may differ significantly (60% in the experiments) in power consumption.

• Energy efficiency vs. performance for a large number of DB configurations
• DB: algorithm kernels, PostgreSQL, commercial System-X
• Knobs:
  ◦ Execution plan selection (algorithms)
  ◦ Intra-operator parallelism (# of cores for a single operator)
  ◦ Inter-query parallelism (# of independent queries in parallel)
  ◦ Physical layout (row vs. column scans)
  ◦ Storage layout (striping)
  ◦ Choice of storage medium (HDD vs. SSD)
  ◦ Scheduling policies and frequency settings (from before)

• Energy efficiency
  ◦ Tuples / Joule
  ◦ Amount of work that can be done per unit of energy
• Performance
  ◦ 1 / Time
  ◦ More performance = less time spent
Energy vs. Performance
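The two metrics above can be computed directly from a run's tuple count, average power, and elapsed time. The sketch below uses the standard relation energy = average power × time. The function names and the sample numbers are hypothetical and are not taken from the experiments.

```python
# Sketch of the two metrics: energy efficiency (tuples per Joule) and
# performance (1 / time). Sample numbers are hypothetical.

def energy_joules(avg_power_watts: float, seconds: float) -> float:
    """Energy consumed by a run: average power (W) times elapsed time (s)."""
    return avg_power_watts * seconds


def energy_efficiency(tuples: int, avg_power_watts: float, seconds: float) -> float:
    """Tuples processed per Joule of energy."""
    return tuples / energy_joules(avg_power_watts, seconds)


def performance(seconds: float) -> float:
    """Performance as the inverse of elapsed time (1/s)."""
    return 1.0 / seconds


if __name__ == "__main__":
    # Hypothetical run: 1,000,000 tuples in 50 s at an average of 200 W.
    print(energy_efficiency(1_000_000, 200.0, 50.0), "tuples/J")
    print(performance(50.0), "1/s")
```

Halving the run time at the same average power doubles both metrics, which is why the two move together when power barely changes.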

• Dynamic power range among the points is small, 165W + 19%
  ◦ Power remains relatively constant
  ◦ Energy efficiency varies directly with performance.
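A short derivation shows why near-constant power implies the linear relationship. The symbols are introduced here for illustration and are not from the slides: $N$ is the number of tuples processed, $\bar{P}$ the average power, $T$ the elapsed time, and $E$ the energy consumed.

```latex
\[
\text{Energy efficiency} \;=\; \frac{N}{E} \;=\; \frac{N}{\bar{P}\,T}
\;=\; \frac{N}{\bar{P}} \cdot \frac{1}{T}
\;=\; \frac{N}{\bar{P}} \cdot \text{Performance}
\]
```

For a fixed workload ($N$ constant) and a power draw that stays within a narrow band ($\bar{P}$ roughly constant), the factor $N/\bar{P}$ is roughly constant, so energy efficiency is approximately proportional to performance.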

• Again: 169W + 14%
• Hence the linear relationship

• Linear relationship with less than 10% variance

• For the current server, the best-performing DB execution plan is also good enough for energy efficiency
  ◦ Regardless of query complexity and knobs

• More variance as idle power is reduced

• Power capping leads to more interesting configurations

• Key contributions
  ◦ Study of the power-performance of core database operators.
    ▪ Using modern scale-out (shared-nothing) hardware.
  ◦ Analysis of the effects of hardware/software knobs on the energy efficiency of complex queries.
    ▪ PostgreSQL
    ▪ System-X
  ◦ The highest-performing configuration is the most energy-efficient.
    ▪ Contrary to previous studies' suggestions.
    ▪ Suggests that performance and energy efficiency are highly correlated.

• As server hardware becomes more energy efficient, idle power may drop, leading to more variance

• Shared-nothing energy efficiency
  ◦ Resource consolidation across underutilized nodes.
  ◦ Saves power without sacrificing performance.
• Alternative energy-efficient hardware
  ◦ Lower fixed-power costs.
• Software mechanisms to cap power consumption while maximizing performance.

• How could OLTP (Online Transaction Processing) applications improve energy efficiency?
• Why do RowScan and HashJoin show higher memory bus utilization and CPU power consumption than ComprColScan and Sort?