Shimin Chen Big Data Reading Group.  Energy efficiency of: ◦ Single-machine instance of DBMS ◦ Standard server-grade hardware components ◦ A wide spectrum.

Slides:



Advertisements
Similar presentations
Mark Holliman Wide Field Astronomy Unit Institute for Astronomy University of Edinburgh.
Advertisements

Daniel Schall, Volker Höfner, Prof. Dr. Theo Härder TU Kaiserslautern.
Query Processing and Optimizing on SSDs Flash Group Qingling Cao
1 Distributed Systems Meet Economics: Pricing in Cloud Computing Hadi Salimi Distributed Systems Lab, School of Computer Engineering, Iran University of.
Shimin Chen Big Data Reading Group Presented and modified by Randall Parabicoli.
High-Performance Task Distribution for Volunteer Computing Rom Walton
Energy-Efficient Query Processing on Embedded CPU-GPU Architectures Xuntao Cheng, Bingsheng He, Chiew Tong Lau Nanyang Technological University, Singapore.
Chapter 8 Physical Database Design. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Overview of Physical Database.
Distributed Systems Meet Economics: Pricing In The Cloud Authors: Hongyi Wang, Qingfeng Jing, Rishan Chen, Bingsheng He, Zhengping He, Lidong Zhou Presenter:
Virtual Network Servers. What is a Server? 1. A software application that provides a specific one or more services to other computers  Example: Apache.
Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive Applications A. Caulfield, L. Grupp, S. Swanson, UCSD, ASPLOS’09.
Introduction to Database Systems 1 The Storage Hierarchy and Magnetic Disks Storage Technology: Topic 1.
Hystor : Making the Best Use of Solid State Drivers in High Performance Storage Systems Presenter : Dong Chang.
1 A Comparison of Approaches to Large-Scale Data Analysis Pavlo, Paulson, Rasin, Abadi, DeWitt, Madden, Stonebraker, SIGMOD’09 Shimin Chen Big data reading.
Towards Eco-friendly Database Management Systems W. Lang, J. M. Patel (U Wisconsin), CIDR 2009 Shimin Chen Big Data Reading Group.
Analyzing the Energy Efficiency of a Database Server Hanskamal Patel SE 521.
Buying a Laptop. 3 Main Components The 3 main components to consider when buying a laptop or computer are Processor – The Bigger the Ghz the faster the.
Accelerating SQL Database Operations on a GPU with CUDA Peter Bakkum & Kevin Skadron The University of Virginia GPGPU-3 Presentation March 14, 2010.
Report : Zhen Ming Wu 2008 IEEE 9th Grid Computing Conference.
Technology Expectations in an Aeros Environment October 15, 2014.
Comp-TIA Standards.  AMD- (Advanced Micro Devices) An American multinational semiconductor company that develops computer processors and related technologies.
Scaling and Packing on a Chip Multiprocessor Vincent W. Freeh Tyler K. Bletsch Freeman L. Rawson, III Austin Research Laboratory.
Folklore Confirmed: Compiling for Speed = Compiling for Energy Tomofumi Yuki INRIA, Rennes Sanjay Rajopadhye Colorado State University 1.
COLLABORATIVE EXECUTION ENVIRONMENT FOR HETEROGENEOUS PARALLEL SYSTEMS Aleksandar Ili´c, Leonel Sousa 2010 IEEE International Symposium on Parallel & Distributed.
Exploiting Flash for Energy Efficient Disk Arrays Shimin Chen (Intel Labs) Panos K. Chrysanthis (University of Pittsburgh) Alexandros Labrinidis (University.
© Stavros Harizopoulos 2006 Performance Tradeoffs in Read-Optimized Databases Stavros Harizopoulos MIT CSAIL joint work with: Velen Liang, Daniel Abadi,
Introduction to Database Systems 1 Storing Data: Disks and Files Chapter 3 “Yea, from the table of my memory I’ll wipe away all trivial fond records.”
Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads Iraklis Psaroudakis (EPFL), Tobias Scheuer (SAP AG), Norman May.
Parallelizing Security Checks on Commodity Hardware E.B. Nightingale, D. Peek, P.M. Chen and J. Flinn U Michigan.
CERN - IT Department CH-1211 Genève 23 Switzerland t Tier0 database extensions and multi-core/64 bit studies Maria Girone, CERN IT-PSS LCG.
Performance Tradeoffs in Read-Optimized Databases Stavros Harizopoulos * MIT CSAIL joint work with: Velen Liang, Daniel Abadi, and Sam Madden massachusetts.
© Stavros Harizopoulos 2006 Performance Tradeoffs in Read- Optimized Databases: from a Data Layout Perspective Stavros Harizopoulos MIT CSAIL Modified.
Eneryg Efficiency for MapReduce Workloads: An Indepth Study Boliang Feng Renmin University of China Dec 19.
CS Hardware Tuning Xiaofang Zhou School of Computing, NUS Office: S URL:
Achieving Scalability, Performance and Availability on Linux with Oracle 9iR2-RAC Grant McAlister Senior Database Engineer Amazon.com Paper
+ CS 325: CS Hardware and Software Organization and Architecture Memory Organization.
Towards Dynamic Green-Sizing for Database Servers Mustafa Korkmaz, Alexey Karyakin, Martin Karsten, Kenneth Salem University of Waterloo.
Hyper Threading Technology. Introduction Hyper-threading is a technology developed by Intel Corporation for it’s Xeon processors with a 533 MHz system.
CS338Parallel and Distributed Databases11-1 Parallel and Distributed Databases Lecture Topics Multi-CPU and distributed systems Monolithic system Client–server.
CSU - DCE Webmaster I Scaling Issues - Fort Collins, CO Copyright © XTR Systems, LLC Web Site Scaling Issues (or Size Really Does Matter) Instructor:
GEM: A Framework for Developing Shared- Memory Parallel GEnomic Applications on Memory Constrained Architectures Mucahid Kutlu Gagan Agrawal Department.
Computational Research in the Battelle Center for Mathmatical medicine.
Chapter 8 Physical Database Design. Outline Overview of Physical Database Design Inputs of Physical Database Design File Structures Query Optimization.
The Sort Benchmark AlgorithmsSolid State Disks External Memory Multiway Mergesort  Phase 1: Run Formation  Phase 2: Merge Runs  Careful parameter selection.
An Integrated GPU Power and Performance Model (ISCA’10, June 19–23, 2010, Saint-Malo, France. International Symposium on Computer Architecture)
Dynamic Control of Coding for Progressive Packet Arrivals in DTNs.
PROOF Benchmark on Different Hardware Configurations 1 11/29/2007 Neng Xu, University of Wisconsin-Madison Mengmeng Chen, Annabelle Leung, Bruce Mellado,
Unit - 4 Introduction to the Other Databases.  Introduction :-  Today single CPU based architecture is not capable enough for the modern database.
Course 03 Basic Concepts assist. eng. Jánó Rajmond, PhD
Analyzing Memory Access Intensity in Parallel Programs on Multicore Lixia Liu, Zhiyuan Li, Ahmed Sameh Department of Computer Science, Purdue University,
Efficient Data Management Tools for the Heterogeneous Big Data Warehouse Autors: Aleksandr Alekseev (Programmer), Victoria Osipova (Associate professor),
The Sort Benchmark AlgorithmsSolid State Disks External Memory Multiway Mergesort  Phase 1: Run Formation  Phase 2: Merge Runs  Careful parameter selection.
Cluster Status & Plans —— Gang Qin
Seth Pugsley, Jeffrey Jestes,
Hardware specifications
Solid State Disks Testing with PROOF
Resource Aware Scheduler – Initial Results
Oracle SQL*Loader
Frequency Governors for Cloud Database OLTP Workloads
تدريس يار: ميثم نظرياني
Oracle Storage Performance Studies
Chapter 1: Intro (excerpt)
Akshay Tomar Prateek Singh Lohchubh
Parallel Analytic Systems
Declarative Transfer Learning from Deep CNNs at Scale
SAP HANA Cost-optimized Hardware for Non-Production
SAP HANA Cost-optimized Hardware for Non-Production
GridTorrent Framework: A High-performance Data Transfer and Data Sharing Framework for Scientific Computing.
A very basic introduction
Presentation transcript:

Shimin Chen Big Data Reading Group

 Energy efficiency of: ◦ Single-machine instance of DBMS ◦ Standard server-grade hardware components ◦ A wide spectrum of database tasks  Assess and explore ways to improve energy efficiency

 HP xw8600 workstation  64-bit Fedora 4 Linux (kernel )  Two Intel Xeon E GHz quad core CPUs (32K L1, 6MB L2)  16GB RAM  4 HDDs (Seagate Savvio 10K.3)  4 SSDs (Intel X-25E)

 Total system power: ◦ power meter  Individual components: ◦ clamp meter to measure 5V and 12V lines from the power supply

 Configure 4 disks (SSDs) as RAID-0. Read a 100GB file sequentially, varying disk utilization by increasing CPU computation overhead

 Consumes 85% of dynamic power  Use four micro-benchmarks to study CPU power  Two scheduling policies:  Freq adjusted by OS

Big jump when a CPU becomes active Hash join and row scan consumes more power

see higher memory bus utilization

CPU power is not a linear function of the number of cores used For a fixed configuration, different operators may differ significantly (60% in the experiments)

 Energy efficiency vs. performance for a large number of DB configurations  DB: algorithm kernels, PostgreSQL, commercial System-X  Knobs: ◦ Execution plan selection ◦ Intra-operator parallelism (# of cores for a single operator) ◦ Inter-query parallelism (# of independent queries in parallel) ◦ Physical layout (row vs. column) ◦ Storage layout (striping) ◦ Choice of storage medium (HDD vs. SDD)

 Dynamic power range among the points is small, 165W + 19%

 Again: 169W+14%  Therefore the linear relationship

 Linear relationship with less than 10% variance

 For this current server, the best performing DB execution plan is also good enough for energy efficiency

 More variance as idle power is reduced

 Power capping leads to more interesting configurations

 This paper studies a DB server representing the current hardware  Results show that performance and energy efficiency are highly co-related.  As server hardware becomes more energy efficient, idle power may reduce, leading to more variance  Power capping also provides interesting research challenges