Energy Efficient Prefetching with Buffer Disks for Cluster File Systems
Adam Manzanares and Xiao Qin
Department of Computer Science and Software Engineering, Auburn University

Motivation
- Using the 2010 Historical Trends Scenario, servers and data centers consume 110 billion kWh per year
- Assume the average commercial end user is charged 9.46 cents per kWh
- Disk systems can account for 27% of the energy cost of data centers
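A rough back-of-the-envelope reading of these numbers (assuming the 9.46 figure is the electricity price in cents per kWh):

$$110 \times 10^{9}\ \text{kWh/yr} \times \$0.0946/\text{kWh} \approx \$10.4\ \text{billion/yr}, \qquad 0.27 \times \$10.4\ \text{billion} \approx \$2.8\ \text{billion/yr attributable to disk systems}.$$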

Buffer Disk Architecture
[Diagram] Disk requests pass through a buffer disk controller that sits in front of a RAM buffer, m buffer disks, and n data disks. The controller hosts the data partitioning, security model, load balancing, power management, prefetching, and energy-related reliability model components.
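A minimal sketch of how such a controller might route reads is shown below. The class names, placement rule, and caching policy are illustrative assumptions for this sketch, not the authors' implementation; the point is that a prefetch hit on the always-on buffer disk lets the data disks stay in standby.

```python
# Minimal, illustrative sketch of a buffer-disk controller (not the authors' code).
# One always-on buffer disk absorbs read hits so that data disks can stay in standby.

class Disk:
    def __init__(self, name):
        self.name = name
        self.standby = True
        self.files = {}              # file_id -> data

    def spin_up_if_needed(self):
        if self.standby:
            self.standby = False     # a power-state transition we would like to avoid

class BufferDiskController:
    def __init__(self, buffer_disk, data_disks):
        self.buffer_disk = buffer_disk   # always spinning
        self.data_disks = data_disks     # may be in standby

    def read(self, file_id):
        if file_id in self.buffer_disk.files:                     # prefetch hit: no data disk wakes up
            return self.buffer_disk.files[file_id]
        disk = self.data_disks[file_id % len(self.data_disks)]    # hypothetical placement rule
        disk.spin_up_if_needed()                                  # a miss forces a power-state transition
        data = disk.files.get(file_id)
        self.buffer_disk.files[file_id] = data                    # keep a copy for future hits
        return data
```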

Energy Saving Principles
- Principle One: increase the length and number of idle periods larger than the disk break-even time T_BE
- Principle Two: reduce the number of power-state transitions
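A hedged sketch of how the two principles translate into a spin-down policy: a disk is put into standby only when the predicted idle period exceeds the break-even time T_BE, which both exploits long idle periods and avoids wasteful transitions. The power and energy values below are assumptions, not measured figures.

```python
# Illustrative spin-down decision built around the break-even time T_BE.
# All power and energy values below are assumptions, not measured figures.

P_ACTIVE = 13.0       # W, power while spinning (assumed)
P_STANDBY = 2.5       # W, power in standby (assumed)
E_TRANSITION = 135.0  # J, energy cost of spinning down and back up (assumed)

# Break-even time: the shortest idle period for which standby saves energy.
T_BE = E_TRANSITION / (P_ACTIVE - P_STANDBY)   # ~12.9 s with the values above

def should_spin_down(predicted_idle_seconds: float) -> bool:
    """Principle One: only exploit idle periods longer than T_BE.
    Principle Two: skipping shorter periods avoids wasteful power-state transitions."""
    return predicted_idle_seconds > T_BE
```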

Parameters Tested
  Parameter            Values
  Data Size            1, 5, 10, 25 MB
  # of Data Disks      4, 8, 12
  Inter-arrival Delay  0, 0.1, 0.5, 1 s
  Hit Rate             85, 90, 95, 100%

Energy Savings (Hit Rate 85%)

State Transitions

Why a Cluster File System?
- Block-level prefetching is difficult
- A natural place to track file accesses
- Controls placement of data among storage nodes and data disks
- A tiered approach simplifies management of files and disk states
- Eliminates some shortcomings of modeling and simulation

Energy Efficient Virtual File System (EEVFS)

EEVFS Process Flow

EEVFS Testbed
  Parameter             Storage Server   Storage Node Type 1   Storage Node Type 2
  CPU                   P4 2.0 GHz       P4 3.2 GHz            P4 2.4 GHz
  Memory (MB)
  Network Interconnect
  Disk Type             SATA             ATA/133
  Disk Capacity         120 GB           80 GB
  Disk Bandwidth        100 MB/s         58 MB/s               34 MB/s

Energy Savings

State Transitions

Response Times

Berkeley Web Trace

EEVFS Summary
- Knowledge of upcoming requests is assumed and may be hard to come by
- Performance is tied to one of the buffer disks

Parallel Striping Groups
[Diagram] Group 1: Storage Node 1 (Disks 1, 2) and Storage Node 2 (Disks 3, 4), each fronted by a buffer disk. Group 2: Storage Node 3 (Disks 5, 6) and Storage Node 4 (Disks 7, 8), each fronted by a buffer disk. Files 1 through 4 are distributed across the two groups.

Striping Within a Group
[Diagram] The blocks of File 1 and File 2 are striped across Disks 1 to 4 of Group 1 (Storage Nodes 1 and 2), with a buffer disk in front of each storage node.

Striping Within a Group
- The number of disks in a group can be matched to the nearest bottleneck
- Striping within the group maintains relatively high performance
- Allows a buffer disk per storage node while still maintaining the file striping level (see the sketch below)
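A minimal sketch of the block placement this implies, assuming round-robin assignment of files to groups and round-robin striping of a file's blocks across its group's data disks; the disk counts match the 2-node, 2-disk-per-node groups pictured above and the rules are assumptions for illustration only.

```python
# Illustrative block placement for parallel striping groups (assumed layout rules).

DISKS_PER_NODE = 2
NODES_PER_GROUP = 2
DISKS_PER_GROUP = DISKS_PER_NODE * NODES_PER_GROUP

def place_block(file_id: int, block_index: int, num_groups: int):
    """Return (group, node_in_group, disk_in_node) for one block of a file.
    Whole files stay inside a single group, so only that group's buffer disk
    and data disks are involved; blocks are striped across the group's disks."""
    group = file_id % num_groups                    # assign each file to one group
    disk_in_group = block_index % DISKS_PER_GROUP   # stripe blocks round-robin within the group
    node_in_group = disk_in_group // DISKS_PER_NODE
    disk_in_node = disk_in_group % DISKS_PER_NODE
    return group, node_in_group, disk_in_node

# Example: block 5 of file 1 with two groups lands in group 1, node 0, disk 1.
assert place_block(file_id=1, block_index=5, num_groups=2) == (1, 0, 1)
```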

Testbed
  Parameter             Storage Server    Storage Node
  CPU                   Celeron 2.2 GHz   Celeron 2.2 GHz
  Memory (MB)           2000              2000
  Network Interconnect  1000 Mbps         1000 Mbps
  Disk Type             SATA              SATA
  Disk Capacity         160 GB            480 GB
  Disk Bandwidth        126 MB/s          126 MB/s

Measured Results

Measured Results

Berkeley Web Trace

Response Time Comparison
- Energy efficiency is slightly improved
- The response time gain is significant

  Parameter               Striping    No Striping
  Energy Consumption (J)  2,088,113   2,100,243
  Response Time (s)
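From the table, the relative energy saving of striping works out to roughly 0.6 percent, which is consistent with the "slightly improved" claim:

$$\frac{2{,}100{,}243 - 2{,}088{,}113}{2{,}100{,}243} \approx 0.58\%$$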

Summary
- Improves the energy efficiency and performance of a storage system
- Designed to scale
  – Needs to be tested on a large-scale storage system

Future Work
- Improve the EEVFS prototype for production use
- Run EEVFS on a large-scale storage system
  – Investigate scaling effects

Questions