Presentation transcript:

FAWN: A Fast Array of Wimpy Nodes*
Bogdan Eremia, SCPD
*by David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan

Energy in computing
• Power is a significant burden on computing
• 3-year TCO soon to be dominated by power
(Image: hydroelectric dam)

“Google’s power consumption ... would incur an annual electricity bill of nearly $38 million” [Qureshi:sigcomm09]
“Energy consumption by ... data centers could nearly double ... (by 2011) to more than 100 billion kWh, representing a $7.4 billion annual electricity cost” [EPA Report 2007]
Annual cost of energy for Google, Amazon, Microsoft = annual cost of all first-year CS PhD students
Monday, October 12, 2009

Can we reduce energy use by a factor of ten?
• Still serve the same workloads
• Avoid increasing capital cost

FAWN: Fast Array of Wimpy Nodes
Improve the computational efficiency of data-intensive computing using an array of well-balanced low-power systems.
• Traditional server: CPU, memory, disk - 220W
• FAWN node: AMD Geode CPU, 256MB DRAM, 4GB CompactFlash - 40W

Goal: reduce peak power
(Diagram: traditional datacenter with 1000W servers, losing ~20% of energy to power distribution and cooling, versus a FAWN datacenter with <100W servers under the same ~20% overhead.)

Overview
• Background
• FAWN Principles
• FAWN-KV Design
• Evaluation
• Conclusion

Towards balanced systems
(Chart: latency in nanoseconds, log scale, 1980-2005 - CPU cycle time has shrunk far faster than DRAM access and disk seek times, so ever more CPU resources are wasted waiting on storage.)
Rebalancing options: pair today's CPUs with an array of the fastest storage, or pair slower CPUs with today's disks.

Targeting the sweet spot in efficiency
(Chart: instructions/sec/W vs. instructions/sec, in millions, for a custom ARM mote, XScale 800MHz, Atom Z500, and Xeon 7350; includes a 0.1W power overhead.)
• Fastest processors exhibit superlinear power usage
• Fixed power costs can dominate efficiency for slow processors
• FAWN targets the sweet spot in system efficiency when including fixed costs

Targeting the sweet spot in efficiency
(Same chart, annotated with the rebalancing options: FAWN pairs slow CPUs with an array of fast storage, landing in the most efficient region.)

Overview
• Background
• FAWN Principles
• FAWN-KV Design
  • Architecture
  • Constraints
• Evaluation
• Conclusion

Data-intensive key-value storage
• Critical infrastructure service
• Service-level agreements for performance/latency
• Random-access, read-mostly, hard to cache

FAWN-KV: Our Key-Value Proposition
• Energy-efficient cluster key-value store
• Goal: improve Queries/Joule
• Prototype: Alix3c2 nodes with flash storage
  • 500MHz CPU, 256MB DRAM, 4GB CompactFlash

FAWN-KV: Our Key-Value Proposition
Unique challenges:
• Efficient and fast failover
• Wimpy CPUs, limited DRAM
• Flash poor at small random writes

FAWN-KV Architecture
(Diagram: a front-end routing to back-ends arranged on a KV ring.)
• Front-end acts as gateway: routes requests, manages back-ends
• Back-ends placed on the ring by consistent hashing, each running FAWN-DS

FAWN-KV Architecture
• FAWN-DS: limited resources, so avoid random writes
• FAWN-KV: efficient failover, so avoid random writes

Log-structured Datastore
• Log-structuring avoids small random writes
• Get: one random read via the hash index; Put and Delete: sequential appends to the log
✔ Satisfies the FAWN-DS constraint: limited resources, random writes avoided
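The slide's design can be sketched in a few lines. This is our own minimal illustration of a log-structured store (not FAWN-DS's actual code): an append-only log plus an in-memory hash index, so Put and Delete are sequential appends and Get is one index lookup plus one read.

```python
# Minimal log-structured store sketch (our illustration, not FAWN-DS itself).
class LogStructuredStore:
    def __init__(self):
        self.log = []      # append-only list of (key, value) entries
        self.index = {}    # key -> offset of the key's latest entry

    def put(self, key, value):
        self.index[key] = len(self.log)   # point the index at the new entry
        self.log.append((key, value))     # sequential append, no random write

    def delete(self, key):
        self.log.append((key, None))      # append a tombstone, also sequential
        self.index.pop(key, None)

    def get(self, key):
        off = self.index.get(key)
        if off is None:
            return None
        return self.log[off][1]           # one random *read* of the log

store = LogStructuredStore()
store.put("k1", "v1")
store.put("k1", "v2")   # an overwrite is just another append
```

Note that overwrites and deletes leave stale entries behind; that is why the later slides need background compaction.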

On a node addition
(Diagram: consistent-hashing ring with nodes A-H; the hash index maps the values in range (H,B] to the joining node.)
Node additions and failures require transfer of key ranges.
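The ring behavior above can be demonstrated with a toy consistent-hashing sketch (our illustration, with hypothetical node names): nodes and keys hash onto one ring, each key belongs to the first node clockwise from it, and adding a node therefore moves only the single key range it claims.

```python
# Toy consistent-hashing ring (illustrative only; names are hypothetical).
import hashlib
from bisect import bisect, insort

def h(s):
    # Map a string to a point on the ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.points = sorted(h(n) for n in nodes)
        self.owner = {h(n): n for n in nodes}

    def node_for(self, key):
        # First node clockwise from the key's hash (wrapping around).
        i = bisect(self.points, h(key)) % len(self.points)
        return self.owner[self.points[i]]

    def add(self, node):
        insort(self.points, h(node))
        self.owner[h(node)] = node

ring = Ring(["A", "B", "C"])
keys = ["k%d" % i for i in range(100)]
before = {k: ring.node_for(k) for k in keys}
ring.add("D")                       # node addition
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key that moved now belongs to the new node D: exactly one key range is transferred, which is the property the slide relies on.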

Nodes stream the data range
• Node B streams the affected data range to node A
• Atomic update of the datastore list minimizes locking, so concurrent inserts continue
• Background operations (stream, compact) are sequential, so nodes continue to meet SLAs
✔ Satisfies both constraints: limited resources and efficient failover without random writes

FAWN-KV Take-aways
• Log-structured datastore
  • Avoids random writes at all levels
  • Minimizes locking during failover
  • Careful resource use, but high performing
• Replication and strong consistency
  • Variant of chain replication (see paper)

Overview
• Background
• FAWN Principles
• FAWN-KV Design
• Evaluation
• Conclusion

Evaluation Roadmap
• Key-value lookup efficiency comparison
• Impact of background operations
• TCO analysis for random-read workloads

FAWN-DS Lookups

System                  QPS    Watts   QPS/Watt
Alix3c2/Sandisk (CF)    1298    3.75    346
Desktop/Mobi (SSD)      4289   83        51.7
MacbookPro / HD           66   29         2.3
Desktop / HD             171   87         1.96

• FAWN-based system over 6x more efficient than 2008-era traditional systems
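The efficiency column and the 6x claim follow directly from the raw QPS and watt figures in the table; nothing beyond the slide's numbers is assumed here:

```python
# Recompute queries/sec/Watt (i.e., queries per Joule) from the table.
systems = {
    "Alix3c2/Sandisk(CF)": (1298, 3.75),
    "Desktop/Mobi(SSD)":   (4289, 83),
    "MacbookPro/HD":       (66, 29),
    "Desktop/HD":          (171, 87),
}
qps_per_watt = {name: qps / watts for name, (qps, watts) in systems.items()}

# The FAWN node vs. the most efficient traditional system (SSD desktop):
ratio = qps_per_watt["Alix3c2/Sandisk(CF)"] / qps_per_watt["Desktop/Mobi(SSD)"]
```

The Alix3c2 node is slower in absolute QPS than the SSD desktop, but because it draws 3.75W instead of 83W it does far more work per Joule, which is the metric FAWN-KV optimizes.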

Impact of background ops
(Charts: queries per second during compact, split, and merge, measured at peak query load and at 30% of peak load.)
Background operations have:
• Moderate impact at peak load
• Negligible impact at 30% load

When to use FAWN for random-access workloads?
TCO = Capital Cost + Power Cost ($0.10/kWh)

Traditional (200W, ~$2000-8000 per node):
• Five 2TB disks
• 160GB PCI-e Flash SSD
• 64GB FBDIMM

FAWN (10W each, ~$250-500 per node):
• 2TB disk
• 64GB SATA Flash SSD
• 2GB DRAM
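As a worked sketch of the slide's formula: the function below computes a 3-year TCO (the horizon matches the earlier "3-year TCO" slide) from the per-node prices, wattages, and $0.10/kWh rate given above. The function name and 3-year choice are ours, not the paper's.

```python
# TCO sketch: capital cost plus electricity at $0.10/kWh over 3 years.
def tco_3yr(capital_usd, watts, usd_per_kwh=0.10, years=3):
    hours = years * 365 * 24
    energy_cost = (watts / 1000) * hours * usd_per_kwh
    return capital_usd + energy_cost

# Low ends of the slide's per-node price ranges:
traditional = tco_3yr(capital_usd=2000, watts=200)
fawn_node   = tco_3yr(capital_usd=250,  watts=10)
```

One traditional node's 3-year energy bill alone (about $526) exceeds a FAWN node's entire capital cost; a fair comparison of course requires enough FAWN nodes to match the traditional node's capacity, which is what the paper's full analysis weighs.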

(Chart, garbled in transcription: the TCO solution space, showing which configuration - a FAWN variant or a traditional server - minimizes TCO as the ratio of query rate to dataset size varies.)

Conclusion
• FAWN architecture reduces energy consumption of cluster computing
• FAWN-KV addresses the challenges of wimpy nodes for key-value storage
  • Log-structured, memory-efficient datastore
  • Efficient replication and failover
  • Meets energy-efficiency and performance goals
“Each decimal order of magnitude increase in parallelism requires a major redesign and rewrite of parallel code” - Kathy Yelick