Facebook: Infrastructure at Scale Using Emerging Hardware. Bill Jia, PhD, Manager, Performance and Capacity Engineering. Oct 8th, 2013.



Agenda: 1. Facebook Scale & Infrastructure; 2. Efficiency at Facebook; 3. Disaggregated Rack; 4. New Components (Emerging Hardware)

Facebook Scale: Data centers in 5 regions. 84% of monthly active users are outside of the U.S. (as of Q2’13).

Facebook Stats: 1.15 billion users (6/2013); ~700 million people use Facebook daily; 350+ million photos added per day (1/2013); 240+ billion photos; 4.5 billion likes, posts and comments per day (5/2013).

Cost and Efficiency. Infrastructure spend in 2012 (from our 10-K): “...$1.24 billion for capital expenditures related to the purchase of servers, networking equipment, storage infrastructure, and the construction of data centers.” Efficiency work has been a top priority for several years.

Architecture (diagram): Front-End Cluster, Service Cluster and Back-End Cluster. Labeled components: Web, Ads (few racks), Cache, Multifeed (few racks), other small services, Search, Photos, Msg, Others, UDB, ADS-DB, Tao Leader.

Lots of “vanity free” servers.

Life of a “hit” (diagram): a timeline from “request starts” to “request completes” across the Front-End and Back-End. Labeled components: Web, MC, Ads, Database, Feed agg, L.

Five Standard Servers
Standard systems: I Web; III Database; IV Hadoop; V Photos; VI Feed
CPU: High; Low; High
Memory: Low; High; Medium; Low; High
Disk: Low; High IOPS Flash; High; Medium
Services: Web, Chat; Database; Hadoop (big data); Photos, Video; Multifeed, Search, Ads

Five Server Types. Advantages: volume pricing; re-purposing; easier operations (simpler repairs, drivers, DC headcount); new servers allocated in hours rather than months. Drawbacks: 40 major services and 200 minor ones, so not all fit perfectly; the needs of a service change over time.

Agenda: 1. Facebook Scale & Infrastructure; 2. Efficiency at FB; 3. Disaggregated Rack; 4. New Components

Efficiency at FB. Data Centers: heat management, electrical efficiency & operations. Servers: “vanity free” design & supply chain optimization. Software: horizontal wins like HPHP/HHVM, cache, db, web & service optimizations.

Next Opportunities? Disaggregated Rack: better component/service fit; extending component useful life. Developing New Components: alternatives in CPU, RAM, Disk & Flash.

Agenda: 1. Facebook Scale & Infrastructure; 2. Efficiency at FB; 3. Disaggregated Rack; 4. New Components

A rack of news feed servers... (diagram: a network switch and Type-6 servers running Leaf and Aggregator processes) => Compute: 80 processors, 640 cores; RAM: 5.8 TB; Storage: 80 TB; Flash: 30 TB. The application lives on a rack of equipment, not a single server.

Compute: standard server with N processors; 8 or 16 DIMM slots; no hard drive, only a small flash boot partition; big NIC, 10 Gbps or more.

RAM Sled. Hardware: 128 GB to 512 GB; compute: FPGA, ASIC, mobile processor or desktop processor. Performance: 450k to 1 million key/value gets/sec. Cost (excluding RAM): $500 to $700, or a few dollars per GB.
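The "few dollars per GB" figure can be checked by spreading the sled cost over the sled's RAM capacity. The sketch below uses only the ranges quoted on the slide; the function name is illustrative.

```python
def overhead_per_gb(sled_cost_usd: float, capacity_gb: float) -> float:
    """Sled hardware cost (excluding the memory itself) spread over its capacity."""
    return sled_cost_usd / capacity_gb

# Figures from the slide: $500 to $700 per sled, 128 GB to 512 GB per sled.
best = overhead_per_gb(500, 512)   # cheapest sled, largest capacity
worst = overhead_per_gb(700, 128)  # priciest sled, smallest capacity
print(f"${best:.2f} to ${worst:.2f} per GB")
```

The overhead per GB falls as more memory is packed behind each sled, which is why the larger configurations land near the low end of the range.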

Storage Sled (Knox). Hardware: 15 drives; replace SAS expander with a small server. Performance: 3k IOPS. Cost (excluding drives): $500 to $700, or less than $0.01 per GB.
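The slide does not state the drive size, but the cost target implies one. As a rough sketch, the per-drive capacity needed to keep sled overhead at $0.01 per GB can be derived from the quoted figures (decimal units assumed, 1 TB = 1000 GB; the function name is illustrative):

```python
DRIVES_PER_SLED = 15       # from the slide
TARGET_USD_PER_GB = 0.01   # from the slide: "less than $0.01 per GB"

def min_drive_tb(sled_cost_usd: float) -> float:
    """Per-drive capacity (TB) at which sled overhead equals the target cost per GB."""
    min_sled_gb = sled_cost_usd / TARGET_USD_PER_GB
    return min_sled_gb / DRIVES_PER_SLED / 1000

low = min_drive_tb(500)   # break-even drive size for a $500 sled
high = min_drive_tb(700)  # break-even drive size for a $700 sled
print(f"{low:.1f} TB to {high:.1f} TB per drive")
```

Drives larger than the break-even size push the overhead below the target.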

Flash Sled. Hardware: 175 GB to 18 TB of flash. Performance: 600k IOPS. Cost (excluding flash): $500 to $700.
NIC at 70% utilization:
NIC         IOPS     Capacity
1 Gbps      21k      175 GB
10 Gbps     210k     1.75 TB
25 Gbps     525k     4.4 TB
40 Gbps     840k     7.7 TB
50 Gbps     1.05M    8.8 TB
100 Gbps    2.1M     17.5 TB
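The IOPS column appears to follow from the NIC bandwidth alone. A minimal sketch, assuming each I/O transfers 4 KiB (the I/O size is an assumption; the slides do not state it):

```python
NIC_UTILIZATION = 0.70   # from the slide
IO_SIZE_BYTES = 4096     # assumption: 4 KiB per I/O, not stated in the slides

def nic_limited_iops(nic_gbps: float) -> float:
    """I/O operations per second the NIC can carry at the given utilization."""
    usable_bytes_per_sec = nic_gbps * 1e9 * NIC_UTILIZATION / 8
    return usable_bytes_per_sec / IO_SIZE_BYTES

for gbps in (1, 10, 25, 40, 50, 100):
    print(f"{gbps:>3} Gbps -> {nic_limited_iops(gbps) / 1000:,.0f}k IOPS")
```

Under that assumption the results land within a few percent of the slide's figures, which look like the 1 Gbps value (21k) scaled linearly.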

Three Disaggregated Rack Wins: 1. Server/service fit across services; 2. Server/service fit over time; 3. Longer useful life through smarter hardware refreshes.
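To illustrate server/service fit, the sketch below sizes a disaggregated rack for the news feed figures quoted earlier (5.8 TB RAM, 30 TB flash, 80 TB storage). The RAM and flash sled capacities are the largest values from the sled slides; the storage sled capacity assumes 4 TB drives, which is a hypothetical figure.

```python
import math

# Rack-level needs quoted for the news feed rack earlier in the talk.
NEWS_FEED_RACK_TB = {"ram": 5.8, "flash": 30.0, "storage": 80.0}

# Largest per-sled capacities in TB. RAM and flash come from the sled slides;
# storage assumes 15 drives of 4 TB each (the drive size is hypothetical).
SLED_CAPACITY_TB = {"ram": 0.512, "flash": 17.5, "storage": 15 * 4.0}

def sleds_needed(demand_tb: float, sled_tb: float) -> int:
    """Whole number of sleds required to cover the demand."""
    return math.ceil(demand_tb / sled_tb)

plan = {
    kind: sleds_needed(NEWS_FEED_RACK_TB[kind], SLED_CAPACITY_TB[kind])
    for kind in NEWS_FEED_RACK_TB
}
print(plan)
```

Because each resource is sized on its own, a service that later needs more flash but no more RAM only adds flash sleds.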

Longer Useful Life. Today servers are typically kept in production for about 3 years. With disaggregated rack: compute, 3 to 6 years; RAM sled, 5 years or more; disk sled, 4 to 5 years depending on usage; flash sled, 6 years depending on write volume.
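A rough way to see why longer component lifetimes matter is to amortize purchase cost per year. The lifetimes below are the low ends of the ranges on the slide; the component costs are hypothetical units chosen only for illustration.

```python
# Service lifetimes in years, taking the low end of each range on the slide.
LIFETIME_YEARS = {"compute": 3, "ram": 5, "disk": 4, "flash": 6}
SERVER_REFRESH_YEARS = 3  # from the slide: servers are kept about 3 years

# Hypothetical purchase cost per component class, in arbitrary units.
COST = {"compute": 40, "ram": 30, "disk": 20, "flash": 10}

def yearly_cost_monolithic() -> float:
    """Everything is replaced together on the server refresh cycle."""
    return sum(COST.values()) / SERVER_REFRESH_YEARS

def yearly_cost_disaggregated() -> float:
    """Each component class is replaced on its own schedule."""
    return sum(COST[kind] / LIFETIME_YEARS[kind] for kind in COST)

print(f"monolithic: {yearly_cost_monolithic():.1f} units/year")
print(f"disaggregated: {yearly_cost_disaggregated():.1f} units/year")
```

The gap depends entirely on the cost mix, so the output illustrates the mechanism and is not an estimate of actual savings.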

Disaggregated Rack. Strengths: volume pricing, serviceability, etc.; custom configurations; hardware evolves with service; smarter technology refreshes; speed of innovation. Potential issues: physical changes required; interface overhead.

Approximate Win Estimates Conservative assumptions show a 12% to 20% opex savings. More aggressive assumptions promise between 14% and 30% opex savings. * These are reasonable savings estimates of what may be possible across several use cases.

Agenda: 1. Facebook Scale & Infrastructure; 2. Efficiency at FB; 3. Disaggregated Rack; 4. New Components

20 years of the same... The servers we deploy in the datacenter are pretty much identical to a 386 computer from 20 years ago. New technologies: math coprocessor; SMP (N processors per server); multi-core processors; GPUs (vector math); flash memory. The exponential improvement of CPU, RAM, disk and NIC density has been incredible.

CPU Today: Server CPUs are more compute-dense versions of desktop processors. Mobile phones are driving development of lower power processors. System on a Chip (SoC) processors for this market integrate several modules into a single package to minimize energy consumption and board real-estate. Next: Smaller Server CPUs derived from mobile processors may become compelling alternatives.

RAM. Today: The DDR standard RAM modules were developed for desktops and servers. The DDR standard is difficult to match on a cost/performance basis with alternative technologies. Next: Lower latency memory technology is interesting for “near RAM.” Slower memory (between RAM and Flash) could be accommodated for “far RAM.”

Storage (ZB of data; 1 ZB = 1 billion TB) Today: Photo and data storage uses 3.5” disk drives almost exclusively. Applications and file systems expect low-latency writes and reads and unlimited write cycles. By changing these assumptions optical and solid-state media may be viable. Next: Cold and big capacity storage. Lower power density and slower performance

Flash (solid state storage). Today: Flash SSD drives are commonly used in databases and applications that need low-latency, high-throughput storage. SLC – Premium MLC: Expensive, performant, high endurance (10K); MLC: Less expensive, performant, lower endurance (3K); TLC: Cheap, performant ( ); WORM TLC: Extremely cheap (1).
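If the figures in parentheses are read as program/erase cycle counts (an interpretation; the slide does not label them), endurance can be turned into an expected service life for a given write volume. The daily write volume and write amplification below are hypothetical.

```python
def flash_lifetime_years(
    capacity_tb: float,
    pe_cycles: int,
    writes_tb_per_day: float,
    write_amplification: float = 1.0,
) -> float:
    """Years until the rated program/erase cycles are used up at a steady write rate."""
    total_writable_tb = capacity_tb * pe_cycles / write_amplification
    return total_writable_tb / (writes_tb_per_day * 365)

# 1.75 TB of 3K-cycle flash; 2 TB written per day is a hypothetical workload.
years = flash_lifetime_years(capacity_tb=1.75, pe_cycles=3000, writes_tb_per_day=2.0)
print(f"{years:.1f} years")
```

Halving the write volume doubles the estimate, so the achievable service life of flash depends on write volume.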

Network Interfaces. Today: 10 Gbps NICs are becoming standard in servers. 40 Gbps NICs are emerging and in production. Next: By focusing on rack distances (3 meters) and allowing proprietary technologies, 100 Gbps could be affordable in a few years.