CS 294-110: Technology Trends. Ion Stoica and Ali Ghodsi (http://www.cs.berkeley.edu/~istoica/classes/cs294/15/). August 31, 2015.

Presentation transcript:


“Skate where the puck's going, not where it's been” – Walter Gretzky

Typical Server Node (diagram): CPU connected over the Memory Bus to RAM, over PCI to SSD, over SATA to HDD, and over Ethernet to the network.

Typical Server Node, with link bandwidths, capacities, and time to read all data:
- Memory Bus (80 GB/s): RAM, 100s of GB; time to read all data: 10s of seconds
- PCI (1 GB/s*): SSD, TBs; time to read all data: hours
- SATA (600 MB/s*): HDD ( MB/s)*, 10s of TBs; time to read all data: 10s of hours
- Ethernet (1 GB/s)
* multiple channels
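The "time to read all data" figures are just capacity divided by link bandwidth. A minimal sketch; the specific sizes (800 GB RAM, 4 TB SSD, 20 TB HDD) are representative values picked from the slide's ranges, not exact hardware specs:

```python
GB, TB = 10**9, 10**12

def time_to_read(capacity_bytes, bandwidth_bytes_per_s):
    """Seconds to stream an entire device at its full link bandwidth."""
    return capacity_bytes / bandwidth_bytes_per_s

# Representative sizes within the slide's ranges (assumed for illustration).
print(time_to_read(800 * GB, 80 * GB))            # RAM over the memory bus: 10 s
print(time_to_read(4 * TB, 1 * GB) / 3600)        # SSD over PCI: ~1.1 hours
print(time_to_read(20 * TB, 600 * 10**6) / 3600)  # HDD over SATA: ~9.3 hours
```

Each step down the hierarchy costs roughly two orders of magnitude in full-scan time.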

Moore’s Law Slowing Down
- Stated 50 years ago by Gordon Moore
- Number of transistors on a microchip doubles every 2 years
- Today “closer to 2.5 years” – Brian Krzanich
(chart: number of transistors over time)
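A doubling period translates into an annual growth rate via 2^(1/T) - 1, so the slowdown can be quantified directly:

```python
def annual_growth(doubling_years):
    """Annual growth rate implied by a doubling period of `doubling_years`."""
    return 2 ** (1 / doubling_years) - 1

print(f"{annual_growth(2.0):.0%}")  # 41% per year: doubling every 2 years
print(f"{annual_growth(2.5):.0%}")  # 32% per year: doubling "closer to 2.5 years"
```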

Memory Capacity: % per year (chart: DRAM capacity over time)

Memory Price/Byte Evolution (chart): -54% per year, then -51% per year, then -32% per year over successive periods.

Typical Server Node (diagram updated): RAM capacity +30% per year.

Normalized Performance (chart).

Normalized Performance (chart): today +10% per year (Intel personal communications).

Number of cores: % per year (chart).

CPU Performance Improvement
- Number of cores: %
- Per core performance: +10%
- Aggregate improvement: %
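Aggregate CPU improvement compounds core-count growth with per-core growth. The core-count rate is missing from this transcript, so the sketch below uses an assumed +18% per year purely for illustration, together with the slide's +10% per core:

```python
def aggregate_improvement(core_growth, per_core_growth):
    """Combined annual throughput growth from more cores and faster cores."""
    return (1 + core_growth) * (1 + per_core_growth) - 1

# +18% cores per year is an assumed illustrative value; +10% per core is from the slide.
print(f"{aggregate_improvement(0.18, 0.10):.0%}")  # ~30% aggregate per year
```

Under these assumptions the aggregate lands near the ~30% CPU figure that appears on the server-node diagrams.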

Typical Server Node (diagram updated): RAM and CPU +30% per year.

SSDs
Performance:
- Reads: 25 µs latency
- Writes: 200 µs latency
- Erase: 1.5 ms
Steady state, when the SSD is full:
- One erase every 64 or 128 writes (depending on page size)
Lifetime: 100,000 to 1 million writes per page
Rule of thumb: writes are 10x more expensive than reads, and erases 10x more expensive than writes.
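The rule of thumb above (write ≈ 10x a read, erase ≈ 10x a write), combined with one erase per 64-128 page writes at steady state, yields a simple amortized write-cost model. The unit costs below are hypothetical, expressed in "reads":

```python
def amortized_write_cost(read_cost=1.0, pages_per_block=64):
    """Amortized cost of one page write, in units of a read, under the
    rule of thumb: write = 10x read, erase = 10x write, and one erase
    per `pages_per_block` page writes at steady state."""
    write_cost = 10 * read_cost
    erase_cost = 10 * write_cost
    return write_cost + erase_cost / pages_per_block

print(amortized_write_cost(pages_per_block=64))   # 11.5625
print(amortized_write_cost(pages_per_block=128))  # 10.78125
```

Amortized over a full block, the erase adds only roughly 8-16% on top of the write cost, so the 10x write/read gap dominates.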

Cost crosspoint! (chart: SSD vs. HDD cost over time)

SSDs vs. HDDs
- SSDs will soon become cheaper than HDDs
- The transition from HDDs to SSDs will accelerate
- Already most instances in AWS have SSDs; Digital Ocean instances are SSD-only
- Going forward, we can assume SSD-only clusters

Typical Server Node (diagram updated): RAM and CPU +30% per year.

SSD Capacity
- Leverages Moore’s law
- 3D technologies will help outpace Moore’s law

Typical Server Node (diagram updated): RAM and CPU +30%, SSD >+30% per year.

Memory Bus: +15% per year (chart).

Typical Server Node (diagram updated): RAM and CPU +30%, Memory Bus +15%, SSD >+30% per year.

PCI Bandwidth: +15-20% per year (chart).

Typical Server Node (diagram updated): RAM and CPU +30%, Memory Bus +15%, PCI +15-20%, SSD >+30% per year.

SATA
- 2003: 1.5 Gbps (SATA 1)
- 2004: 3 Gbps (SATA 2)
- 2008: 6 Gbps (SATA 3)
- 2013: 16 Gbps (SATA 3.2)
- Roughly +20% per year since the first SATA generation

Typical Server Node (diagram updated): RAM and CPU +30%, Memory Bus +15%, PCI +15-20%, SATA +20%, SSD >+30% per year.

Ethernet Bandwidth: +33-40% per year (chart).

Typical Server Node (diagram updated): RAM and CPU +30%, Memory Bus +15%, PCI +15-20%, SATA +20%, Ethernet +33-40%, SSD >+30% per year.

Summary so Far
- Bandwidth to storage is the bottleneck
- It will take longer and longer and longer to read all the data from RAM or SSD
(diagram: RAM and CPU +30%, Memory Bus +15%, PCI +15-20%, SATA +20%, Ethernet +33-40%, SSD >+30% per year)
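The widening gap can be quantified: if capacity grows ~30% per year while memory-bus bandwidth grows ~15% per year, the time to scan everything compounds by their ratio:

```python
def scan_time_growth(capacity_growth=0.30, bandwidth_growth=0.15):
    """Annual growth in time-to-read-everything when capacity outpaces bandwidth."""
    return (1 + capacity_growth) / (1 + bandwidth_growth) - 1

print(f"{scan_time_growth():.1%}")               # 13.0% longer every year
print(f"{(1 + scan_time_growth()) ** 10:.1f}x")  # ~3.4x longer after a decade
```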

But wait, there is more…

3D XPoint Technology
- Developed by Intel and Micron
- Released last month!
- Exceptional characteristics:
  - Non-volatile memory
  - 1000x more resilient than SSDs
  - 8-10x density of DRAM
  - Performance in DRAM ballpark!


High-Bandwidth Memory Buses
- Today’s DDR4 maxes out at 25.6 GB/s
- High Bandwidth Memory (HBM), led by AMD and NVIDIA
  - Supports a 1,024-bit-wide interface, 125 GB/s
- Hybrid Memory Cube (HMC) consortium, led by Intel
  - To be released in 2016
  - Claimed that 400 GB/s is possible!
- Both based on stacked memory chips
- Limited capacity (won’t replace DRAM), but much higher than on-chip caches
- Example use case: GPGPUs
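Peak bus bandwidth is interface width times per-pin data rate, divided by 8. Assuming ~1 Gbit/s per pin for a first-generation HBM stack (an assumption here, not stated on the slide), the 1,024-bit interface lines up with the ~125 GB/s figure, and a 64-bit DDR4-3200 channel with the 25.6 GB/s figure:

```python
def bus_bandwidth_gb_s(width_bits, per_pin_gbps):
    """Peak bandwidth in GB/s for a bus of a given width and per-pin rate."""
    return width_bits * per_pin_gbps / 8

print(bus_bandwidth_gb_s(1024, 1.0))  # 128.0 GB/s: 1,024-bit HBM at ~1 Gbit/s/pin
print(bus_bandwidth_gb_s(64, 3.2))    # ~25.6 GB/s: 64-bit DDR4-3200 channel
```

Stacking buys bandwidth through sheer width rather than per-pin speed, which is why capacity stays limited.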

Example: HBM (diagram).


A Second Summary
- 3D XPoint promises virtually unlimited memory:
  - Non-volatile
  - Reads 2-3x slower than RAM
  - Writes 2x slower than reads
  - 10x higher density
  - Main limit for now: the 6 GB/s interface
- High-bandwidth memory promises a 5x or higher increase in memory bandwidth, but capacity is limited, so it won’t replace DRAM

What does this Mean?

Thoughts
- For big data processing, HDDs are virtually dead!
  - Still great for archival, though
- With 3D XPoint, RAM will finally become the new disk
- The gap between memory capacity and bandwidth is still increasing

Thoughts
- The storage hierarchy gets more and more complex:
  - L1 cache
  - L2 cache
  - L3 cache
  - RAM
  - 3D XPoint based storage
  - SSD
  - (HDD)
- Need to design software to take advantage of this hierarchy

Thoughts
- The primary way to scale processing power is by adding more cores
  - Per-core performance now increases only 10-15% per year
- HBM and HMC technologies will alleviate the bottleneck of getting data to/from multi-cores, including GPUs
- Moore’s law is finally slowing down
- Parallel computation models will become more and more important, at both the node and cluster level

Thoughts Will locality become more or less important? New OSes that ignore disk and SSDs? Aggressive pre-computations Indexes, views, etc Tradeoff between query latency and result availability … 41