1 Tandem Daytona TeraByte Sort: Tsort 1 TB in 47.5 Minutes. David Cossock, Sam Fineberg, Pankaj Mehra, John Peck. Trophy presentation by Jim Gray.



2 Benchmark History: Wisconsin (Bitton, Boral, DeWitt, Turbyfill); IBM TP 1-7 (CA and Tony Lukes); Debit Credit (Gray, Datamation, Anon et al); TPC-A; MCC (Boral &...); TPC-B; TPC-C; TPC-W ?; Teradata (Bollinger &...); TPC-D; Sort; PennySort; MinuteSort

3 A Short History of Sort
April Fools 1985: Datamation Sort
– Sort 1M 100 B records
– An IO benchmark: 15 min to 1 hr!
1993: {Minute | Penny} x {Daytona | Indy}
1998: TeraByte Sort
Web site:

4 Ground Rules
How much can you sort for a penny (in a minute)?
– Hardware and software cost
– Depreciated over 3 years
– 1M$ system gets about 1 second
– 1K$ system gets about 1,000 seconds
– Time (seconds) = 946,080 / SystemPrice ($)
Input and output are disk resident
Input is
– 100-byte records (random data)
– key is the first 10 bytes
Must create the output file and fill it with a sorted version of the input file
Daytona (product) and Indy (special) categories
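The ground rules can be sketched in a few lines of Python. This is only an illustration, not the benchmark's reference code: the function names `penny_budget_seconds` and `sort_records` are hypothetical, the budget constant is derived as three years of seconds (94,608,000) times one cent, and the sort runs in memory, whereas the benchmark itself requires disk-resident input and output.

```python
import os

# Three years of seconds (3 * 365 * 24 * 3600 = 94,608,000) times $0.01.
PENNY_BUDGET_DOLLAR_SECONDS = 946_080

RECORD_SIZE = 100  # bytes per record, per the ground rules
KEY_SIZE = 10      # the key is the first 10 bytes of each record


def penny_budget_seconds(system_price_dollars: float) -> float:
    """Seconds of machine time one penny buys on a system depreciated over 3 years."""
    if system_price_dollars <= 0:
        raise ValueError("system price must be positive")
    return PENNY_BUDGET_DOLLAR_SECONDS / system_price_dollars


def sort_records(data: bytes) -> bytes:
    """Sort a buffer of fixed-size records by their leading key bytes (in memory)."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("input is not a whole number of records")
    records = [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]
    records.sort(key=lambda record: record[:KEY_SIZE])
    return b"".join(records)


if __name__ == "__main__":
    print(penny_budget_seconds(1_000_000))  # a bit under 1 second
    print(penny_budget_seconds(1_000))      # a bit under 1,000 seconds
    sample = os.urandom(RECORD_SIZE * 1000)  # 1,000 random records
    print(len(sort_records(sample)))
```

A 1M$ system gets 0.946 seconds and a 1K$ system gets 946 seconds, which matches the "about 1 second" and "about 1,000 seconds" figures on the slide.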

5 Bottleneck Analysis
Drawn to linear scale
Theoretical Bus Bandwidth: 422 MBps = 66 MHz x 64 bits
Memory Read/Write: ~150 MBps
MemCopy: ~50 MBps
Disk R/W: ~15 MBps

6 Bottleneck Analysis
NTFS Read/Write
18 Ultra 3 SCSI on 4 strings (2x4 and 2x5), 3 PCI 64
– ~155 MBps unbuffered read (175 raw)
– ~95 MBps unbuffered write
Recently: SQL Server on Xeon: 190 MBps scan. Good, but 10x down from S390/SGI/UE10k
Diagram labels: Memory Read/Write ~250 MBps; PCI ~110 MBps; Adapter ~70 MBps; PCI Adapter 155 MBps

7 PennySort
Hardware
– 266 MHz Intel PPro
– 64 MB SDRAM (10 ns)
– Dual Fujitsu DMA 3.2 GB EIDE disks
Software
– NT workstation 4.3
– NT 5 sort
Performance
– sort 15 M 100-byte records (~1.5 GB)
– Disk to disk
– elapsed time 820 sec, cpu time = 404 sec
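As a rough reading of the PennySort performance figures, the sketch below turns the slide's record count and timings into a sorted-data rate and a CPU-busy fraction. The helper names are hypothetical and decimal megabytes (10^6 bytes) are an assumption.

```python
RECORDS = 15_000_000     # records sorted, from the slide
RECORD_BYTES = 100       # bytes per record
ELAPSED_SECONDS = 820    # elapsed (wall-clock) time, from the slide
CPU_SECONDS = 404        # cpu time, from the slide


def sorted_rate_mbps(records: int, record_bytes: int, elapsed_seconds: float) -> float:
    """Sorted input volume per second, in decimal megabytes per second."""
    return records * record_bytes / elapsed_seconds / 1e6


def cpu_busy_fraction(cpu_seconds: float, elapsed_seconds: float) -> float:
    """Share of the elapsed time that the CPU was busy."""
    return cpu_seconds / elapsed_seconds


if __name__ == "__main__":
    print(f"{sorted_rate_mbps(RECORDS, RECORD_BYTES, ELAPSED_SECONDS):.2f} MBps sorted")
    print(f"{cpu_busy_fraction(CPU_SECONDS, ELAPSED_SECONDS):.0%} cpu busy")
```

The run sorts roughly 1.8 MBps with the CPU busy about half the time, which suggests the two EIDE disks, rather than the processor, limit this configuration.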

8 Recent Results
NCSAsort: 10.3 GB in 0.9 minute, 60 Intel/NT/Myrinet nodes
MillenniumSort: 16x Dell NT cluster: 100 MB in 1.18 sec (Datamation)

9 PennySort Daytona & Indy: 2.58 GB in 917 sec. HMsort: Brad Helmkamp, Keith McCready, Stenograph LLC. Intel 400 MHz, 2 IDE disks

10 TB Sort: Chris Nyberg, Nsort, SGI 32x Origin Minutes

11 Terabyte Sort
Daytona: David Cossock, Sam Fineberg, Pankaj Mehra, John Peck, Tandem/Sandia TSort: 68 CPU ServerNet, 47 minutes
Indy: IBM SPsort: 488 nodes, 1952 cpu, 2168 disks, 17.6 minutes = 1057 sec (all for 1/3 of 94M$; slice price is 64k$ for 4 cpu, 2 GB ram, 6 9GB disks + interconnect)

12 Sandia/Compaq/ServerNet/NT Sort
Sort 1.1 Terabyte (13 Billion records) in 47 minutes
68 nodes (dual 450 MHz processors), 543 disks, 1.5 M$
1.2 GBps network rap (2.8 GBps pap)
5.2 GBps of disk rap (same as pap)
(rap = real application performance, pap = peak advertised performance)
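To put the 47-minute figure next to the bandwidth numbers on this slide, the sketch below computes the end-to-end sorted-data rate for the whole cluster, per node, and per disk. It assumes 1.1 TB means 1.1 x 10^12 bytes and uses a hypothetical helper name. The result is not directly comparable to the disk "rap" figure, since a sort reads and writes every byte at least once.

```python
TOTAL_BYTES = 1.1e12          # 1.1 TB, assuming decimal terabytes
ELAPSED_SECONDS = 47 * 60     # 47 minutes
NODES = 68
DISKS = 543


def rate_mbps(total_bytes: float, elapsed_seconds: float, units: int = 1) -> float:
    """Sorted bytes per second in decimal MBps, optionally divided across units."""
    return total_bytes / elapsed_seconds / units / 1e6


if __name__ == "__main__":
    print(f"cluster:  {rate_mbps(TOTAL_BYTES, ELAPSED_SECONDS):.1f} MBps")
    print(f"per node: {rate_mbps(TOTAL_BYTES, ELAPSED_SECONDS, NODES):.2f} MBps")
    print(f"per disk: {rate_mbps(TOTAL_BYTES, ELAPSED_SECONDS, DISKS):.2f} MBps")
```

The cluster as a whole sorts about 390 MBps, which is under 6 MBps per node.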

13 SP sort 2 – 4 GBps!

14 Sort Records
Penny, Daytona: 2.58 GB in 917 sec. HMsort: Brad Helmkamp, Keith McCready, Stenograph LLC
Penny, Indy: 2.58 GB in 917 sec. HMsort: Brad Helmkamp, Keith McCready, Stenograph LLC
Minute, Daytona: 7.6 GB in 60 seconds. Ordinal Nsort, SGI 32 cpu Origin, IRIX
Minute, Indy: 10.3 GB in sec. NOW+MPI HPVMsort: Luis Rivera (UIUC) & Andrew Chien (UCSD)
TeraByte, Daytona: 49 minutes. David Cossock, Sam Fineberg, Pankaj Mehra, John Peck. 68x2 Compaq & Sandia Labs
TeraByte, Indy: 1057 seconds. SPsort, 1952 SP cluster, 2168 disks. Jim Wyllie. PDF: SPsort.pdf (80KB)
Datamation: 1.18 seconds. Phillip Buonadonna, Spencer Low, Josh Coates, UC Berkeley. Millennium Sort, 16x2 Dell NT Myrinet

15 Partly hardware Partly software Partly economics 2x/year!

16 Progress on Sorting
Speedup comes from
– Moore’s law: 40%/year
– Processor/Disk/Network arrays: 60%/year (this is a software speedup)
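One way to relate these two rates to the "2x/year" figure on the previous slide is to assume they compound multiplicatively. The slide does not state this, so the sketch below is an interpretation, and the function name is hypothetical.

```python
MOORE_GROWTH = 0.40   # hardware speedup per year, from the slide
ARRAY_GROWTH = 0.60   # speedup per year from processor/disk/network arrays, from the slide


def combined_speedup(years: int = 1) -> float:
    """Overall speedup after `years`, assuming the two rates multiply each year."""
    yearly_factor = (1 + MOORE_GROWTH) * (1 + ARRAY_GROWTH)
    return yearly_factor ** years


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(years, round(combined_speedup(years), 1))
```

Under that assumption the yearly factor is 1.4 x 1.6 = 2.24, slightly above a doubling per year.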

17 Musings: PennySort = TBsort
Sorts 1 TB in 1 Minute
2 pass, so 3 TB of disk = 10 disks if 330 GB/disk
= 5 GBps (if each disk is 500 MBps)
So, 600 seconds (3 TB / 5 GBps)
So, node costs 1.5k$
Costs 100x that today; maybe in 10 years?
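The arithmetic behind this musing can be followed step by step. The sketch below takes the slide's aggregate bandwidth of 5 GBps as given, uses decimal units, and derives the node price from the PennySort budget of 946,080 dollar-seconds (three years of seconds times one cent). The function names are hypothetical.

```python
import math

DISK_SPACE_BYTES = 3e12          # 3 TB of disk for a two-pass sort of 1 TB
DISK_CAPACITY_BYTES = 330e9      # 330 GB per disk, from the slide
AGGREGATE_BANDWIDTH_BPS = 5e9    # 5 GBps across all disks, taken from the slide
PENNY_BUDGET_DOLLAR_SECONDS = 946_080  # 94,608,000 seconds in 3 years times $0.01


def disks_needed(space_bytes: float, capacity_bytes: float) -> int:
    """Whole disks required to hold the given amount of data."""
    return math.ceil(space_bytes / capacity_bytes)


def transfer_seconds(space_bytes: float, bandwidth_bps: float) -> float:
    """Time to move the data once at the aggregate bandwidth."""
    return space_bytes / bandwidth_bps


def affordable_price(seconds: float) -> float:
    """System price at which one penny buys the given number of seconds."""
    return PENNY_BUDGET_DOLLAR_SECONDS / seconds


if __name__ == "__main__":
    seconds = transfer_seconds(DISK_SPACE_BYTES, AGGREGATE_BANDWIDTH_BPS)
    print(disks_needed(DISK_SPACE_BYTES, DISK_CAPACITY_BYTES), "disks")
    print(seconds, "seconds")
    print(round(affordable_price(seconds)), "dollars")
```

With these inputs the sketch gives 10 disks, 600 seconds, and a price of about 1,577 dollars, in line with the slide's 1.5k$ node.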

18 Data Gravity: Processing Moves to Transducers
Move processing to data sources
Move to where the power (and sheet metal) is
Processor in
– Modem
– Display
– Microphones (speech recognition) & cameras (vision)
– Storage: data storage and analysis
System is “distributed” (a cluster/mob)

19 Disk = Node
has magnetic storage (100 GB?)
has processor & DRAM
has SAN attachment
has execution environment
Software stack (diagram labels): OS Kernel; SAN driver; Disk driver; File System; RPC, ...; Services; DBMS; Applications

20 SAN: Standard Interconnect
Gbps SAN: 110 MBps
PCI: 70 MBps
UW SCSI: 40 MBps
FW SCSI: 20 MBps
SCSI: 5 MBps
LAN faster than memory bus?
1 GBps links in lab
100$ port cost soon
Port is computer
Winsock: 110 MBps (10% cpu utilization at each end)
RIP FDDI, RIP ATM, RIP SCI, RIP SCSI, RIP FC, RIP ?