1 Store Everything Online In A Database Jim Gray Microsoft Research

Presentation transcript:

1 Store Everything Online In A Database Jim Gray Microsoft Research

2 Outline Store Everything Online (Disk not Tape) In a Database

3 How Much is Everything?
Soon everything can be recorded and indexed.
Most bytes will never be seen by humans.
Data summarization, trend detection, and anomaly detection are key technologies.
See Mike Lesk: How much information is there
See Lyman & Varian: How much information
[Figure: storage scale running Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta, with markers for A Book, A Photo, A Movie, All LoC books (words), All Books MultiMedia, and Everything Recorded. Small prefixes by negative power of ten: 24 yocto, 21 zepto, 18 atto, 15 femto, 12 pico, 9 nano, 6 micro, 3 milli.]

4 Storage capacity beating Moore’s law
3 k$/TB today (raw disk)
1 k$/TB by end of 2002

5 Outline Store Everything Online (Disk not Tape) In a Database

6 Online Data
Can build 1PB of NAS disk for 5M$ today.
Can SCAN (read or write) entire PB in 3 hours.
Operate it as a data pump: continuous sequential scan.
Can deliver 1PB for 1M$ over Internet
–Access charge is 300$/Mbps bulk rate
Need to Geoplex data (store it in two places).
Need to filter/process data near the source,
–To minimize network costs.
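The 1M$ delivery figure can be checked with a little arithmetic. The sketch below assumes the 300$/Mbps access charge is a monthly rate, that a month has 30 days, and that 1 PB = 10^15 bytes; the slide states none of these, so treat the result as a plausibility check only.

```python
# Assumptions (not on the slide): the 300$/Mbps charge is per month,
# a month has 30 days, and 1 PB = 10**15 bytes.
PRICE_PER_MBPS_MONTH = 300.0
SECONDS_PER_MONTH = 30 * 24 * 3600
BITS_PER_PB = 8 * 10**15


def delivery_cost_dollars(petabytes: float) -> float:
    """Cost of pushing `petabytes` through a link billed per Mbps-month."""
    megabits = petabytes * BITS_PER_PB / 10**6
    # One Mbps sustained for a month moves SECONDS_PER_MONTH megabits.
    mbps_months = megabits / SECONDS_PER_MONTH
    return mbps_months * PRICE_PER_MBPS_MONTH


cost = delivery_cost_dollars(1.0)
print(f"{cost / 1e6:.2f} M$ per PB")
```

Under those assumptions the cost comes out near 0.93 M$ per petabyte, which is consistent with the slide's 1M$ figure.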

7 The “Absurd” Disk
1 TB, 100 MB/s, 200 Kaps
2.5 hr scan time (poor sequential access)
1 access per second / 5 GB (VERY cold data)
It’s a tape!
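The two ratios on this slide follow from its three device figures. This is a small sketch that assumes decimal units and reads "200 Kaps" as 200 kilobyte-sized accesses per second; both are interpretations, not statements from the slide.

```python
# Device figures from the slide. Decimal units (1 TB = 10**12 bytes) and the
# reading of "200 Kaps" as 200 small accesses per second are assumptions.
CAPACITY_BYTES = 1 * 10**12
BANDWIDTH_BYTES_PER_S = 100 * 10**6
ACCESSES_PER_S = 200

# Time to read the whole disk sequentially, in hours.
scan_hours = CAPACITY_BYTES / BANDWIDTH_BYTES_PER_S / 3600

# How many gigabytes share each access-per-second of capacity.
gb_per_access_per_s = (CAPACITY_BYTES / 10**9) / ACCESSES_PER_S

print(f"scan time: {scan_hours:.1f} hours")
print(f"1 access per second per {gb_per_access_per_s:.0f} GB")
```

With these units the scan takes a little under three hours, in the neighborhood of the slide's 2.5 hr, and the access density matches the slide's one access per second per 5 GB.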

8 Disk vs Tape
Disk
–80 GB
–35 MBps
–5 ms seek time
–3 ms rotate latency
–3$/GB for drive, 2$/GB for ctlrs/cabinet
–15 TB/rack
–1 hour scan
Tape
–40 GB
–10 MBps
–10 sec pick time
– second seek time
–2$/GB for media, 8$/GB for drive+library
–10 TB/rack
–1 week scan
The price advantage of disk is growing; the performance advantage of disk is huge!
At 10K$/TB, disk is competitive with nearline tape.
Guestimates. CERN: 200 TB 3480 tapes, 2 col = 50GB, Rack = 1 TB = 12 drives
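The comparison above reduces to two numbers: total cost per terabyte and time to reach a random byte. The sketch below uses only the slide's own per-GB prices and latencies; the dictionary and function names are illustrative, and the tape seek time is left out because its value is missing from the slide.

```python
# Per-GB prices from the slide (its own "guestimates").
disk_cost_per_gb = {"drive": 3, "controllers_cabinet": 2}
tape_cost_per_gb = {"media": 2, "drive_library": 8}


def dollars_per_tb(parts: dict) -> int:
    """Sum the per-GB components and scale to a decimal terabyte."""
    return sum(parts.values()) * 1000


disk_per_tb = dollars_per_tb(disk_cost_per_gb)
tape_per_tb = dollars_per_tb(tape_cost_per_gb)

# Positioning latency in milliseconds: disk seek + rotation versus tape pick
# time alone (the tape seek time is not given, so this is a lower bound).
disk_latency_ms = 5 + 3
tape_latency_ms = 10 * 1000
latency_ratio = tape_latency_ms / disk_latency_ms

print(f"disk: {disk_per_tb} $/TB, tape: {tape_per_tb} $/TB")
print(f"tape is at least {latency_ratio:.0f}x slower to position")
```

On these figures tape lands at the 10K$/TB level the slide mentions, and disk at half that.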

9 Building a Petabyte Disk Store
–Cadillac ~ 500k$/TB = 500M$/PB; plus FC switches plus… 800M$/PB
–TPC-C SANs (Brand PC 18GB/…): 60 M$/PB
–Brand PC local SCSI: 20M$/PB
–Do it yourself ATA: 5M$/PB

10 Cheap Storage and/or Balanced System
Low cost storage (2 x 3k$ servers): 5K$/TB
–2x (800 MHz, 256 MB + 8x80GB disks + 100MbE)
–RAID5 costs 6K$/TB
Balanced server (5k$/.64 TB)
–2x800 MHz (2k$)
–512 MB
–8 x 80 GB drives (2K$)
–Gbps Ethernet + switch (300$/port)
–9k$/TB, 18K$/mirrored TB

11 Next step in the Evolution
Disks become supercomputers
–Controller will have 1bips, 1 GB ram, 1 GBps net
–And a disk arm.
Disks will run full-blown app/web/db/os stack
Distributed computing
Processors migrate to transducers.

12 It’s Hard to Archive a Petabyte
It takes a LONG time to restore it.
–At 1GBps it takes 12 days!
Store it in two (or more) places online (on disk?). A geo-plex.
Scrub it continuously (look for errors).
On failure,
–use other copy until failure repaired,
–refresh lost copy from safe copy.
Can organize the two copies differently (e.g.: one by time, one by space).
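The restore time and the scrub-and-refresh policy can both be sketched briefly. The code below is an illustration under stated assumptions: decimal units for the restore arithmetic, and a hypothetical in-memory model in which each copy is a dictionary of blocks and a separate catalog of SHA-256 checksums says which copy is still good. None of these names come from the talk.

```python
import hashlib

# Restore time for 1 PB at 1 GBps, assuming decimal units.
restore_days = 10**15 / 10**9 / 86400


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def scrub_and_repair(primary: dict, replica: dict, catalog: dict) -> list:
    """Verify every block in both copies against the catalog checksum and
    refresh a damaged or missing block from the copy that still verifies.
    Returns the (copy_name, block_id) pairs that were repaired."""
    repaired = []
    for block_id, expected in catalog.items():
        ok_primary = checksum(primary.get(block_id, b"")) == expected
        ok_replica = checksum(replica.get(block_id, b"")) == expected
        if ok_primary and not ok_replica:
            replica[block_id] = primary[block_id]
            repaired.append(("replica", block_id))
        elif ok_replica and not ok_primary:
            primary[block_id] = replica[block_id]
            repaired.append(("primary", block_id))
        # If both copies fail verification, this sketch cannot repair the block.
    return repaired


# Usage: damage one block in each copy, then scrub.
blocks = {i: f"block-{i}".encode() for i in range(4)}
catalog = {i: checksum(b) for i, b in blocks.items()}
primary, replica = dict(blocks), dict(blocks)
primary[2] = b"bit rot"      # silent corruption in one copy
del replica[3]               # lost block in the other copy
repaired = scrub_and_repair(primary, replica, catalog)
print(f"restore would take about {restore_days:.1f} days; repaired {repaired}")
```

The arithmetic gives roughly 11.6 days, which the slide rounds to 12. A real geoplex would also need to handle the case where both copies of a block are bad, which is one argument for more than two copies.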

13 Outline Store Everything Online (Disk not Tape) In a Database

14 Why Not file = object + GREP?
It works if you have thousands of objects (and you know them all).
But hard to search millions/billions/trillions with GREP.
Hard to put all attributes in file name.
–Minimal metadata
Hard to do chunking right.
Hard to pivot on space/time/version/attributes.
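To make the contrast with file names concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, column names, and sample rows are hypothetical; the point is only that attributes stored as indexed columns can be pivoted by time, space, or version without encoding them in a path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE objects (
        id          INTEGER PRIMARY KEY,
        path        TEXT NOT NULL,
        instrument  TEXT,
        observed_at TEXT,      -- ISO 8601 timestamp
        ra_deg      REAL,
        dec_deg     REAL,
        version     INTEGER
    )""")
# One index per pivot the users care about.
conn.execute("CREATE INDEX idx_space ON objects (ra_deg, dec_deg)")
conn.execute("CREATE INDEX idx_time ON objects (observed_at)")

rows = [
    ("/data/a.fits", "cam1", "2001-03-01T00:00:00", 10.0, -5.0, 1),
    ("/data/b.fits", "cam1", "2001-06-15T00:00:00", 200.5, 42.0, 1),
    ("/data/c.fits", "cam2", "2001-06-20T00:00:00", 201.0, 41.5, 2),
]
conn.executemany(
    "INSERT INTO objects (path, instrument, observed_at, ra_deg, dec_deg, version)"
    " VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Pivot on time.
by_time = conn.execute(
    "SELECT path FROM objects WHERE observed_at >= ? ORDER BY path",
    ("2001-06-01",),
).fetchall()

# Pivot on space.
by_space = conn.execute(
    "SELECT path FROM objects WHERE ra_deg BETWEEN 0 AND 20 ORDER BY path"
).fetchall()

# Pivot on version.
by_version = conn.execute(
    "SELECT version, COUNT(*) FROM objects GROUP BY version ORDER BY version"
).fetchall()

print(by_time, by_space, by_version)
```

Each query is declarative, so adding a new pivot means adding a column or an index rather than renaming files.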

15 The Reality: it’s build vs buy
If you use a file system you will eventually build a database system:
–metadata,
–query,
–parallel ops,
–security,
–reorganize,
–recovery,
–distributed,
–replication,
–…

16 OK: so I’ll put lots of objects in a file
Do It Yourself Database
Good news:
–Your implementation will be 10x faster than the general purpose one
–easier to understand and use than the general purpose one.
Bad news:
–It will cost 10x more to build and maintain
–Someday you will get bored maintaining/evolving it
–It will lack some killer features:
  Parallel search
  Self-describing via metadata
  SQL, XML, …
  Replication
  Online update – reorganization
–Chunking is problematic (what granularity, how to aggregate)

17 Top 10 reasons to put Everything in a DB
1. Someone else writes the million lines of code
2. Captures data and metadata
3. Standard interfaces give tools and quick learning
4. Allows Schema Evolution without breaking old apps
5. Index and Pivot on multiple attributes: space-time-attribute-version…
6. Parallel terabyte searches in seconds or minutes
7. Moves processing & search close to the disk arm (moves fewer bytes: questons return datons)
8. Chunking is easier (can aggregate chunks at server)
9. Automatic geo-replication
10. Online update and reorganization
11. Security
12. If you pick the right vendor, ten years from now, there will be software that can read the data.
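Reason 4, schema evolution without breaking old apps, is easy to demonstrate in miniature. The sketch below uses Python's sqlite3 module with a hypothetical `observations` table; it assumes the old application names its columns explicitly rather than relying on `SELECT *`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (x INTEGER, y INTEGER, image BLOB)")
conn.execute("INSERT INTO observations VALUES (1, 2, ?)", (b"\x00\x01",))


def old_app_read(connection):
    """Query written before the schema changed; it names its columns."""
    return connection.execute(
        "SELECT x, y FROM observations ORDER BY x, y"
    ).fetchall()


before = old_app_read(conn)

# Evolve the schema: new applications want a capture time.
conn.execute("ALTER TABLE observations ADD COLUMN captured_at TEXT")
conn.execute(
    "INSERT INTO observations (x, y, image, captured_at) VALUES (3, 4, ?, ?)",
    (b"\x02", "2001-01-01"),
)

after = old_app_read(conn)  # the old query still runs unchanged
new_view = conn.execute(
    "SELECT x, y, captured_at FROM observations ORDER BY x"
).fetchall()

print(before, after, new_view)
```

Existing rows simply report NULL for the new column, so old and new applications can share the same table.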

18 DB Centric Examples
TerraServer
–All images and all data in the database (chunked as small tiles).
SkyServer & Virtual Sky
–Both image and semantic data in a relational store.
–Parallel search & NonProcedural access are important.
–http://virtualsky.org/servlet/Page?F=3&RA=16h+10m+1.0s&DE=%2B0d+42m+45s&T=4&P=12&S=10&X=5096&Y=4121&W=4&Z=-1&tile.2.1.x=55&tile.2.1.y=20

19 OK… Why don’t they use our stuff?
Wrong metaphor: HDF with hyper-slab is a better match.
Impedance mismatch: getting stuff in/out of the DB is too hard.
We sold them OODBs and they did not work (unreliable, poor performance, no tools).
…

20 So, why will the future be different?
They have MUCH more data (10^8 files?)
Java / C# eases impedance mismatch: rowsets == ragged arrays.
Tools are better
–Optimizers are better
–CPU and disk parallelism actually works now
–Statistical packages are better.

21 Outline Store Everything Online (Disk not Tape) In a Database

22 But… The title of the talk was… “The Future of Distributed Database Systems”
Nobody wants to share his database.
Blocks, files, tables are the wrong abstraction for networks (too low level).
“Objects are the right abstraction”
So, UDDI / WSDL / SOAP is the solution (not SQL).
XML is the wire format, XLANG is the workflow protocol.
Query will be in there somewhere.

23 DDB technology GREAT in a Cluster
–Uniform architecture
–Trust among nodes
–High bandwidth-low latency communication
–Programs have single system image
–Queries run in parallel
–Global optimizer does query decomposition

24 But in a Distributed System
–Heterogeneous architecture makes query planning much harder
–No trust
–Communication is slow and expensive (minimize it).
→ Higher level abstraction to minimize round trips
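The round-trip argument can be illustrated with a toy model. In the sketch below, `RemoteStore` is a hypothetical stand-in for a remote service in which every public call counts as one network round trip. It compares a low-level interface (one value per call) with a higher-level one (one call per request).

```python
class RemoteStore:
    """Stand-in for a remote service; each public call is one round trip."""

    def __init__(self, prices):
        self._prices = dict(prices)
        self.round_trips = 0

    def get_price(self, item):
        # Low-level interface: one value per round trip.
        self.round_trips += 1
        return self._prices[item]

    def order_total(self, items):
        # Higher-level interface: the whole request travels in one round trip.
        self.round_trips += 1
        return sum(self._prices[item] for item in items)


prices = {"milk": 2, "bread": 3, "eggs": 4}
basket = ["milk", "bread", "eggs"]

chatty = RemoteStore(prices)
chatty_total = sum(chatty.get_price(item) for item in basket)

batched = RemoteStore(prices)
batched_total = batched.order_total(basket)

print(chatty_total, chatty.round_trips, batched_total, batched.round_trips)
```

Both clients compute the same total, but the chatty one pays one round trip per item while the batched one pays a single round trip regardless of basket size.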

25 DDB the Trust Issue
DDB Grocery
–Customers serve themselves
–Follow the rules posted on the door
–No overhead, no staff!
Client/Server Groceries
–Clerks serve customers
–Take order, fill order, fill out invoice, collect money.
–Overhead: staff, training, rules,…