Presentation transcript:

One Size Fits All: An Idea Whose Time Has Come and Gone, by Michael Stonebraker

Alternate Title: The elephants are selling 30-year-old “bloatware” that is not good at anything, and you should send them to the “home for old software”.

Three Financial Services Markets: stream processing (electronic trading); tick stores (data warehouses); OLTP (transaction processing).

Stream Processing (Electronic Trading): A feed comes out of the wall. Compute a “secret sauce” looking for events of interest. Trade based on the result. But only if you are more nimble than the next guy….

Traditional RDBMS Model (Outbound Processing): Store the data before processing, which costs latency. What if the data is not important? Too many processes! Optimized for business data processing, where you don’t trust the app. Too slow to be interesting! [Diagram labels: Updates, Queries, Memory, Disk]

Stream Processing Engine with StreamSQL (Inbound Processing): The database paradigm (SQL) is a good one, but it needs a different architecture: straight-through processing, no task switches, lightweight scheduling. [Diagram labels: Event Data, StreamBase Application, Alerts, Actions, Queries, Memory, Disk]

StreamSQL Application Example. [Diagram labels: Market_Feeds, My_Buys, Alerts] Example: every minute, for every stock I am trading, calculate VWAP (volume-weighted average price) for my trades and for all trades, and alert whenever my personal trading execution is inferior to the market. 5 StreamBase operators, 30 min to build. Streams of “tuples” (time-series data) flow through the query. Queries run continuously.
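
The VWAP example above can be sketched in plain Python. This is not StreamSQL and not the StreamBase application from the slide; it is a minimal batch emulation of one-minute windows. The Trade record, its field names, the sample feed, and the rule that "inferior" means buying at a higher VWAP than the market are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    ts: int        # seconds since start of session (assumed time unit)
    symbol: str
    price: float
    volume: int
    mine: bool     # True for a My_Buys tuple, False for a Market_Feeds-only tuple


def vwap(trades):
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    total_volume = sum(t.volume for t in trades)
    if total_volume == 0:
        return None
    return sum(t.price * t.volume for t in trades) / total_volume


def minute_alerts(trades):
    """Group trades into (minute, symbol) windows and compare the two VWAPs.

    Returns (minute, symbol, my_vwap, market_vwap) for every window in which
    my buys executed at a higher VWAP than all trades taken together.
    """
    windows = defaultdict(list)
    for t in trades:
        windows[(t.ts // 60, t.symbol)].append(t)

    alerts = []
    for (minute, symbol), batch in sorted(windows.items()):
        my_vwap = vwap([t for t in batch if t.mine])
        market_vwap = vwap(batch)  # "all trades" includes my own
        if my_vwap is not None and my_vwap > market_vwap:
            alerts.append((minute, symbol, my_vwap, market_vwap))
    return alerts


# Made-up sample feed: two one-minute windows for a single symbol.
feed = [
    Trade(5, "IBM", 100.0, 200, mine=False),
    Trade(20, "IBM", 102.0, 100, mine=True),
    Trade(70, "IBM", 101.0, 300, mine=False),
    Trade(90, "IBM", 100.0, 100, mine=True),
]
print(minute_alerts(feed))
```

A real stream engine would evaluate the window incrementally as tuples arrive rather than regrouping a stored list, which is the point of the inbound-processing architecture described earlier.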

StreamSQL Will Dominate Rule Engines: Essentially all applications entail a mix of stored and real-time data. StreamSQL covers both kinds of data in a single paradigm; a rule engine must switch paradigms. StreamSQL is amenable to compilation: it knows what the next event to process is. In contrast, this is hard to figure out in a rule engine.

Performance Benchmark. Financial services application: construct a virtual feed of “first arrivers” on a low-end Linux machine. Relational DB: 11,000 messages/sec. StreamBase: 300,000 messages/sec. Another StreamSQL vendor: 20,000 messages/sec. Result: StreamBase was a factor of 27 faster than the relational DB (300,000 / 11,000 ≈ 27).

Tick Stores (and Other Warehouse Applications): Store all market data for the last 10 years, to back-test “secret sauce” models and to answer ad-hoc queries – “how many times has X happened?” Typical size – 100 Tbytes. Append only.

Terminology -- “Row Store”: whole records are stored one after another (Record 1, Record 2, Record 3, Record 4). E.g. DB2, Oracle, Sybase, SQL Server, …

Rotate Your Thinking 90 Degrees: Column stores read only the columns required, not all of them. Compression works better, by a factor of 2-3 against the elephants. No record headers, which are big-ticket items. No padding to byte or word boundaries.
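
A small Python sketch of the rotation described above. The sample records, the function names, and the count of "values read" as a stand-in for I/O are illustrative assumptions. The slide does not say which compression scheme is used; run-length encoding appears here only as one simple scheme that benefits from storing a column's values together.

```python
# Four records with three fields each: (symbol, price, volume). Made-up data.
rows = [
    ("IBM", 100.0, 200),
    ("IBM", 102.0, 100),
    ("MSFT", 30.0, 500),
    ("IBM", 101.0, 300),
]


def total_volume_row_store(rows):
    """Row store: summing one field still walks through every whole record."""
    values_read = 0
    total = 0
    for record in rows:
        values_read += len(record)  # all fields of the record are fetched
        total += record[2]
    return total, values_read


# Column store: the same data rotated 90 degrees, one list per column.
columns = {
    "symbol": [r[0] for r in rows],
    "price": [r[1] for r in rows],
    "volume": [r[2] for r in rows],
}


def total_volume_column_store(columns):
    """Column store: only the one required column is touched."""
    volume = columns["volume"]
    return sum(volume), len(volume)


def run_length_encode(values):
    """Collapse runs of equal adjacent values into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


print(total_volume_row_store(rows))       # total plus number of values read
print(total_volume_column_store(columns))
print(run_length_encode(sorted(columns["symbol"])))
```

With three fields per record the row layout reads three times as many values for this query; the gap grows with the number of columns the query does not need.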

Benchmark Summary: Vertica has been baked off about 30 times, typically against the incumbent. It has yet to win by less than a factor of 30 against a row store. It beats most other column stores by around 10X. KX is the only system to come within an order of magnitude.

Maybe Elephants are Good at OLTP…… OLTP is a main memory market, not a disk-based one. Transactions are short and have no I/O or user stalls, so run them to completion (single threaded). Disaster recovery (and HA) is a requirement: build it into the bottom of the system.
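
The run-to-completion idea can be illustrated with a toy Python sketch. This is not H-Store's design or API: the InMemoryStore class, the deposit and transfer procedures, the whole-table snapshot used for rollback, and the command_log list standing in for the recovery/HA layer are all hypothetical names and simplifications.

```python
from collections import deque


class InMemoryStore:
    """All data in main memory; one thread runs each transaction to completion."""

    def __init__(self):
        self.accounts = {}
        self.queue = deque()    # pending transactions, executed strictly in order
        self.command_log = []   # hypothetical stand-in for replication / HA

    def submit(self, procedure, *args):
        self.queue.append((procedure, args))

    def run(self):
        """Execute queued transactions serially: no locks, no latches needed."""
        results = []
        while self.queue:
            procedure, args = self.queue.popleft()
            snapshot = dict(self.accounts)  # naive undo copy, fine for a sketch
            try:
                results.append(procedure(self.accounts, *args))
                self.command_log.append((procedure.__name__, args))
            except ValueError as exc:
                self.accounts = snapshot    # abort: restore pre-transaction state
                results.append(exc)
        return results


def deposit(accounts, name, amount):
    accounts[name] = accounts.get(name, 0) + amount
    return accounts[name]


def transfer(accounts, src, dst, amount):
    accounts[dst] = accounts.get(dst, 0) + amount  # credit first to exercise rollback
    if accounts.get(src, 0) < amount:
        raise ValueError("insufficient funds")
    accounts[src] -= amount
    return accounts[src]


store = InMemoryStore()
store.submit(deposit, "alice", 100)
store.submit(transfer, "alice", "bob", 30)
store.submit(transfer, "alice", "bob", 500)  # aborts and is rolled back
results = store.run()
print(store.accounts, len(store.command_log))
```

Because each transaction is short and never waits on I/O or a user, executing them one at a time removes the need for concurrency control inside the engine, which is the overhead the slide argues the elephants still pay.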

TPC-C Performance on a Low-end Machine. Elephant: 850 TPS (1/2 the land speed record per processor). H-Store (so far, a university prototype): 70,416 TPS (41X the land speed record per processor). Factor of 82 (70,416 / 850 ≈ 82.8)!!!!!

Implications for the Elephants: They are selling “one size fits all”, which is 30-year-old legacy technology that is good at nothing.

Pictorially: [Diagram labels: DBMS apps, Streaming data, OLTP, Data Warehouse]

The DBMS Landscape – Performance Needs. [Diagram: Streaming data, OLTP, and Data Warehouse, with performance-need labels “high” and “low”]

One Size Does Not Fit All -- Pictorially: Elephants get only “the crevices”. [Diagram labels: StreamBase, Open source H-Store successors, Vertica]

Thank You. Corporate Headquarters: 181 Spring Street, Lexington, Massachusetts 02421; +1 866 STRMBAS (+1 866 787 6227); +1 781 761 0800. New York City Office: 220 West 42nd Street, 20th Floor, New York, New York 10036; +1 866 STRMBAS (+1 866 787 6227). Reston, Virginia Office: 11921 Freedom Drive, Suite 550, Reston, VA 20190; +1 703 608 6958. London Office: 107-111 Fleet Street, London EC4A 2AB, United Kingdom; +44 (0)20 7936 9050.