Key Challenges in Information Processing James Hamilton Microsoft SQL Server 2002.03.01.

Presentation transcript:

Key Challenges in Information Processing James Hamilton Microsoft SQL Server

2 Unsolved Challenges
1. Availability shows only incremental progress
2. Security broken & too hard to manage
3. Weakly structured data poorly supported or exploited
4. Writing multi-tiered apps too hard
   - Data-intensive mid-tiers need more DB help
5. Scalability over perf & big iron

3 Availability: Largely unsolved problem
- 1985 Tandem study (Gray):
  - Administration: 42% downtime
  - Software: 25% downtime
  - Hardware: 18% downtime
- 1990 Tandem study (Gray):
  - Software: 62%
  - Administration: 15%
  - Most studies have admin contribution much higher
- Observations:
  - H/W downtime contribution trending to zero
  - Software & admin costs dominate & growing
  - We’re still looking at 10 to 15 year-old research

4 Availability: Cost in dollars/hour

Operation                         Cost per hour of downtime
Brokerage operations              $6,450,000
Credit card authorization         $2,600,000
Ebay (1 outage 22 hours)          $225,000
Amazon.com                        $180,000
Package shipping services         $150,000
Home shopping channel             $113,000
Catalog sales center              $90,000
Airline reservation center        $89,000
Cellular service activation       $41,000
On-line network fees              $25,000
ATM service fees                  $14,000

From Dave Patterson talk at HPTS.
Sources: InternetWeek 4/3/; Fibre Channel: A Comprehensive Introduction, R. Kembel 2000, p.8. "... survey done by Contingency Planning Research."
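To put the hourly figures in context, here is a minimal sketch that converts an availability level into an expected yearly downtime cost. The hourly costs are taken from the slide; the availability levels (99% and 99.9%) and the function name `annual_downtime_cost` are illustrative assumptions, not part of the talk.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_downtime_cost(availability: float, cost_per_hour: float) -> float:
    """Expected yearly downtime cost for an availability fraction in [0, 1]."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    downtime_hours = (1.0 - availability) * HOURS_PER_YEAR
    return downtime_hours * cost_per_hour


if __name__ == "__main__":
    # Hourly costs from the slide; availability levels are assumed examples.
    operations = [
        ("Credit card authorization", 2_600_000),
        ("ATM service fees", 14_000),
    ]
    for name, cost in operations:
        for availability in (0.99, 0.999):
            yearly = annual_downtime_cost(availability, cost)
            print(f"{name} at {availability:.1%}: ${yearly:,.0f} per year")
```

The point of the arithmetic is that each extra "nine" of availability cuts expected downtime, and therefore cost, by a factor of ten.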

5 Availability: Admin still the problem
- Administrators expensive
  - Admin dominates H/W & S/W costs (5x or more)
- Administrators make mistakes
  - Admin #1 or #2 cause of downtime
- Big problem yet little research focus:
  - Still few data points available:
    - Most systems houses won’t publish... need research
  - No benchmarks:
    - Benchmarks drive industry & systems research
- Goal: Server appliance model:
  - Auto-tuning, pluggable server-side resources
  - IBM SMART, Microsoft index tuning wizard, etc.
  - Dave Patterson, Aaron Brown, Armando Fox,...
  - More help needed

6 Availability: the S/W is broken
- Even server-side software is BIG:
  - Windows 2000: over 50 mloc
  - DB: 1.5+ mloc
  - SAP: 37 mloc (4,200 S/W engineers)
- Tester to developer ratios above 1:1
- Quality per unit line only incrementally improving
- Current massive testing investment not solving problem
- New approach needed:
  - Assume S/W failure inevitable
  - Redundant, self-healing systems right approach
  - Tandem process-pair work good but getting fairly old... progress?

7 Security: Securing systems too hard
- “Less than % of corp revenue invested in security” - Richard Clarke, Special security advisor to president
- Data loss, intentional data & systems corruption
  - Clearly under-reported problem
- S/W vulnerabilities rampant:
  - Buffer overruns, stack smashing, code insertion, SQL insertion, elevation of privs,...
  - Programmers being more careful doesn’t solve problem
- Most systems misconfigured:
  - Security systems too complex & hard to admin
- Research needed: Autonomous threat detection
  - Better tools to detect, correct, & prevent S/W security vulnerabilities
- Monitor all measurable system metrics:
  - Detecting new threats & misconfigurations
  - Track execution profiles: detect changes: drive alerts, auto-config, reports to vendor, upgrade s/w,...
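The "track execution profiles, detect changes, drive alerts" idea can be sketched very roughly as a baseline-and-threshold check. This is one simple interpretation, not the method proposed in the talk: the metric names, the 3-standard-deviation threshold, and both function names are assumptions made for illustration.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize each metric's history as (mean, sample standard deviation).

    Each metric needs at least two observations for stdev() to be defined.
    """
    return {name: (mean(values), stdev(values)) for name, values in history.items()}


def detect_changes(baseline, sample, threshold=3.0):
    """Return metric names whose current value departs from the baseline.

    A metric is flagged when it lies more than `threshold` standard
    deviations from its baseline mean, or when it was never seen before.
    """
    alerts = []
    for name, value in sample.items():
        if name not in baseline:
            alerts.append(name)  # unknown metric: treat as a profile change
            continue
        mu, sigma = baseline[name]
        if sigma == 0:
            if value != mu:
                alerts.append(name)
        elif abs(value - mu) / sigma > threshold:
            alerts.append(name)
    return alerts


if __name__ == "__main__":
    # Hypothetical metrics, chosen only to exercise the sketch.
    history = {"logins_per_min": [10, 12, 11, 9, 10], "cpu_percent": [40, 42, 41, 39, 40]}
    current = {"logins_per_min": 95, "cpu_percent": 41}
    print(detect_changes(build_baseline(history), current))
```

A real system would feed the returned names into the alerting, auto-configuration, or vendor-reporting paths listed on the slide.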

8 Unstructured Data: Mostly not stored in DB
- All data has some schema but not always fully known nor affordable to pre-declare:
  - Most data in unstructured stores with text search
  - DB community is losing
- Much research work on XML focused upon:
  - Mapping XML to relational schemas
    - Leverages existing relational IQ but not as flexible
  - New, non-relational (native XML) stores
    - Storing natively doesn’t leverage DB investment
  - Mostly mid-tier data integration servers
- Research potential:
  - Native stores leveraging existing infrastructure esp. cost-based optimizers, storage engines, & utilities
  - IR work progressing but little integration into DB
  - Integrating IR work into DB W/O required schema, ability to exploit if there, ability to discover/infer if not

9 Multi-tiered apps: we’re not helping
- Many high scale multi-tiered apps still hand crafted
  - Needed: Object access layer, data cache, queuing, query compiler & optimizer, data directed routing, security,...
  - Problem not adequately solved by industry
- Integration with server-tier DB advantages:
  - ACID relaxation driven by attributes on apps or data
    - Relaxed models with auto-cache population & mgmt
  - Query parsing for data directed routing
    - Want to parse once & accept same lang as backend
  - Exploit optimizer: model full mid-tier to back-end costs
    - Where to run joins, functions, aggs, etc.
  - Need security integration W/O fully provisioning backend
- Data intensive mid-tiers are a DB & TP problem:
  - Solve with DB tech & integrate with backend DB
  - Componentized DB for mid-tier use one approach
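Data directed routing, as named on the slide, means sending a request to the backend that owns the relevant data. A minimal sketch follows, assuming hash partitioning on a single key; the backend names, the key format, and the `route` function are hypothetical, and extracting the key from a parsed query is left out.

```python
import hashlib


def route(partition_key: str, backends: list[str]) -> str:
    """Pick the backend that owns `partition_key` using a stable hash.

    hashlib is used instead of the built-in hash() so the mapping is the
    same across processes and restarts.
    """
    if not backends:
        raise ValueError("at least one backend is required")
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    # Hypothetical backend names, for illustration only.
    backends = ["db-a", "db-b", "db-c"]
    for key in ("customer:42", "customer:43", "order:7"):
        print(key, "->", route(key, backends))
```

Simple modulo hashing remaps most keys when the backend list changes; a scheme that tolerates adding or removing nodes would need something like consistent hashing, which this sketch does not attempt.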

10 Scalability: perf not the problem
- Focus still on performance rather than scalability:
  - Clusters only “nearly” work
  - Must buy biggest iron & get most from it
- Research goal: Server appliances
  - Gray’s servers by the brick
    - Brick includes disk, memory, & CPU resources
  - Only admin actions required:
    - Add brick to, or defect from, cluster
  - Data redundancy (potentially) on geo-scale:
    - Adapts to access patterns & available bandwidth
- If zero-admin clusters actually worked & scaled:
  - Performance would be a secondary issue
  - The admin problem would nearly go away
  - The S/W quality problem greatly simplified
    - Heisenbugs solved via retry and redundancy
  - Would shift investment dollars from H/W & admin to S/W (where it belongs)
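The "retry and redundancy" remedy for Heisenbugs can be illustrated with a small sketch: retry a request on the same replica in case the failure was transient, then fail over to the next replica. The replica callables, the attempt count, and the function name are assumptions for illustration, not a design from the talk.

```python
def call_with_retry_and_redundancy(replicas, request, attempts_per_replica=2):
    """Try each replica in order, retrying transient failures before failing over.

    `replicas` is a sequence of callables that take the request and either
    return a result or raise an exception.
    """
    last_error = None
    for replica in replicas:
        for _ in range(attempts_per_replica):
            try:
                return replica(request)
            except Exception as exc:  # broad on purpose: any failure triggers a retry
                last_error = exc
    raise RuntimeError("all replicas failed") from last_error


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_replica(request):
        # Hypothetical replica that fails once, then succeeds.
        state["calls"] += 1
        if state["calls"] == 1:
            raise IOError("transient failure")
        return request.upper()

    print(call_with_retry_and_redundancy([flaky_replica], "ok"))
```

Retrying is only safe when the request is idempotent or the replicas deduplicate it; the sketch assumes that property rather than enforcing it.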