Key Challenges in Information Processing
James Hamilton
Microsoft SQL Server
2 Unsolved Challenges
  1. Availability shows only incremental progress
  2. Security broken & too hard to manage
  3. Weakly structured data poorly supported or exploited
  4. Writing multi-tiered apps too hard
     Data-intensive mid-tiers need more DB help
  5. Scalability over perf & big iron
3 Availability: Largely unsolved problem
  1985 Tandem study (Gray):
    Administration: 42% downtime
    Software: 25% downtime
    Hardware: 18% downtime
  1990 Tandem study (Gray):
    Software: 62%
    Administration: 15%
    Most studies have admin contribution much higher
  Observations:
    H/W downtime contribution trending to zero
    Software & admin costs dominate & growing
    We’re still looking at 10 to 15 year-old research
4 Availability: Cost in dollars/hour
  Brokerage operations           $6,450,000
  Credit card authorization      $2,600,000
  Ebay (1 outage 22 hours)         $225,000
  (unlabeled)                      $180,000
  Package shipping services        $150,000
  Home shopping channel            $113,000
  Catalog sales center              $90,000
  Airline reservation center        $89,000
  Cellular service activation       $41,000
  On-line network fees              $25,000
  ATM service fees                  $14,000
  From Dave Patterson talk at HPTS
  Sources: InternetWeek 4/3/; Fibre Channel: A Comprehensive Introduction, R. Kembel 2000, p.8. “... survey done by Contingency Planning Research.”
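To make the hourly figures concrete, the total cost of an outage is the hourly rate times its duration. The following is a minimal sketch, assuming the $225,000/hour figure is the one that applies to the Ebay row; the `outage_cost` helper and the `HOURLY_COST` table are illustrative names, not part of the talk.

```python
# Hourly downtime costs in dollars, copied from a few rows of this slide.
# Assumption: $225,000 is the hourly figure for the Ebay row.
HOURLY_COST = {
    "Brokerage operations": 6_450_000,
    "Credit card authorization": 2_600_000,
    "Ebay": 225_000,
    "ATM service fees": 14_000,
}


def outage_cost(service: str, hours: float) -> float:
    """Estimated cost in dollars of an outage lasting `hours` (hypothetical helper)."""
    return HOURLY_COST[service] * hours


ebay_total = outage_cost("Ebay", 22)
print(f"Ebay 22-hour outage: ${ebay_total:,.0f}")  # prints "Ebay 22-hour outage: $4,950,000"
```

Under that assumption, the single 22-hour outage cited on the slide comes to roughly $4.95 million.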
5 Availability: Admin still the problem
  Administrators expensive
    Admin costs dominate H/W & S/W costs (5x or more)
  Administrators make mistakes
    Admin #1 or #2 cause of downtime
  Big problem yet little research focus:
    Still few data points available:
      Most systems houses won’t publish... need research
    No benchmarks:
      Benchmarks drive industry & systems research
  Goal: Server appliance model:
    Auto-tuning, pluggable server-side resources
    IBM SMART, Microsoft index tuning wizard, etc.
    Dave Patterson, Aaron Brown, Armando Fox,...
    More help needed
6 Availability: the S/W is broken
  Even server-side software is BIG:
    Windows2000: over 50 mloc
    DB: 1.5+ mloc
    SAP: 37 mloc (4,200 S/W engineers)
  Tester to developer ratios above 1:1
  Quality per unit line only incrementally improving
  Current massive testing investment not solving problem
  New approach needed:
    Assume S/W failure inevitable
    Redundant, self-healing systems right approach
    Tandem process-pair work good but getting fairly old... progress?
7 Security: Securing systems too hard
  “Less than % of corp revenue invested in security” – Richard Clarke, Special security advisor to president
  Data loss, intentional data & systems corruption
    Clearly under-reported problem
  S/W vulnerabilities rampant:
    Buffer overruns, stack smashing, code insertion, SQL insertion, elevation of privs,...
    Programmers being more careful doesn’t solve problem
  Most systems misconfigured:
    Security systems too complex & hard to admin
  Research needed:
    Autonomous threat detection
    Better tools to detect, correct, & prevent S/W security vulnerabilities
    Monitor all measurable system metrics:
      Detecting new threats & misconfigurations
      Track execution profiles: detect changes: drive alerts, auto-config, reports to vendor, upgrade s/w,...
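The idea of tracking execution profiles and alerting on changes can be sketched as follows. This is only an illustration under stated assumptions: the metric names, the sample values, and the three-standard-deviation threshold are all hypothetical, and a simple z-score test stands in for whatever detection method a real system would use.

```python
from statistics import mean, stdev


def detect_changes(baseline, current, threshold=3.0):
    """Return the names of metrics whose current value deviates from the
    baseline mean by more than `threshold` sample standard deviations."""
    alerts = []
    for name, history in baseline.items():
        if name not in current or len(history) < 2:
            continue  # not enough data to build a profile for this metric
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            deviates = current[name] != mu
        else:
            deviates = abs(current[name] - mu) / sigma > threshold
        if deviates:
            alerts.append(name)
    return sorted(alerts)


# Hypothetical baseline profile and a current observation.
baseline = {
    "failed_logins_per_min": [2, 3, 1, 2, 4, 3],
    "cpu_percent": [35, 40, 38, 42, 37, 39],
}
current = {"failed_logins_per_min": 60, "cpu_percent": 41}

print(detect_changes(baseline, current))  # prints ['failed_logins_per_min']
```

The returned list is what would drive the alerts, auto-configuration, or vendor reports named on the slide.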
8 Unstructured Data: Mostly not stored in DB
  All data has some schema but not always fully known nor affordable to pre-declare:
    Most data in unstructured stores with text search
    DB community is losing
  Much research work on XML focused upon:
    Mapping XML to relational schemas
      Leverages existing relational IQ but not as flexible
    New, non-relational (native XML) stores
      Storing natively doesn’t leverage DB investment
    Mostly mid-tier data integration servers
  Research potential:
    Native stores leveraging existing infrastructure esp. cost-based optimizers, storage engines, & utilities
    IR work progressing but little integration into DB
    Integrating IR work into DB
    W/O required schema, ability to exploit if there, ability to discover/infer if not
9 Multi-tiered apps: we’re not helping
  Many high-scale multi-tiered apps still hand-crafted
    Needed: Object access layer, data cache, queuing, query compiler & optimizer, data-directed routing, security,...
    Problem not adequately solved by industry
  Integration with server-tier DB advantages:
    ACID relaxation driven by attributes on apps or data
      Relaxed models with auto-cache population & mgmt
    Query parsing for data-directed routing
      Want to parse once & accept same lang as backend
    Exploit optimizer: model full mid-tier to back-end costs
      Where to run joins, functions, aggs, etc.
    Need security integration W/O fully provisioning backend
  Data-intensive mid-tiers are a DB & TP problem:
    Solve with DB tech & integrate with backend DB
    Componentized DB for mid-tier use one approach
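Data-directed routing means the mid-tier inspects a request and sends it to the backend that owns the relevant data. The sketch below assumes hash partitioning on a single key, which is only one possible scheme; the backend names and the `route` function are hypothetical, and a real mid-tier would extract the key by parsing the query rather than receiving it directly.

```python
import zlib

# Hypothetical backend database servers, one per partition.
BACKENDS = ["db0", "db1", "db2"]


def route(partition_key: str, backends=BACKENDS) -> str:
    """Pick a backend from a stable hash of the partition key, so the same
    key is always sent to the same server."""
    digest = zlib.crc32(partition_key.encode("utf-8"))
    return backends[digest % len(backends)]


# The same key always routes to the same backend.
first = route("customer:42")
second = route("customer:42")
print(first == second)  # prints True
```

A stable hash such as CRC32 is used here instead of Python's built-in `hash`, which is randomized per process for strings and so would not route consistently across mid-tier servers.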
10 Scalability: perf not the problem
  Focus still on performance rather than scalability:
    Clusters only “nearly” work
    Must buy biggest iron & get most from it
  Research goal: Server appliances
    Gray’s servers by the brick
      Brick includes disk, memory, & CPU resources
    Only admin actions required:
      Add brick to, or defect from, cluster
    Data redundancy (potentially) on geo-scale:
      Adapts to access patterns & available bandwidth
  If zero-admin clusters actually worked & scaled:
    Performance would be a secondary issue
    The admin problem would nearly go away
    The S/W quality problem greatly simplified
      Heisenbugs solved via retry and redundancy
    Would shift investment dollars from H/W & admin to S/W (where it belongs)
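The claim that Heisenbugs can be handled by retry and redundancy can be illustrated with a small sketch: retry a transient failure on the same replica, then fail over to the next one. Everything here is an assumption made for illustration, including the `call_with_retry` and `make_flaky` names and the model of a replica as a plain callable.

```python
def call_with_retry(replicas, request, attempts_per_replica=2):
    """Try each replica in turn, retrying transient failures on a replica
    before failing over to the next one."""
    last_error = None
    for replica in replicas:
        for _ in range(attempts_per_replica):
            try:
                return replica(request)
            except Exception as exc:  # treat any failure as possibly transient
                last_error = exc
    raise RuntimeError("all replicas failed") from last_error


def make_flaky(failures):
    """Hypothetical replica that fails `failures` times, then succeeds."""
    state = {"remaining": failures}

    def replica(request):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConnectionError("transient fault")
        return f"ok:{request}"

    return replica


# One transient fault is masked by a retry on the same replica.
print(call_with_retry([make_flaky(1)], "q1"))  # prints "ok:q1"
# A replica that keeps failing is masked by failing over to a healthy one.
print(call_with_retry([make_flaky(5), make_flaky(0)], "q2"))  # prints "ok:q2"
```

Retry covers faults that do not recur on a second attempt, while redundancy covers a replica that stays down.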