Published by Lia Radcliff. Modified over 10 years ago.
1
Three Perspectives & Two Problems Shivnath Babu Duke University
2
Outline I want to highlight two problems / thoughts First some context
3
Three Perspectives: System Designers / Developers, Users of the System, Administrators. The Cloud era is ringing in interesting changes: increasingly overlapping roles; Joe Schmoe can now provision a 100-node Hadoop cluster in minutes; Administrators in traditional roles are getting laid off.
4
Three Perspectives: System Designers / Developers, Users of the System, Administrators. The Cloud era is ringing in interesting changes: software abstractions / packaging / release cycles have changed; more visibility into how users use the software.
5
Problem 1: Automated Experiment-driven System Management
6
Taking the (Next) Bite Out of System Administration: Cloud has automated some system administration tasks. Can we automate others? System tuning (configuration parameters, SQL queries, MapReduce jobs); detecting and repairing data corruption (disaster recovery); software / service testing.
7
Database Performance Tuning: 2-dim Projection of an 11-dim Surface
8
MapReduce Job Tuning in Hadoop: 2-dim Projection of a 13-dim Surface
9
Taking the (Next) Bite Out of System Administration: Cloud has automated some system administration tasks. Can we automate others? System tuning (configuration parameters, SQL queries, MapReduce jobs); detecting and repairing data corruption (disaster recovery); software / service testing.
10
Data Corruption: Stored data becomes different from what it is supposed to be. Causes: bugs in software / firmware; alpha particles, bit rot; human mistakes. Bad things have happened: data loss; system unavailability; incorrect results. (Diagram labels: Stored Data, Applications, File-System, Storage, Database)
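One common way to detect that stored data has become different from what it is supposed to be is to compare it against a checksum recorded while the data was known to be good. The following is only a minimal sketch of that idea, not the method used in the talk; the block ids, the in-memory `blocks` dictionary, and the helper names are all hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of a block of data."""
    return hashlib.sha256(data).hexdigest()


def find_corrupt_blocks(blocks, expected):
    """Return the sorted ids of blocks whose checksum differs from the recorded one."""
    return sorted(
        block_id
        for block_id, data in blocks.items()
        if checksum(data) != expected.get(block_id)
    )


# Record checksums while the data is known to be good (hypothetical blocks).
blocks = {"b0": b"alpha", "b1": b"bravo", "b2": b"charlie"}
expected = {block_id: checksum(data) for block_id, data in blocks.items()}

# Simulate bit rot: flip a single bit in the first byte of one block.
blocks["b1"] = bytes([blocks["b1"][0] ^ 0x01]) + blocks["b1"][1:]

corrupt = find_corrupt_blocks(blocks, expected)
print(corrupt)
```

A check like this only detects corruption; repairing it still requires a good copy of the data, such as a replica or a snapshot.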
11
Taking the (Next) Bite Out of System Administration: Cloud has automated some system administration tasks. Can we automate others? System tuning (configuration parameters, SQL queries, MapReduce jobs); detecting and repairing data corruption (disaster recovery); software / service testing.
12
Key Insight: Need to Run Experiments. System tuning: running workload under various system settings. Detecting data corruption: running integrity checks to verify data correctness. Software / service testing: running the tests. Challenge: Where / How / When to run experiments? (Diagram labels: Stored Data, Applications, File-System, Storage, Database)
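The system-tuning case above ("running workload under various system settings") can be sketched as an exhaustive sweep over a small grid of settings. This is an illustrative sketch only: `run_workload` is a hypothetical stand-in with a synthetic runtime formula, the parameter names `reduce_tasks` and `sort_buffer_mb` are made up for the example, and a real tuner would run the actual workload and would likely plan its experiments more cleverly than a full grid.

```python
import itertools


def run_workload(settings):
    """Hypothetical experiment: return a synthetic runtime (seconds) for the settings.

    In a real setup this would launch the workload on a test instance and time it.
    """
    reducers = settings["reduce_tasks"]
    buffer_mb = settings["sort_buffer_mb"]
    # Made-up cost model: too few reducers is slow, too many adds overhead.
    return 100.0 / reducers + 2.0 * reducers + abs(buffer_mb - 200) / 50.0


def tune(param_grid, experiment):
    """Run one experiment per combination of settings and keep the fastest."""
    best_settings, best_runtime = None, float("inf")
    names = sorted(param_grid)
    for values in itertools.product(*(param_grid[name] for name in names)):
        settings = dict(zip(names, values))
        runtime = experiment(settings)
        if runtime < best_runtime:
            best_settings, best_runtime = settings, runtime
    return best_settings, best_runtime


grid = {"reduce_tasks": [1, 2, 4, 8, 16], "sort_buffer_mb": [100, 200, 400]}
best_settings, best_runtime = tune(grid, run_workload)
print(best_settings, best_runtime)
```

A full grid needs one experiment per combination, which grows quickly with the number of parameters; that cost is one reason the question of where, how, and when to run experiments matters.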
13
Cloud is Part of the Answer: Take snapshots of production data at low overhead. Fire up production-like instances of the system. Pay-as-you-go, elasticity. Run the experiments. (Diagram labels: Production Data with Applications, File-System, Storage, Database; Data on system for doing experiments with Applications, File-System, Storage, Database)
14
Power of Experiments to the People Resources Declarative Language Plan optimized sequence of expts Conduct expts automatically Declarative benchmarking & tuning Protecting against data corruption
15
Problem 2: Data-Parallel Computing for the Masses
16
Challenges: Joe Schmoe can now provision a 100-node Hadoop cluster in minutes. Is that enough? Joe may need answers to: o How many reduce tasks to use in MapReduce job J to get the best performance on my 8-node production cluster? o My current cluster needs more than 6 hours to process 1 day's worth of data. I want to reduce that to under 3 hours. How many and what type of Amazon EC2 nodes should I use?
17
Performance Vs. Price Tradeoff
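The cluster-sizing question (going from more than 6 hours to under 3 hours) can be turned into a back-of-the-envelope calculation. The sketch below assumes perfectly linear scaling, which real MapReduce jobs rarely achieve; the baseline of 8 nodes, the node type names, their relative speeds, and the hourly prices are all hypothetical numbers chosen for illustration, not actual Amazon EC2 figures.

```python
import math


def cheapest_cluster(baseline_hours, baseline_nodes, deadline_hours, node_types):
    """Pick the cheapest (cost, type, node count, runtime) that finishes under the deadline.

    node_types maps a type name to (relative_speed, price_per_node_hour).
    Assumes perfectly linear scaling with node count and node speed.
    """
    options = []
    for name, (speed, price) in node_types.items():
        # Total work expressed in node-hours on this node type.
        work = baseline_hours * baseline_nodes / speed
        # Smallest node count whose runtime is strictly under the deadline.
        nodes = math.floor(work / deadline_hours) + 1
        runtime = work / nodes
        cost = nodes * runtime * price
        options.append((cost, name, nodes, runtime))
    return min(options)


# Hypothetical inputs: a 6-hour job on 8 baseline nodes, target under 3 hours.
node_types = {"small": (1.0, 0.10), "large": (2.5, 0.30)}
cost, name, nodes, runtime = cheapest_cluster(6.0, 8, 3.0, node_types)
print(name, nodes, round(runtime, 2), round(cost, 2))
```

Under linear scaling the cost depends only on price per unit of speed, so the tradeoff looks trivial here; with realistic non-linear scaling, the estimate would have to come from a performance model or from measurements.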
18
Spectrum Database Systems SQL Known data-access patterns Fixed set of operators Cost-based optimizers, What-if engines Grid Computing Python / R / Java Unknown data-access patterns Black-box functions Newer Data-Parallel Systems
19
Starfish: Self-Tuning Analytics on Big Data What-if Engine Workflow-level tuning Workflow-aware Optimizer/Scheduler Workload-level tuning Workload Optimizer Elastisizer Data Manager Metadata Mgr. Intermediate Data Mgr. Data Layout & Storage Mgr. Just-in-Time Optimizer Profiler Job-level tuning Sampler
20
MapReduce Job Tuning in Hadoop True Surface Estimated Surface
21
Summary Three perspectives: Developer, User, & Administrator Two problems: Automated Experiment-driven System Management Data-Parallel Computing for the Masses