Presented by Carl Erhard & Zahid Mian Authors: Herodotos Herodotou, Harold Lim, Fei Dong, Shivnath Babu Duke University.

Presented by Carl Erhard & Zahid Mian Authors: Herodotos Herodotou, Harold Lim, Fei Dong, Shivnath Babu Duke University

Analysis in the Big Data Era 9/26/20112 Massive Data Data Analysis Insight Key to Success = Timely and Cost-Effective Analysis Starfish

We want a MAD System 9/26/2011Starfish3 M agntetism Attracts or welcomes all sources of data, regardless of structure, values, etc. A gility Adaptive, remains in sync with rapid data evolution and modification D epth More than just your typical analytics, we need to support complex operations like statistical analysis and machine learning

No wait…I mean MADDER 9/26/2011Starfish4 D ata-lifecycle Do more than just queries, Awareness optimize the movement, storage, and processing of big data E lasticity Dynamically adjust resource usage and operational costs based on workload and user requirements R obustness Provide storage and querying services even in the event of some failures

Practitioners of Big Data Analytics Who are the users? Data analysts, statisticians, computational scientists… Researchers, developers, testers… Business Analysts… You! Who performs setup and tuning? The users! Usually lack expertise to tune the system 9/26/20115Starfish

Motivation 9/26/2011Starfish6

Tuning Challenges Heavy use of programming languages for MapReduce programs (e.g., Java/python) Data loaded/accessed as opaque files Large space of tuning choices (over 190 parameters!) Elasticity is wonderful, but hard to achieve (Hadoop has many useful mechanisms, but policies are lacking) Terabyte-scale data cycles 9/26/20117Starfish

Our goal: Provide good performance automatically Starfish: Self-tuning System 9/26/20118 MapReduce Execution Engine Distributed File System Hadoop Java / C++ / R / Python OozieHivePig Elastic MapReduce Jaql HBase Starfish Analytics System Starfish

What are the Tuning Problems? 9/26/20119 Job-level MapReduce configuration Workload management Data layout tuning Cluster sizing Workflow optimization J1J1 J2J2 J3J3 J4J4 Starfish

Starfishs Core Approach to Tuning 9/26/201110 1) if Δ(conf. parameters) then what …? 2) if Δ(data properties) then what …? 3) if Δ(cluster properties) then what …? Profiler Collects concise summaries of execution What-if Engine Estimates impact of hypothetical changes on execution Optimizers Search through space of tuning choices Job Workflow Workload Data layout Cluster Starfish

Starfish Architecture 9/26/201111Starfish

Job Level Tuning Just-in-Time Optimizer: Automatically selects efficient execution techniques for MapReduce jobs. Profiler: A Starfish component which is able to collect detailed summaries of jobs on a task-by-task basis. Sampler: Collects statistics about input, intermediate, and output data of a MapReduce job. 9/26/2011Starfish12

MapReduce Job Execution 9/26/201113 split 0 map out 0 reduce split 2 map split 1 map split 3 map Out 1 reduce job j = Starfish

What Controls MR Job Execution? Space of configuration choices: Number of map tasks Number of reduce tasks Partitioning of map outputs to reduce tasks Memory allocation to task-level buffers Whether output data from tasks should be compressed Whether combine function should be used 9/26/201114 job j = Starfish

Effect of Configuration Settings Use defaults or set manually (rules-of-thumb) Rules-of-thumb may not suffice 9/26/201115 Two-dimensional projection of a multi- dimensional surface (Word Co-occurrence MapReduce Program) Rules-of-thumb settings Starfish

MapReduce Job Tuning in a Nutshell Goal: Challenges: p is an arbitrary MapReduce program; c is high-dimensional; … 9/26/201116 Profiler What-if Engine Optimizer Runs p to collect a job profile (concise execution summary) of Given profile of, estimates virtual profile for Enumerates and searches through the optimization space S efficiently Starfish

Job Profile Concise representation of program execution as a job Records information at the level of task phases Generated by Profiler through measurement or by the What-if Engine through estimation 9/26/201117 Memory Buffer Merge Sort, [Combine], [Compress] Serialize, Partition map Merge split DFS SpillCollectMapRead Starfish

Job Profile Fields Dataflow: amount of data flowing through task phases Map output bytes Number of spills Number of records in buffer per spill 9/26/201118 Costs: execution times at the level of task phases Read phase time in the map task Map phase time in the map task Spill phase time in the map task Dataflow Statistics: statistical information about dataflow Width of input key-value pairs Map selectivity in terms of records Map output compression ratio Cost Statistics: statistical information about resource costs I/O cost for reading from local disk per byte CPU cost for executing the Mapper per record CPU cost for uncompressing the input per byte Starfish

Generating Profiles by Measurement Goals Have zero overhead when profiling is turned off Require no modifications to Hadoop Support unmodified MapReduce programs written in Java or Hadoop Streaming/Pipes (Python/Ruby/C++) Approach: Dynamic (on-demand) instrumentation Event-condition-action rules are specified (in Java) Leads to run-time instrumentation of Hadoop internals Monitors task phases of MapReduce job execution We currently use Btrace (Hadoop internals are in Java) 9/26/201119Starfish

Generating Profiles by Measurement 9/26/201120 split 0 map out 0 reduce split 1 map raw data map profile reduce profile job profile Use of Sampling Profile fewer tasks Execute fewer tasks JVM = Java Virtual Machine, ECA = Event-Condition-Action JVM Enable Profiling ECA rules Starfish

Results of Job Profiling 9/26/2011Starfish21

Results using Job Profiling 9/26/2011Starfish22

Workflow-Aware Scheduling Unbalanced Data Layout Skewed Data Data Layout Not Considered when SchedulingTasks Addition/Dropping PartitionsNo Rebalance Can Lead to Failures Due to Space Issues Locality-Aware Schedulers Can Make Problem Worse Possible Solutions: Change # of Replicas Collocating Data (Block Placement Policy) 9/26/2011Starfish23

Impact of Unbalanced Data Layout 9/26/2011Starfish24

Workflow-Aware Scheduling Makes Decisions by Considering Producer-Consumer Relationships 9/26/2011Starfish27 Nodes

Starfishs Workflow-Aware Scheduler Space of Choices: Block Placement Policy: Round Robin (Local Write is default) Replication Factor Size of blocks: general large for large files Compression: Impacts I/O; not always beneficial 9/26/2011Starfish28

Starfishs Workflow-Aware Scheduler What-If Questions A) Expected runtime of Job P if the RR block placement policy is used for Ps output files? B) New Data layout in the cluster if the RR block placement policy is used for Ps output files? C) Expected runtime of Job C1 (C2) if its input data layout is the one in the answer of Question (above)? D) Expected runtimes of Jobs C1 and C2 if scheduled concurrently when Job P completes? E) Given Local Write block policy and RF = 1 for Job Ps output, what is the expected increase in the runtime of Job C1 if one node in the cluster fails during C1s execution? 9/26/2011Starfish29

Estimates from the What-if Engine 9/26/201130 Hadoop cluster: 16 nodes, c1.medium MapReduce Program: Word Co-occurrence Data set: 10 GB Wikipedia True surfaceEstimated surface Starfish

Workflow Scheduler Picks Layout 9/26/2011Starfish31

Optimizations-Workload Optimizer 9/26/2011Starfish32

Provisioning--Elastisizer Motivation: Amazon Elastic MapReduce Data in S3, processed in-cluster, stored to S3 User Pays for Resources Used Elastisizer Determines … Best cluster Hadoop configurations … Based on user-specified goals (execution time and costs) 9/26/2011Starfish33

Elastisizer Configuration Evaluation 9/26/2011Starfish34

Elastisizer Configuration Evaluation 9/26/2011Starfish35

Elastisizer- Cluster Configurations 9/26/2011Starfish36

Multi-objective Cluster Provisioning 9/26/201137 Instance Type for Source Cluster: m1.large Starfish

Critique of Paper Good Source Available for Implementation Able to See the impact of various settings Good Visualization Tools Tutorials/Source available at duke.edu/starfish Bad The paper (and subsequent materials) talk a lot about what Starfish does, but not necessarily how it does it There is no documentation on LastWord, and this seems important Only works after a the job/workflow has been executed at least once 9/26/2011Starfish38

Starfishs Visualizer Timeline Views Shows progress of a job execution at the task level See execution of same job with different settings Data-flow Views View of flow of data among nodes, along with MR jobs Video Mode allows playback execution from past Profile Views Timings, data-flow, resource-level 9/26/2011Starfish39

Timeline Views 9/26/2011Starfish40

Timeline View 9/26/2011Starfish41

Data Skew View 9/26/2011Starfish42

Data-flow Views 9/26/2011Starfish45

References Herodotou, Herodotos, et al. "Starfish: A self-tuning system for big data analytics." Proc. of the Fifth CIDR Conf. 2011. Dong, Fei. Extending Starfish to Support the Growing Hadoop Ecosystem. Diss. Duke University, 2012. Herodotou, Herodotos, Fei Dong, and Shivnath Babu. "MapReduce programming and cost-based optimization? Crossing this chasm with Starfish." Proceedings of the VLDB Endowment 4.12 (2011). http://www.cs.duke.edu/starfish/ http://www.youtube.com/watch?v=Upxe2dzE1uk 9/26/2011Starfish46

Backup 9/26/2011Starfish47

Hadoop MapReduce Ecosystem Popular solution to Big Data Analytics 9/26/201148 MapReduce Execution Engine Distributed File System Hadoop Java / C++ / R / Python OozieHivePig Elastic MapReduce Jaql HBase Starfish

Workflow-level Tuning Starfish has a Workflow-aware Scheduler which addresses several concerns: How do we equally distribute computation across nodes? How do we adapt to imbalance of a load or energy cost? The Workflow-aware Scheduler works with the What- if Engine and the Data Manager to answer these questions 9/26/2011Starfish49

Workload-level Tuning Starfishs Workload Optimizer is aware of each workflow that will be executed. It reorders the workflows in order to make them more efficient. This includes reusing data for different workflows that use the same MapReduce jobs. 9/26/2011Starfish50

What-if Engine Job Oracle Virtual Job Profile for What-if Engine 9/26/201151 Task Scheduler Simulator Job Profile Properties of Hypothetical job Input Data Properties Cluster Resources Configuration Settings Possibly Hypothetical Starfish

Virtual Profile Estimation 9/26/201152 Given profile for job j = estimate profile for job j' = (Virtual) Profile for j' Dataflow Statistics Dataflow Cost Statistics Costs Profile for j Input Data d 2 Confi- guration c 2 Resources r 2 Costs White-box Models Cost Statistics Relative Black-box Models Dataflow White-box Models Dataflow Statistics Cardinality Models Starfish

Job Optimizer 9/26/201153 Best Configuration Settings for Subspace Enumeration Recursive Random Search Just-in-Time Optimizer Job Profile Input Data Properties Cluster Resources What-if calls Starfish

Presented by Carl Erhard & Zahid Mian Authors: Herodotos Herodotou, Harold Lim, Fei Dong, Shivnath Babu Duke University.

Similar presentations

Presentation on theme: "Presented by Carl Erhard & Zahid Mian Authors: Herodotos Herodotou, Harold Lim, Fei Dong, Shivnath Babu Duke University."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Presented by Carl Erhard & Zahid Mian Authors: Herodotos Herodotou, Harold Lim, Fei Dong, Shivnath Babu Duke University.

Similar presentations

Presentation on theme: "Presented by Carl Erhard & Zahid Mian Authors: Herodotos Herodotou, Harold Lim, Fei Dong, Shivnath Babu Duke University."— Presentation transcript:

Similar presentations

About project

Feedback