Low Latency Geo-distributed Data Analytics Qifan Pu, Ganesh Ananthanarayanan, Peter Bodik, Srikanth Kandula, Aditya Akella, Paramvir Bahl, Ion Stoica.

Presentation transcript:

Geo-distributed Data Analytics
Data such as performance counters, user activities, … is generated at sites (Seattle, Berkeley, Beijing, London) connected by a WAN.
The “centralized” data analytics paradigm is slow and wasteful.

A single logical analytics cluster across all sites (Seattle, Berkeley, Beijing, London), connected by the WAN.

A single logical analytics system across all sites (Seattle, Berkeley, Beijing, London).
Incorporating WAN bandwidths is key to geo-distributed analytics performance.

Incorporating WAN bandwidths
- Task placement: decides the destinations of network transfers
- Data placement: decides the sources of network transfers

Example Analytics Job
SELECT time_window, percentile(latency, 99) GROUP BY time_window
[Figure: Seattle and London connected by the WAN. Labels, in slide order: Seattle 40GB, 20GB; London 40GB; 800 MB/s; 200 MB/s.]

Calculating Transfer Time
[Table for Seattle and London with rows: Input Data (GB), Task Fractions, Upload Time (s), Download Time (s). Surviving values: 40, 12.5s, 50s, 2.5s; a 2.5x improvement is highlighted.]
How to solve the general case, with more sites, BW heterogeneity and data skew?
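The per-link transfer times behind a table like this can be computed with a short sketch. It assumes a simple model that the slide does not spell out: a site holding S of the data and running a fraction r of the tasks uploads (1 - r) * S and downloads r times the data held at all other sites. The function names, data sizes and bandwidths below are hypothetical and are not the slide's table values.

```python
def transfer_times(data_mb, up_mbps, down_mbps, fractions):
    """Return (upload_s, download_s) per site for a given task split.

    Assumed model: a site running fraction r of the tasks keeps r of its own
    data local, uploads the rest, and downloads r of every other site's data.
    """
    total = sum(data_mb)
    times = []
    for size, up, down, r in zip(data_mb, up_mbps, down_mbps, fractions):
        upload = (1 - r) * size / up            # data leaving this site
        download = r * (total - size) / down    # data pulled from other sites
        times.append((upload, download))
    return times


def bottleneck(times):
    """The transfer phase ends when the slowest link finishes."""
    return max(max(pair) for pair in times)


# Hypothetical two-site setup, ordered [Seattle, London].
data_mb = [40_000, 20_000]
up_mbps = [800, 200]
down_mbps = [800, 200]

even = bottleneck(transfer_times(data_mb, up_mbps, down_mbps, [0.5, 0.5]))
skewed = bottleneck(transfer_times(data_mb, up_mbps, down_mbps, [2 / 3, 1 / 3]))
print(f"even split: {even:.1f}s, skewed split: {skewed:.1f}s")
```

In this made-up setup, placing more tasks at the site with the faster links shortens the slowest transfer, which is the effect the slide's comparison illustrates.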

Task Placement (TP Solver)
Inputs: sites (M), tasks (N), data matrix (MxN), upload BWs, download BWs
Output: a task-to-site assignment, e.g. Task 1 -> London, Task 2 -> Beijing, Task 5 -> London, …
Optimization goal: minimize the longest transfer of all links
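The optimization goal can be illustrated with a small solver. This is a simplified sketch, not the talk's TP Solver: it works on task fractions per site rather than an MxN data matrix, and it assumes that a site with data S and task fraction r uploads (1 - r) * S and downloads r times the data held elsewhere. Under that assumption each deadline z turns into a lower and an upper bound on every fraction, so the smallest feasible z can be found by bisection. The function name `solve_task_fractions` and the inputs are hypothetical.

```python
def solve_task_fractions(data_mb, up_mbps, down_mbps, tol=1e-6):
    """Return (fractions, z): a task split minimizing the slowest link time z."""
    total = sum(data_mb)

    def bounds(z):
        # Upload finishes within z   ->  r_i >= 1 - z * U_i / S_i
        lower = [max(0.0, 1 - z * u / s) if s > 0 else 0.0
                 for s, u in zip(data_mb, up_mbps)]
        # Download finishes within z ->  r_i <= z * D_i / (total - S_i)
        upper = [min(1.0, z * d / (total - s)) if total > s else 1.0
                 for s, d in zip(data_mb, down_mbps)]
        return lower, upper

    def feasible(z):
        lower, upper = bounds(z)
        return (all(lo <= hi for lo, hi in zip(lower, upper))
                and sum(lower) <= 1.0 <= sum(upper))

    # Upper end of the search: the slowest "move everything" transfer.
    z_lo = 0.0
    z_hi = max(max(s / u for s, u in zip(data_mb, up_mbps)),
               max((total - s) / d for s, d in zip(data_mb, down_mbps)))
    while z_hi - z_lo > tol:
        mid = (z_lo + z_hi) / 2
        if feasible(mid):
            z_hi = mid
        else:
            z_lo = mid

    # Start every site at its lower bound, then hand out the remaining share.
    lower, upper = bounds(z_hi)
    fractions = list(lower)
    slack = 1.0 - sum(lower)
    for i in range(len(fractions)):
        extra = min(slack, upper[i] - lower[i])
        fractions[i] += extra
        slack -= extra
    return fractions, z_hi


# Hypothetical two-site input, ordered [Seattle, London].
fractions, z = solve_task_fractions([40_000, 20_000], [800, 200], [800, 200])
print([round(f, 3) for f in fractions], round(z, 2))
```

Bisection is used here only to keep the sketch free of external dependencies; the same min-max objective can also be written as a linear program and handed to an LP solver.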

Another example
[Table for Seattle and London with rows: Input Data (GB), Task Fractions, Upload Time (s), Download Time (s), plus the query lag. Surviving values: 100GB, 40GB, 160GB, 50s, 6.25s, 6s; a 2x improvement is highlighted.]
How to jointly optimize data and task placement?

Iridium
- Goal: improve query response time
- Approach: jointly optimize data and task placement with a greedy heuristic
- Constraints: bandwidth, query arrivals, etc.

Iridium with Single Dataset
Iterative heuristic for joint task-data placement:
1. Identify bottlenecks by solving task placement (TP Solver).
2. Assess: find the amount of data to move to alleviate the current bottleneck.
Repeat until the query arrives.
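The loop of solving task placement, then moving data away from the bottleneck, can be sketched as a greedy search. This is an illustrative simplification and not Iridium's actual heuristic: it tries every fixed-size move between sites and keeps the one that most reduces the best achievable slowest-link time, it ignores the time and bandwidth the move itself consumes, and a fixed number of rounds stands in for "until the query arrives". The transfer model (a site with data S and task fraction r uploads (1 - r) * S and downloads r times the data held elsewhere) and all names and numbers are assumptions.

```python
def best_bottleneck(data, up, down, tol=1e-6):
    """Smallest slowest-link time over all task splits, found by bisection."""
    total = sum(data)

    def feasible(z):
        lower = [max(0.0, 1 - z * u / s) if s > 0 else 0.0
                 for s, u in zip(data, up)]
        upper = [min(1.0, z * d / (total - s)) if total > s else 1.0
                 for s, d in zip(data, down)]
        return (all(lo <= hi for lo, hi in zip(lower, upper))
                and sum(lower) <= 1.0 <= sum(upper))

    z_lo = 0.0
    z_hi = max(max(s / u for s, u in zip(data, up)),
               max((total - s) / d for s, d in zip(data, down)))
    while z_hi - z_lo > tol:
        mid = (z_lo + z_hi) / 2
        if feasible(mid):
            z_hi = mid
        else:
            z_lo = mid
    return z_hi


def greedy_data_moves(data, up, down, chunk, rounds):
    """Greedily move `chunk` units of data per round while it helps."""
    data = list(data)
    current = best_bottleneck(data, up, down)
    for _ in range(rounds):  # stand-in for "repeat until the query arrives"
        best = None
        for src in range(len(data)):
            if data[src] < chunk:
                continue
            for dst in range(len(data)):
                if dst == src:
                    continue
                trial = list(data)
                trial[src] -= chunk
                trial[dst] += chunk
                t = best_bottleneck(trial, up, down)
                if t < current - 1e-3 and (best is None or t < best[0]):
                    best = (t, src, dst)
        if best is None:  # no move shortens the bottleneck any further
            break
        current, src, dst = best
        data[src] -= chunk
        data[dst] += chunk
    return data, current


# Hypothetical two-site input, ordered [Seattle, London].
placement, t = greedy_data_moves([40_000, 20_000], [800, 200], [800, 200],
                                 chunk=10_000, rounds=1)
print(placement, round(t, 2))
```

With these made-up numbers, one round moves data off the site with the slow links, which shortens the transfer phase of the query that follows.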

Iridium with Multiple Datasets
Prioritize high-value datasets: score = value x urgency / cost
- value = sum(timeReduction) for all queries
- urgency = 1/avg(query_lag)
- cost = amount of data moved
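The scoring rule can be written out directly. The sketch below follows the formula on the slide; the `Dataset` fields, the units and the sample numbers are hypothetical, and it assumes every dataset has at least one observed query lag and a non-zero move cost.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    name: str
    time_reductions: List[float]  # seconds saved per query that reads the dataset
    query_lags: List[float]       # seconds between data arrival and each query
    move_cost_gb: float           # amount of data that would be moved


def score(ds: Dataset) -> float:
    value = sum(ds.time_reductions)                         # value = sum(timeReduction)
    urgency = 1.0 / (sum(ds.query_lags) / len(ds.query_lags))  # 1 / avg(query_lag)
    return value * urgency / ds.move_cost_gb                # value x urgency / cost


# Made-up datasets for illustration.
datasets = [
    Dataset("A", time_reductions=[10, 20], query_lags=[100, 300], move_cost_gb=5),
    Dataset("B", time_reductions=[50], query_lags=[50], move_cost_gb=10),
]
ranked = sorted(datasets, key=score, reverse=True)
print([ds.name for ds in ranked])
```

Here dataset B is moved first: it costs more to move than A, but its query arrives sooner and saves more time, so its score is higher.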

Iridium: putting it together
Placement of data (before query arrival):
- prioritize the move of high-value datasets
Placement of tasks (during query execution):
- constrained solver (TP Solver)
Not covered in this talk: estimation of query arrivals, contention between data moves and queries, etc.

Evaluation
Spark and HDFS
- Override Spark’s task scheduler with ours
- Data placement creates copies in cross-site HDFS
Geo-distributed EC2 deployment across 8 regions
- Tokyo, Singapore, Sydney, Frankfurt, Ireland, Sao Paulo, Virginia (US) and California (US)

Spark jobs, SQL queries and streaming queries
- Conviva: video session parameters
- Bing Edge: running dashboard, streaming
- TPC-DS: decision support queries for retail
- AMP BDB: mix of Hive and Spark queries
Baselines:
- “In-place”: leave data unmoved + Spark’s scheduling
- “Centralized”: aggregate all data onto one site
How well does Iridium perform?

Iridium outperforms the baselines: 4x-19x vs. In-place, 3x-4x vs. Centralized
[Figure: Reduction (%) in Query Response Time for Conviva, Bing-Edge, TPC-DS and Big-Data, vs. In-place and vs. Centralized. Bar labels: 10x, 19x, 7x, 4x; 3x, 4x, 3x.]

Iridium subsumes both baselines!
Median Reduction (%)    Vs. Centralized    Vs. In-place
Task placement          18%                24%
Data placement          38%                30%
Iridium (both)          75%                63%
- vs. Centralized: data placement has higher contribution
- vs. In-place: equal contributions from the two techniques

Bandwidth Cost
- MinBW: a scheme that minimizes bandwidth usage, to Bmin
- Iridium: budgets the bandwidth usage to be m*Bmin
[Figure: Reduction (%) in WAN Usage, with points for budgets of 1xBmin, 1.3xBmin and 1.5xBmin; one point is labeled (64%, 19%).]
Iridium can speed up queries while using near-optimal bandwidth cost.

Related work
JetStream (NSDI’14)
- Data aggregation and adaptive filtering
- Does not support arbitrary queries, nor does it optimize task and data placement
WANalytics (CIDR’15), Geode (NSDI’15)
- Optimize BW usage for SQL & general DAG jobs
- Can lead to poor query response time

Low Latency Geo-distributed Data Analytics
Data is geographically distributed
- Services with global footprints
- Analyze logs across DCs, e.g. “99 percentile movie rating”, “Median Skype call setup latency”
Abstraction: single logical analytics cluster across all sites (Seattle, Berkeley, Beijing, London, connected by the WAN)
- Incorporating WAN bandwidths
- Reduce response time over baselines by 3x-19x