Performance Assurance for Large Scale Big Data Systems


1 Performance Assurance for Large Scale Big Data Systems
Rekha Singhal, TCS Research, Mumbai, India.

2 Flow
Define Performance & Large Scale System
Motivation for Performance Assurance
Challenges in Assuring Performance
Solutions for Assuring Performance
Future directions

3 What is Performance
Throughput: number of jobs completed per unit time (#Jobs per second)
Latency: time elapsed between job submit and job output
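The two metrics on this slide can be made concrete with a small calculation. The following is a minimal sketch, assuming each job is described by a submit timestamp and an output timestamp; the `Job` class, the helper names, and the three sample jobs are hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    submit_s: float  # time the job was submitted, in seconds
    output_s: float  # time the job produced its output, in seconds


def latency(job: Job) -> float:
    """Latency of one job: elapsed time from submit to output."""
    return job.output_s - job.submit_s


def throughput(jobs: List[Job]) -> float:
    """Jobs completed per second over the whole observation window."""
    window_s = max(j.output_s for j in jobs) - min(j.submit_s for j in jobs)
    return len(jobs) / window_s


# Hypothetical sample: three jobs observed over a 10-second window.
jobs = [Job(0.0, 4.0), Job(1.0, 6.0), Job(2.0, 10.0)]
latencies = [latency(j) for j in jobs]
mean_latency_s = sum(latencies) / len(latencies)
jobs_per_second = throughput(jobs)
```

Note that the two metrics are independent: adding parallel capacity can raise throughput without shortening the latency of any single job.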

4 What is Large Scale?
Machine/human data sources
Distributed systems on WAN
Data size
Workload size
Cluster size
System capacity
  Computing, data size, underlying infrastructure, workload
Many input streams: concurrent loads on the system
Large number of distributed nodes

5 Given: underlying system, application growth
Ensure: a job finishes within the guaranteed latency.

6 Why Performance Assurance
Faster growth in analytic applications' data
Rapid increase in workload
Newer developments in parallel data processing platforms
Migration to new platform
On-line analytics
Maximal utilization of system

7 Lack of Performance Assurance?
Faster growth in applications' data
Rapid increase in workload
Newer developments in parallel data processing platforms
Migration to new platform
Maximal utilization of system
Loss of users: users cannot tolerate delays in their job execution
Heavy cost to application owners for SLA violations

8 When to Assure Performance
In Development Environment
  Testing stage
  Design phase
  Capacity planning phase
In Production Environment
  Alerts before performance violations
  Proactive tuning of application or system

9 How to Assure Performance
Performance Profiling & Tuning in Production
  Reactive; cost & downtime
Volume Testing
  Proactive; deployment time & cost
Performance Prediction Models

10 Volume Testing
Generate large data sizes
Emulate large number of users
What we have:
  Efficient data generation for RDBMS
  Parallel data generator on HDFS
Reference: "Efficient Synthetic Data Generator for structured Data", Chetan Phalak, Rekha Singhal, CMG USA, San Diego, November 2016.

11 Why NOT Volume Testing
Generate large data sizes
Emulate large number of users
What we have:
  Efficient data generation for RDBMS
  Parallel data generator on HDFS
Requires LARGE resources and time

12 Performance Extrapolation Models
Eliminate volume testing for performance assurance
Prospective capacity planning
SQL query tuning and scheduling
Reduce application benchmarking cycle time *
* Reducing Structure Big Data Benchmark Cycle time using Query Performance Prediction Model, IEEE International Conference on Computing, Communication Systems (ICCCS) 2015, Mauritius, December 2015.

13 Performance Violation – Increase in Data Size

14 Challenges
Limited availability of the production system
Unavailability of a DB system at the projected data size
Estimation of complex query output size
Transparency to the underlying hardware subsystem
Transparency to the underlying data management server: DB server (Oracle, Postgres), big data architectures

15 Approach: Measurement-based Black Box
Emulate query processing on the underlying system on large data size
  Use optimizer cost
Get complex query processing steps in the form of elementary operators and data access steps.
Identify elementary operators in the system and build a prediction model for each of them as a function of data size
  RDBMS: sort, hash join
  MR: map, reduce, shuffle
  Hive: map join, reduce join
Identify different modes of data access and build models for each of them
  Index, full table
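The per-operator modelling idea can be sketched in a few lines. This is an illustrative sketch only: it assumes each operator's time grows linearly with its input size, which the slides do not state, and every number in `measurements` is a made-up placeholder rather than a result from the cited papers. The operator names and the `plan` structure are likewise hypothetical.

```python
from typing import Dict, List, Tuple


def fit_line(sizes_gb: List[float], times_s: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of time = slope * size + intercept."""
    n = len(sizes_gb)
    mean_x = sum(sizes_gb) / n
    mean_y = sum(times_s) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_gb)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes_gb, times_s))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# HYPOTHETICAL small-scale measurements: (data sizes in GB, elapsed seconds).
measurements: Dict[str, Tuple[List[float], List[float]]] = {
    "full_table_access": ([1.0, 2.0, 4.0], [2.0, 4.0, 8.0]),
    "hash_join": ([1.0, 2.0, 4.0], [3.5, 6.5, 12.5]),
}

# One model per elementary operator or data access mode.
models = {op: fit_line(xs, ys) for op, (xs, ys) in measurements.items()}


def predict_step(op: str, size_gb: float) -> float:
    """Extrapolate one step of the query plan to the given input size."""
    slope, intercept = models[op]
    return slope * size_gb + intercept


def predict_query(plan: List[Tuple[str, float]]) -> float:
    """Sum the predictions over the steps of a query plan."""
    return sum(predict_step(op, size_gb) for op, size_gb in plan)


# Hypothetical plan at the projected size: each step with its own input size.
plan = [("full_table_access", 100.0), ("hash_join", 100.0)]
predicted_s = predict_query(plan)
```

In practice each step's input size depends on the output size of the steps before it, which is why estimating complex query output size appears among the challenges.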

16 Query Processing Steps
SQL: Select, Hash Join; Table T1: Index Access; Table T2: Full Access
HiveQL: Stage 1: Fetch; Stage 3: MR; Stage 2: Map

17 Performance Prediction Models: What We Have
Predict SQL query execution time for large data volume on RDBMS
  R. Singhal and Manoj Nambiar, "Predicting SQL Query Execution Time for Large Data Volume", in Proceedings of IDEAS, Montreal, Canada, July 2016.
  Chetan Phalak, Rekha Singhal and Tanmay Jhunjhunwala, "Database Buffer Cache Simulator to Study and Predict Cache Behavior for Query Execution", in Proceedings of DATA, Portugal, July 2016.
  Rekha Singhal and Manoj Nambiar, "Measurement based model to study the affect of increase in data size on query response time", Performance and Capacity CMG 2013, La Jolla, California, November 2013.
  Rekha Singhal and Manoj Nambiar, "Extrapolation of SQL Query Elapsed Response Time at Application Development Stage", in Proceedings of INDICON 2012, Kochi, India, December 2012.
Predict HiveQL job execution time for large data volume and cluster size in a homogeneous environment
  A. Sangroya and R. Singhal, "Performance Assurance Model for HiveQL on Large Data Volume", in Proceedings of the International Workshop on Foundations of Big Data Computing, in conjunction with the 22nd IEEE International Conference on High Performance Computing, HiPC '15, December 2015.
Predict HiveQL job execution time for large data volume and cluster size in a heterogeneous environment using an MR simulator
  R. Singhal and Abhishek Verma, "Predicting Job Completion Time in Heterogeneous MapReduce Environments", in Proceedings of IPDPS: Heterogeneous Computing Workshop, IPDPS, May 2016.

18 Use Cases
Performance Assurance
Auto Tuning
  Performance cost model as a function of data processing engine performance parameters such as #map slots, size of map memory, etc.
Bottleneck Analysis
  Job log parsers; correlation with system utilization logs.
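As an illustration of a cost model expressed in engine parameters, the sketch below uses a simple "wave" approximation for the map phase: tasks run in rounds of at most `map_slots` tasks each. This approximation is an assumption chosen for the example and is not the model from the presentation; the function names, the 128 MB split size, and the 30-second task time are hypothetical values.

```python
import math
from typing import List, Optional


def map_phase_time_s(input_mb: float, split_mb: float,
                     map_slots: int, avg_map_task_s: float) -> float:
    """Estimate map-phase time as (number of waves) * (average task time)."""
    num_tasks = math.ceil(input_mb / split_mb)  # one map task per input split
    waves = math.ceil(num_tasks / map_slots)    # rounds needed to run all tasks
    return waves * avg_map_task_s


def smallest_slots_meeting(target_s: float, candidates: List[int],
                           **job) -> Optional[int]:
    """Auto-tuning step: smallest slot count whose estimate meets the target."""
    for slots in sorted(candidates):
        if map_phase_time_s(map_slots=slots, **job) <= target_s:
            return slots
    return None  # no candidate configuration meets the latency target


# Hypothetical job: 64 GB of input, 128 MB splits, 30 s per map task.
job = dict(input_mb=65536, split_mb=128, avg_map_task_s=30.0)
time_with_64_slots = map_phase_time_s(map_slots=64, **job)
chosen_slots = smallest_slots_meeting(300.0, [16, 32, 64, 128], **job)
```

A fuller model would cover the other parameters the slide mentions, such as map memory size, and the shuffle and reduce phases.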

19 Prediction Model for RDBMS

20 Prediction Accuracy for Hive+Hadoop

21 MR Simulator
Application
Prediction Environment: Machine A, Machine B, MR Framework, Large Data
Measurement Environment: Machine A, Machine B, MR Framework, Small Data
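The measure-small, predict-large idea can be illustrated with a toy simulator. The sketch below is not the simulator described in the cited work: it assumes a greedy scheduler that hands each map task to the earliest-free slot, a per-task time measured on small data, and a per-slot relative speed to represent heterogeneous machines. All names and numbers are hypothetical.

```python
import heapq
from typing import List


def simulate_map_phase(num_tasks: int, base_task_s: float,
                       slot_speeds: List[float]) -> float:
    """Return the simulated completion time of num_tasks equal-sized tasks.

    slot_speeds[i] is the relative speed of slot i; a task takes
    base_task_s / slot_speeds[i] seconds on that slot.
    """
    # Min-heap of (time at which the slot becomes free, slot index).
    free_at = [(0.0, i) for i in range(len(slot_speeds))]
    heapq.heapify(free_at)
    makespan = 0.0
    for _ in range(num_tasks):
        start, slot = heapq.heappop(free_at)  # earliest-free slot
        finish = start + base_task_s / slot_speeds[slot]
        makespan = max(makespan, finish)
        heapq.heappush(free_at, (finish, slot))
    return makespan


# Hypothetical cluster: Machine A has two slots at speed 1.0,
# Machine B has two slots that are twice as fast.
slot_speeds = [1.0, 1.0, 2.0, 2.0]
base_task_s = 10.0  # per-task time measured on small data (assumed value)

small_data_s = simulate_map_phase(6, base_task_s, slot_speeds)
large_data_s = simulate_map_phase(600, base_task_s, slot_speeds)
```

Only the task count changes between the two runs, so the per-task profile collected in the measurement environment is reused unchanged in the prediction environment.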

22 MR Simulator design

23 Prediction Accuracy using MR Simulator

24 Topics Not Covered
Performance extrapolation across different deployments
  CPU cores, RAM size, storage subsystem
Performance extrapolation for larger concurrent workload
Performance extrapolation for different mixes of concurrent SQL query workloads

25 Conclusions
Motivation for performance models in large scale systems
Volume testing for performance assurance
Performance prediction models for large scale systems: RDBMS, Hive+Hadoop
Extension of these models for auto tuning and capacity sizing


