Detecting Transient Bottlenecks in n-Tier Applications through Fine-Grained Analysis
Qingyang Wang
Advisor: Calton Pu


Response Time is Important
- Response time is an important performance factor for Quality of Service (e.g., SLAs for web-facing e-commerce applications).
- Experiments at Amazon show that every 100ms increase in page load time decreases sales by 1%.
- Akamai reported that 40% of users expect a website to load in 2 seconds or less.
Source: [K. Ron et al., IEEE Computer 2010]
April 16, 2013, CERCS Industry Advisory Board (IAB) meeting

Transient Bottlenecks in n-Tier Web Applications
- Transient bottlenecks may cause wide-ranging end-to-end response time fluctuations and lead to severe SLA violations.
- Traditional monitoring tools may not detect transient bottlenecks because of their coarse granularity (e.g., one second).
- We will show a motivational experiment of this phenomenon.
- The goal of this research is to propose a novel transient bottleneck detection method.

Outline
- Background & Motivation
  - Background
  - Motivational experiment
- Method for Detecting Transient Bottlenecks
  - Trace monitoring tool
  - Fine-grained load/throughput analysis
- Two Case Studies
  - Intel SpeedStep
  - JVM garbage collection
- Conclusion & Future Work

Experimental Setup (1): Benchmark Application
- RUBBoS benchmark
  - Bulletin board system like Slashdot
  - Typical 3-tier or 4-tier architecture
  - Two types of workload
    - Browsing only (CPU intensive)
    - Read/Write mix
  - 24 web interactions

Experimental Setup (2): Software Configurations
Software Stack
Hypervisor: VMware ESXi v5.0
Guest OS: RHEL Server 6.2 (64-bit, kernel )
Web Server: Apache-httpd
Application Server: Apache-Tomcat
Cluster middleware: C-JDBC
Database Server: MySQL a-Linux-i686-glibc23
Sun JDK: Jdk1.5.0_07, jdk 1.6.0_14
System monitor: Sysstat , esxtop 5.0

Experimental Setup (3): Hardware and VM Configurations
ESXi Host Configuration
Model: Dell PowerEdge T410
CPU: Quad-core Xeon 2.27GHz * 2 CPU
Memory: 16GB
Storage: 7200rpm SATA local disk
VM Configuration
Type | # vCPU | CPU limit | CPU shares | vRAM | vDisk
Large (L) | 2 | 4.52GHz | Normal | 2GB | 20GB
Small (S) | 1 | 2.26GHz | Normal | 2GB | 20GB

Experimental Setup (4): System Topology
[Figure: sample topology (1/2/1/2)]

Motivational Example
- Response time and throughput of a 10-minute benchmark on the 4-tier application with increasing workloads.
- How does the system actually behave at workload 8,000?

Motivational Example
[Figure: response time distribution at workload 8,000; percentage of requests over two seconds]

Motivational Example
- Average resource utilization is far from full saturation when the system is at WL 8,000.
Server/Resource | CPU util. (%) | Disk I/O (%) | Network receive/send (MB/s)
Apache | | | /24.1
Tomcat | | | /6.5
CJDBC | | | /7.9
MySQL | | | /2.8

Motivational Example
[Figure: timeline graphs of Tomcat/MySQL CPU utilization (every second) at WL 8,000]
- Traditional monitoring tools (e.g., sar) cannot detect the performance bottleneck because of their coarse granularity.
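To make the granularity argument concrete, here is a minimal Python sketch with made-up utilization samples (the numbers are illustrative assumptions, not measurements from the experiment). It averages hypothetical 50ms CPU utilization samples into one-second readings, the way a coarse monitor would report them, and shows that a 200ms burst at full utilization disappears into a moderate average.

```python
def coarse_average(fine_samples, group_size):
    """Average consecutive fine-grained samples into coarse readings."""
    coarse = []
    for i in range(0, len(fine_samples), group_size):
        group = fine_samples[i:i + group_size]
        coarse.append(sum(group) / len(group))
    return coarse


# Hypothetical utilization (%) sampled every 50ms over one second:
# a 200ms burst at 100%, followed by 800ms at 30%.
fine = [100] * 4 + [30] * 16

one_second_view = coarse_average(fine, group_size=20)
peak_fine = max(fine)

print(one_second_view)  # one-second reading, as a coarse monitor would see it
print(peak_fine)        # the burst that is visible only at 50ms granularity
```

In this toy case the one-second reading is 44%, far from saturation, even though the CPU was fully busy for four consecutive 50ms windows.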

Focus of This Research
- Propose a novel transient bottleneck detection method with no or negligible monitoring overhead.
  - Based on passive network tracing
- Detect transient bottlenecks caused by various system factors.
  - Intel SpeedStep
  - JVM garbage collection


Our Hypothesis of Detecting Transient Bottlenecks
- A bottleneck in an n-tier system is the place where requests start to congest in the system.
- A transient bottleneck is a bottleneck with a short lifecycle (e.g., at the millisecond level). It causes only short-term congestion in the bottleneck server.
- Detecting transient bottlenecks in an n-tier system requires finding the component servers that frequently show short-term congestion.

Trace Monitoring Tool
- We use a passive network tracing tool (i.e., Fujitsu SysViz) to reconstruct the transaction execution in an n-tier system.

Fine-Grained Load/Throughput Measurement
- Given the precise arrival/departure timestamps of each request for a server, we can calculate the following two metrics of the server:
  - Fine-grained load
    - The average number of concurrent jobs in a fixed time interval (e.g., 50ms)
  - Fine-grained throughput
    - The number of completed requests in the server in the same time interval
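The two metrics defined on this slide can be sketched in Python as follows. This is a minimal illustration, not the SysViz implementation: the function name `fine_grained_metrics`, the use of integer millisecond timestamps, and the choice to compute load as time-averaged concurrency (total request-residence time in a window divided by the window length) are all assumptions.

```python
def fine_grained_metrics(requests, start_ms, end_ms, interval_ms=50):
    """Compute per-window load and throughput for one server.

    requests: list of (arrival_ms, departure_ms) pairs for that server.
    Returns (load, throughput), one entry per window of interval_ms.
    """
    n_windows = (end_ms - start_ms) // interval_ms
    busy_ms = [0] * n_windows      # request-residence time per window
    completed = [0] * n_windows    # departures per window

    for arrival, departure in requests:
        first = max(0, (arrival - start_ms) // interval_ms)
        last = min(n_windows - 1, (departure - start_ms) // interval_ms)
        for w in range(first, last + 1):
            w_start = start_ms + w * interval_ms
            w_end = w_start + interval_ms
            overlap = min(departure, w_end) - max(arrival, w_start)
            if overlap > 0:
                busy_ms[w] += overlap
        idx = (departure - start_ms) // interval_ms
        if 0 <= idx < n_windows:
            completed[idx] += 1

    # Average number of concurrent jobs in each window.
    load = [b / interval_ms for b in busy_ms]
    return load, completed


# Three hypothetical requests observed over 150ms, in 50ms windows.
trace = [(0, 30), (10, 70), (40, 120)]
load, throughput = fine_grained_metrics(trace, start_ms=0, end_ms=150)
print(load, throughput)
```

For the hypothetical trace above, the first window holds 80ms of request-residence time in a 50ms window, which gives an average load of 1.6 concurrent jobs.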

How Do We Detect Transient Bottlenecks of a Server?
[Figure labels: time window 1, time window 2, time window 3, TP max, saturation point N*, saturation area]
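One way to read this slide is that a window is congested when its load lies beyond the saturation point N*, where throughput has levelled off at TP max. The Python sketch below uses a simple heuristic that is an assumption of this example and not necessarily the authors' procedure: it groups windows by rounded load, takes the highest mean throughput as TP max, and picks N* as the smallest load whose mean throughput is within a tolerance of TP max. The function names and the `tolerance` parameter are hypothetical.

```python
from collections import defaultdict


def estimate_saturation(load, throughput, tolerance=0.95):
    """Heuristically estimate (N*, TP max) from per-window samples."""
    if not load or len(load) != len(throughput):
        raise ValueError("load and throughput must be non-empty and equal length")
    by_load = defaultdict(list)
    for l, tp in zip(load, throughput):
        by_load[round(l)].append(tp)
    mean_tp = {l: sum(v) / len(v) for l, v in by_load.items()}
    tp_max = max(mean_tp.values())
    # Smallest load at which throughput has (nearly) reached its maximum.
    n_star = min(l for l, tp in mean_tp.items() if tp >= tolerance * tp_max)
    return n_star, tp_max


def congested_windows(load, n_star):
    """Indices of windows whose load exceeds the saturation point."""
    return [i for i, l in enumerate(load) if l > n_star]


# Hypothetical per-window samples: throughput flattens once load reaches 4.
load = [1, 2, 3, 4, 6, 8, 2, 7]
throughput = [10, 20, 30, 40, 40, 41, 20, 39]

n_star, tp_max = estimate_saturation(load, throughput)
congested = congested_windows(load, n_star)
print(n_star, tp_max, congested)
```

A server whose windows frequently fall in the congested list would then be a candidate transient bottleneck, in line with the hypothesis stated earlier in the talk.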

Fine-Grained Load/Throughput Analysis for MySQL at WL 7,000
[Figure: load at every 50ms; throughput at every 50ms]


Transient Bottlenecks Caused by Intel SpeedStep
- Intel SpeedStep is designed to adjust CPU frequency to meet instantaneous performance needs while minimizing power consumption.
- We found that Dell's BIOS-level SpeedStep control algorithm is unable to adjust the CPU frequency quickly enough to match the bursty real-time workload, which causes frequent transient bottlenecks.
P-state | P0 | P1 | P4 | P5 | P8
CPU Frequency [MHz] |
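The effect described here can be illustrated with a toy queueing model in Python. Everything in it is an assumption made for illustration: the governor rule (raise capacity one interval after demand exceeds a threshold), the two capacity levels, and the arrival counts are invented, and the model does not represent Dell's actual BIOS algorithm or the measured P-state frequencies.

```python
def lagging_governor(arrivals, low_cap, high_cap, threshold):
    """Choose each interval's capacity from the PREVIOUS interval's demand."""
    capacities = []
    prev_demand = 0
    for demand in arrivals:
        capacities.append(high_cap if prev_demand > threshold else low_cap)
        prev_demand = demand
    return capacities


def backlog_per_interval(arrivals, capacities):
    """Requests left queued at the end of each interval."""
    backlog = 0
    history = []
    for demand, cap in zip(arrivals, capacities):
        backlog = max(0, backlog + demand - cap)
        history.append(backlog)
    return history


# Hypothetical requests arriving per interval, with a short burst.
arrivals = [20, 20, 90, 90, 20, 20]

# Frequency scaling on: capacity follows demand with a one-interval lag.
slow = backlog_per_interval(
    arrivals, lagging_governor(arrivals, low_cap=50, high_cap=100, threshold=40)
)
# Frequency scaling off: CPU always at the high-capacity level.
fast = backlog_per_interval(arrivals, [100] * len(arrivals))

print(slow)
print(fast)
```

In this toy run the lagging governor leaves requests queued for two intervals during the burst, while the fixed high-frequency case never builds a backlog. The queue is short-lived, which is the kind of congestion a one-second monitor would average away.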

Transient Bottlenecks of MySQL at Workload 8,000
[Figure: panels "SpeedStep On case" and "SpeedStep Off case"; annotations "CPU is in low frequency" and "CPU is in high frequency"]

Transient Bottlenecks of MySQL at Workload 10,000
[Figure: panels "SpeedStep On case" and "SpeedStep Off case"]


Conclusion & Future Work
- Transient bottlenecks in an n-tier system cause wide-ranging response time variations.
- Transient bottlenecks may be invisible to traditional monitoring tools with coarse granularity.
- We proposed a transient bottleneck detection method based on fine-grained load/throughput analysis.
- Ongoing work: more analysis of different types of workloads and of more system factors that cause transient bottlenecks.

Thank You. Any Questions?
Qingyang Wang