Published by Vincent Lindsey. Modified over 9 years ago.
1
Systems Support for End-to-End Performance Management. Sandip Agarwala. PhD Advisor: Karsten Schwan. College of Computing, Georgia Tech.
2
Complexity, complexity, complexity… (Figure source: Gartner, December 2005.)
3
Reasons for Complexity: Application diversity. Interdependencies. Heterogeneous components –Too many different technologies and platforms. Too few hints from the system to the administrators –Legacy issues; application-specific solutions. Insufficient information about the system to drive self-management. Lack of automation.
4
Online System Management. Loop stages: Monitor, Analyze, Control, Execute. Tasks: workload scheduling; capacity and SLA management; design evaluation and tuning; bottleneck detection; resource provisioning, accounting, etc. Proposed approach: Service Path.
5
Service Path. Diagram labels: Internet; Proxy Server; Front-end; Web Servers; Middle-tier; Servlet Server; Application Logic (EJBs, etc.); Database; Back-end. Service path: system abstractions that describe the dynamic dependencies between the different distributed application components. Service class: application-level request class, e.g. SLA class.
6
Service Path Characteristics: end-to-end analysis; online; non-intrusive; application-generic.
7
Outline: Background. Motivation. Service path –Discovery with E2EProf –Refinement with SysProf –Automated SLA enforcement. Related Work. Future Plans.
8
E2EProf. Black-box approach: monitor network packet traces (source, destination, timestamps); model traces as per-edge time series signals or density functions; correlate per-edge time series signals. Diagram: nodes A, X, B, C, D; time series for edges (A→B) and (B→C) plotted against time; delays D1 and D2.
9
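Basic Approach. Compute the cross-correlation of the per-edge signals (D1, D2) to find the delay at B. Diagram: nodes A, X, B, C, D; cross-correlation plots for edge pairs (A→B), (B→C) and (A→B), (B→D). Plot annotations: a spike indicates causality; the spike's position gives the delay; no spike.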
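The cross-correlation step on this slide can be illustrated with a small sketch. It is not E2EProf's implementation: the function names, the bin width, and the `spike_ratio` threshold are all assumptions made for illustration, and the packet timestamps are made up. Each edge's timestamps are binned into a time series, the two series are cross-correlated over a range of lags, and a lag is reported only if its correlation value stands well above the average.

```python
def to_series(timestamps, bin_width, n_bins):
    """Count packets per time bin to form a per-edge time series."""
    series = [0] * n_bins
    for t in timestamps:
        idx = int(t // bin_width)
        if 0 <= idx < n_bins:
            series[idx] += 1
    return series


def cross_correlation(x, y, max_lag):
    """Return c where c[lag] = sum over t of x[t] * y[t + lag]."""
    n = len(x)
    return [
        sum(x[t] * y[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_delay(x, y, max_lag, spike_ratio=3.0):
    """Return the lag of the dominant spike, or None if nothing stands out.

    spike_ratio is a hypothetical threshold: the peak must be at least
    spike_ratio times the mean correlation value to count as a spike.
    """
    c = cross_correlation(x, y, max_lag)
    peak = max(c)
    mean = sum(c) / len(c)
    if peak == 0 or peak < spike_ratio * mean:
        return None
    return c.index(peak)


# Made-up traces: every packet on (A->B) reappears on (B->C) 3 time units later,
# while (B->D) carries unrelated traffic.
a_b = to_series([1, 4, 9, 15, 22, 30], bin_width=1, n_bins=40)
b_c = to_series([4, 7, 12, 18, 25, 33], bin_width=1, n_bins=40)
b_d = to_series([2, 3, 11, 20, 21, 35], bin_width=1, n_bins=40)

delay_bc = estimate_delay(a_b, b_c, max_lag=10)  # spike: causal edge pair
delay_bd = estimate_delay(a_b, b_d, max_lag=10)  # no spike: unrelated edges
```

In this toy data the (A→B), (B→C) pair produces a single dominant lag, while the (A→B), (B→D) pair does not, which mirrors the "spike" versus "no spike" cases on the slide.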
10
Evaluation with 4-tier RUBiS (http://rubis.objectweb.org/). Testbed: Clients; Apache Web Server; Tomcat Server 1; Tomcat Server 2; EJB Server 1; EJB Server 2; MySQL Server. Request classes: bidding; comment. Labels: CPU bound; I/O bound.
11
Service Path Detection in RUBiS. Figure panels: static server assignment; round-robin load balancer. Annotations: highest delay node; highest delay nodes.
12
Change Detection in RUBiS. Figure annotation: injected delay.
13
Delta Air Lines’ Application: Revenue Pipeline. Total traffic: 1.34 million / day (56k / hour). Logs: TACS IN & TACS OUT; XIN & XOUT; APEX IN & APEX OUT; Error/Warning (Tivoli).
14
Delta Air Lines’ Application. Figure: latency (sec) versus time of day. Diagram: client requests; TACS; servers S1, S8, S7, S3, S2. Annotation: huge request burst.
15
Outline: Background. Motivation. Service path –Discovery with E2EProf –Refinement with SysProf –Automated SLA enforcement. Related Work. Future Plans.
16
Beyond dependency and latency… Diagram: nodes C1, C2, S1, S3, S2, S5, S6, S4. Solution: zoom into the service path with SysProf. No application hints or instrumentation. Monitor resource usage on a per-class basis.
17
SysProf Methodology. Diagram (user/kernel): applications A1, A2, …, AN; eth driver; BDD; network stack; system call layer; FS/VM/etc.; scheduler; requests arrive from the client and replies return to the client. Instrumentation points: Init CID; context switches; net softirq; system call parameters, PID, app functions; disk I/O. Track request context –Work done for processing a request class –May span user level and kernel level –Executes in more than one context (e.g. processes, threads, softirqs) –Happens in system-visible events (e.g. system calls)
18
Class ID Propagation. Diagram (Front-Tier, Middle-Tier, End-Tier; user and kernel). Labels: Init CID; Process CID; Msg CID; Packet CID; Inherits CID; From client; To client.
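The propagation idea on this slide (a class ID is assigned when a request enters, travels with each message, and is inherited by whichever context receives it) can be shown with a toy user-space simulation. This is only an illustration under stated assumptions: SysProf instruments the kernel, whereas every name here (`Context`, `init_cid`, `send`, `receive`, `charge`, `usage_ms`) and every cost figure is hypothetical.

```python
from collections import defaultdict


class Context:
    """A schedulable context (process, thread, softirq) on one tier."""

    def __init__(self, name):
        self.name = name
        self.cid = None  # class ID currently attributed to this context


usage_ms = defaultdict(float)  # CPU time charged per class ID (made-up units)


def init_cid(ctx, cid):
    """Assign a class ID when a client request first enters the system."""
    ctx.cid = cid


def send(ctx, payload):
    """Outgoing messages carry the sender's current class ID."""
    return {"cid": ctx.cid, "payload": payload}


def receive(ctx, message):
    """The receiving context inherits the class ID from the message."""
    ctx.cid = message["cid"]


def charge(ctx, ms):
    """Attribute work done in this context to its current class ID."""
    usage_ms[ctx.cid] += ms


front, middle, back = Context("front"), Context("middle"), Context("back")

# Two hypothetical request classes with made-up per-tier costs in ms.
for cid, costs in (("gold", (2.0, 5.0, 7.0)), ("bronze", (1.0, 1.0, 3.0))):
    init_cid(front, cid)
    charge(front, costs[0])
    receive(middle, send(front, "request"))
    charge(middle, costs[1])
    receive(back, send(middle, "query"))
    charge(back, costs[2])
```

After the loop, `usage_ms` holds the total cost per class across all three tiers, which is the kind of per-class resource view the slide describes, obtained without the application passing any hints.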
19
Applications of SysProf: resource accounting; utility billing; bottleneck detection; capacity estimation; root-cause analysis; black-box SLA management.
20
Resource-Aware Adaptive Control. Cluster workloads contending for the same resources. Separate queue/controller for each cluster. Diagram: request classes (Class 1, Class 2, Class 3); front-end controller + scheduler; Tomcat Server 1; Tomcat Server 2; EJB Server 1; EJB Server 2; MySQL Server.
21
Resource-Aware Adaptive Control. Figure panels: with SysProf; no SysProf. Capacity = 80 req/s per server.
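The "separate queue/controller for each cluster" arrangement can be sketched as below. This is a deliberately simplified stand-in: a static per-cluster rate limiter rather than the adaptive controller presented in the talk. The class name `FrontEndController`, the cluster names, and the one-second dispatch tick are assumptions; only the 80 req/s per-server capacity figure comes from the slide.

```python
from collections import deque


class FrontEndController:
    """Hypothetical front end keeping one FIFO queue per workload cluster."""

    def __init__(self, clusters, capacity_per_server=80, servers=1):
        # Requests each cluster may forward per second (assumed static here).
        self.rate = capacity_per_server * servers
        self.queues = {name: deque() for name in clusters}

    def submit(self, cluster, request):
        """Enqueue a request on the queue of the cluster it belongs to."""
        self.queues[cluster].append(request)

    def dispatch_one_second(self):
        """Forward up to `rate` requests per cluster; the rest stay queued."""
        sent = {}
        for name, queue in self.queues.items():
            n = min(len(queue), self.rate)
            sent[name] = [queue.popleft() for _ in range(n)]
        return sent


# Made-up load: 100 requests for one cluster, 30 for another.
controller = FrontEndController(["cpu_bound", "io_bound"])
for i in range(100):
    controller.submit("cpu_bound", i)
for i in range(30):
    controller.submit("io_bound", i)

sent = controller.dispatch_one_second()
backlog = {name: len(q) for name, q in controller.queues.items()}
```

Keeping the queues separate means a burst in one cluster builds a backlog only in that cluster's queue, so workloads that do not contend for the same resource are not delayed by it.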
22
Summary. Service Path –System abstractions to represent dependencies and request paths. E2EProf and Pathmap –Dependency and latency analysis. SysProf –Service-based resource analysis. Aid human operators and automate end-to-end performance management.
23
Thank You! Questions? Email: sandip@cc.gatech.edu
24
Extra Slides
25
Pathmap Optimizations. Pipeline: packet timestamp trace; time-series signal or density function; cross-correlation series. Annotations: bursty traffic; sliding window (W); run-length compression; upper bound on latency.
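Two of the optimizations named on this slide, the sliding window and run-length compression, can be illustrated with a short sketch. It is an assumption-laden example, not Pathmap's code: the function names, the window semantics (keep timestamps in the half-open interval `(now - window, now]`), and the sample data are all hypothetical.

```python
def trim_to_window(timestamps, now, window):
    """Keep only timestamps that fall within the last `window` time units."""
    return [t for t in timestamps if now - window < t <= now]


def rle_encode(series):
    """Compress a time series into (value, run_length) pairs."""
    runs = []
    for value in series:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, length) for value, length in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original series."""
    series = []
    for value, length in runs:
        series.extend([value] * length)
    return series


# Bursty traffic leaves long idle stretches, so most bins are zero.
bursty = [0, 0, 0, 2, 2, 0, 0, 0, 0, 1]
encoded = rle_encode(bursty)
recent = trim_to_window([1, 5, 9, 12], now=12, window=5)
```

Long runs of empty bins collapse into single pairs, so the stored signal grows with the number of bursts rather than with the length of the trace, and the window bounds how much history has to be kept at all.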