Published by Cleopatra Lindsey. Modified over 8 years ago.
1
ProxySG Performance Monitoring and Troubleshooting
April 2016
Rob Ritchardson: Product Support Specialist
Copyright © 2016 Blue Coat Systems Inc. All Rights Reserved.
2
Why is performance monitoring important?
ProxySG has a finite amount of resources with which to process traffic. Internal and external issues can make those resources scarce, which impacts traffic processing. ProxySG administrators need to understand what those resources are, what data is available, how to monitor that data, and how to react to issues that affect resources.
Three key areas of performance monitoring on ProxySG:
    CPU
    Memory
    Bandwidth
3
Agenda
CPU monitoring via the management console and CLI
    Statistics available and their use, CPU Monitor
    Identifying and investigating current or past CPU issues
Memory monitoring via the management console and CLI
    Statistics available and their use, Threshold Monitor
    Identifying and investigating current and past memory issues
Bandwidth monitoring via the management console
    Understanding bandwidth impact on proxy
Troubleshooting CPU and memory issues
4
CPU monitoring via the management console and CLI
5
CPU statistics in the management console UI
CPU percentages are available in many places within the management console UI:
    Statistics->Summary->Device
    Statistics->Health Monitoring->General
    Statistics->System->Resources->CPU
6
CPU statistics in the management console UI
Statistics->System->Resources->CPU includes historical graphs of CPU usage over different time periods.
7
CPU statistics in the management console UI
All three CPU reports/graphs are generated from the same set of data on ProxySG. This data is available in the persistent data manager (PDM) statistics in a sysinfo and in default snapshots.
All three CPU reports/graphs show a single CPU percentage:
    On ProxySG 6.6.1.x or older (6.5, 6.4, etc.), the busiest CPU on multiple-CPU ProxySG platforms is shown in these reports/graphs.
    On ProxySG 6.6.2.x and newer, an average of all the platform's CPUs is shown in these reports/graphs.
The CPU percentage shown is the average CPU usage over 60 seconds, so very short spikes in CPU usage might not show in these reports/graphs.
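The effect of the 60-second averaging can be illustrated with a little arithmetic. The sketch below uses invented per-second readings (they are an assumption for illustration, not measurements from a ProxySG) to show how a short spike almost disappears in a one-minute average.

```python
# Hypothetical per-second CPU readings for one minute:
# 55 seconds at 10% followed by a 5-second spike at 100%.
samples = [10.0] * 55 + [100.0] * 5

# What a 60-second average would report for that minute.
average_60s = sum(samples) / len(samples)

# What a 1-second view would have shown at the worst moment.
peak = max(samples)

print(f"60-second average: {average_60s:.1f}%")
print(f"peak 1-second value: {peak:.1f}%")
```

A graph built from the averaged value would show a modest rise for this minute, which is why the shorter windows of the CLI's extended output are useful when short spikes are suspected.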
8
CPU statistics in the CLI
The CLI also has several ways to display CPU:
    ‘show status’
    ‘show cpu’
Single CPU reports follow the same calculations as the management console UI reports.
    #show cpu
    Current maximum CPU usage (%): 3.5
9
CPU statistics in the CLI
‘show cpu’ has two optional flags that give additional information:
    [all] shows all CPUs individually
    [extended] shows CPU usage as 1, 5, 10 and 60 second averages
The extended information contains shorter timeframes for the CPU average, which gives visibility into short CPU spikes.
Command examples are from a 2 CPU SG900:
    #show cpu all
    Current CPU usage (%):
    CPU 0: 3.3
    CPU 1: 0.6
    #show cpu extended
    Current maximum CPU usage (%):
            1sec  5sec  10sec  60sec
    CPU:     3.3   3.3    3.3    3.5
    #show cpu extended all
    Current CPU usage (%):
            1sec  5sec  10sec  60sec
    CPU 0:   3.2   3.3    3.3    3.5
    CPU 1:   0.5   0.6    0.6    0.6
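When this output is captured repeatedly (for example over SSH), it can help to turn it into numbers. The following is a minimal sketch that parses text shaped like the ‘show cpu extended all’ example; the exact whitespace layout of real output is an assumption, and `parse_cpu_extended` and `short_spikes` are hypothetical helper names, not ProxySG features.

```python
import re

# Text shaped like the 'show cpu extended all' example from the slide.
# The column spacing is assumed; the slide transcript does not preserve it.
SAMPLE = """\
Current CPU usage (%):
        1sec  5sec  10sec  60sec
CPU 0:   3.2   3.3    3.3    3.5
CPU 1:   0.5   0.6    0.6    0.6
"""

def parse_cpu_extended(text):
    """Return {cpu_index: {window_name: percent}} for each 'CPU n:' row."""
    windows = []
    usage = {}
    for line in text.splitlines():
        tokens = line.split()
        # The header row consists only of tokens such as '1sec' or '60sec'.
        if tokens and all(t.endswith("sec") for t in tokens):
            windows = tokens
            continue
        match = re.match(r"\s*CPU\s+(\d+):\s*(.*)", line)
        if match and windows:
            values = [float(v) for v in match.group(2).split()]
            usage[int(match.group(1))] = dict(zip(windows, values))
    return usage

def short_spikes(usage, ratio=2.0):
    """List CPUs whose 1-second value is at least `ratio` times the 60-second average."""
    return [
        cpu for cpu, w in usage.items()
        if w["60sec"] > 0 and w["1sec"] >= ratio * w["60sec"]
    ]

usage = parse_cpu_extended(SAMPLE)
print(usage)
print("CPUs with a short spike:", short_spikes(usage))
```

The `ratio` threshold is an arbitrary illustration value; a suitable value depends on the normal load of the appliance.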
10
CPU statistics in the advanced URLs
Advanced URLs include two CPU statistics pages:
    CPU Usage Statistics
    CPU Monitor
CPU Usage Statistics information:
    Advanced URL: /Diagnostics/CPU/Statistics
    CPU 0 is at the top, with CPU numbers incrementing going down
    1, 5, 30 and 60 second averages
11
CPU Monitor general information
CPU Monitor is a tool that lets an administrator identify suspect ProxySG processes or components when investigating high CPU issues. This information is key to resolving high CPU issues more quickly.
CPU Monitor is the ProxySG's equivalent of the Linux top command or the Windows Task Manager application.
CPU Monitor is available in an advanced URL and in the CLI.
CPU Monitor configuration is retained after a reboot.
Running CPU Monitor incurs 1-2% CPU overhead.
12
CPU Monitor general information
Example:
    Configured interval duration: 5 seconds
    Current interval complete in: 4 seconds
    CPU 0 99%
        HTTP and FTP 89%
        Object Store 10%
        Access Logging 1%
        Miscellaneous 1%
    CPU 1 22%
        TCPIP 20%
        DNS service 1%
The example shows:
    The configurable interval duration
    Each CPU listed with its CPU usage (rounded up)
    The components running on each CPU
Most component names are meaningful. Two commonly seen components need clarification:
    Object Store: kernel, cache engine, storage
    Miscellaneous: processing that does not fit into a main component
13
CPU Monitor in the advanced URL
CPU Monitor advanced URL: /Diagnostics/CPU_Monitor/Statistics
Advanced URL options:
    Start CPU Monitor
    Stop CPU Monitor
    View CPU Monitor data (automatic browser refresh)
The CPU Monitor advanced URL is included in sysinfo and default snapshots.
14
CPU Monitor in the CLI
Configuration is done in ‘configure terminal’ mode.
The CPU Monitor interval can be configured in the CLI only.
    #conf t
    #(config)diagnostics
    #(config diagnostics)cpu-monitor ?
      disable   Disable the CPU Monitor
      enable    Enable the CPU Monitor
      interval  Configure the CPU Monitor interval
15
CPU Monitor in the CLI
Viewing CPU Monitor output from the CLI must be done from ‘enable’ mode.
The command to view CPU Monitor is ‘show cpu-monitor’.
    #show cpu-monitor
    CPU Monitor:
    Configured interval duration: 59 seconds
    Current interval complete in: 18 seconds
    CPU 0 6%
        Console Agent 3%
        Miscellaneous 2%
Data is not updated until the interval expires.
The time remaining in the interval is also displayed.
16
CPU Health check alerting
How to view CPU usage is clear, but constantly watching it in anticipation of a CPU issue is not reasonable. Health Monitoring includes CPU alerting along with alerting for many other resources.
Statistics->Health Monitoring->General shows:
    CPU utilization
    Current state
Health shows states in prioritized order: Critical, Warning, OK
17
CPU Health check alerting
Health Monitor states are controlled by configurable thresholds. Changes in state trigger alerts to all of the configured alerting mechanisms.
Configurations in Maintenance->Health Monitoring->General:
    Configurable CPU percentage thresholds for critical and warning state
    Configurable intervals for critical and warning state
    Selectable log, email and trap alerting facilities
Email and trap alerts require additional configuration on the ProxySG to function properly.
18
CPU Health check alerting
Sample image of CPU alert configurations (image not included in this transcript).
Typically thresholds and intervals are not changed.
19
CPU Health check alerting
For trap alerts, SNMP must be enabled and configured:
    SNMP Console service
    SNMP settings
20
CPU Health check alerting
For email alerts, SMTP must be enabled and configured.
21
Identifying and investigating an active CPU issue
When investigating an active CPU issue, the main goal is to first identify the component(s) using the most CPU.
Check the health status in the management console UI. Is it in Warning or Critical state?
Observe CPU usage trends in Statistics->System->Resources->CPU. Keep in mind whether the pattern is constant high CPU usage or CPU spikes when working with CPU Monitor.
Enable CPU Monitor and record samples based on the CPU usage trends seen in Statistics->System->Resources->CPU.
If the management console UI is unresponsive, use the CLI (SSH or serial console) with ‘show cpu’ and ‘show cpu-monitor’.
22
Identifying and investigating an active CPU issue
Once a suspect component is identified, analysis can begin.
    Configured interval duration: 5 seconds
    Current interval complete in: 4 seconds
    CPU 0 99%
        Policy evaluation - HTTP 50%
        HTTP and FTP 35%
        Object Store 10%
        Access Logging 1%
        Miscellaneous 1%
    CPU 1 22%
        TCPIP 20%
        DNS service 1%
Suspect: Policy
Investigation:
    Policy change?
    Regex policy rules?
    Active sessions analysis
    Access logs
More examples like this are in the troubleshooting section.
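Picking the suspect out of saved CPU Monitor samples can be scripted. The sketch below parses text shaped like the example on this slide and reports the largest component on the busiest CPU. The one-entry-per-line layout is an assumption (the transcript flattens the original output), and `parse_cpu_monitor` and `top_component` are hypothetical helper names.

```python
import re

# CPU Monitor text as shown on the slide, assuming one entry per line.
SAMPLE = """\
Configured interval duration: 5 seconds
Current interval complete in: 4 seconds
CPU 0 99%
  Policy evaluation - HTTP 50%
  HTTP and FTP 35%
  Object Store 10%
  Access Logging 1%
  Miscellaneous 1%
CPU 1 22%
  TCPIP 20%
  DNS service 1%
"""

ENTRY = re.compile(r"^\s*(.+?)\s+(\d+)%\s*$")   # "<name> <percent>%"
CPU_NAME = re.compile(r"^CPU \d+$")             # "CPU 0", "CPU 1", ...

def parse_cpu_monitor(text):
    """Return {cpu_name: {"total": pct, "components": {name: pct}}}."""
    result = {}
    current = None
    for line in text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue  # interval lines carry no percentage
        name, pct = match.group(1), int(match.group(2))
        if CPU_NAME.match(name):
            current = name
            result[current] = {"total": pct, "components": {}}
        elif current is not None:
            result[current]["components"][name] = pct
    return result

def top_component(parsed):
    """Return (cpu, component, percent) for the busiest CPU's largest component."""
    cpu = max(parsed, key=lambda c: parsed[c]["total"])
    components = parsed[cpu]["components"]
    name = max(components, key=components.get)
    return cpu, name, components[name]

parsed = parse_cpu_monitor(SAMPLE)
suspect = top_component(parsed)
print("suspect:", suspect)
```

For this sample the suspect is the policy component on CPU 0, which matches the slide's conclusion.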
23
Identifying and investigating a past CPU issue
Investigating a past CPU issue is more difficult, since CPU Monitor is typically not enabled. Graph or statistical data analysis is needed to match the CPU trends that are found.
Check each graph duration in Statistics->System->Resources->CPU to see if the CPU issue can be seen.
Use that duration in other graph data (Traffic Mix, Client Workers, etc.) to see if there are matching trends that help identify a root cause.
SNMP monitoring can greatly assist in this type of investigation:
    SNMP monitoring knowledge asset: https://youtu.be/PvH30MLfEQY
    SNMP resource monitoring document on BTO
24
Identifying and investigating a past CPU issue
If nothing is found in the graph data, statistical analysis is an option.
Persistent Data Management (PDM) statistical data is available in the ProxySG's sysinfo, default snapshots and heartbeats.
All graphs within the management console UI are built from PDM data.
Snapshots and heartbeats provide historical statistics that can give visibility into past CPU issues, sometimes at a very granular level.
Snapshots can be viewed and downloaded from the /Diagnostics/Snapshot advanced URL.
Heartbeats are sent on a daily basis to all configured email recipients.
25
Identifying and investigating a past CPU issue
PDM statistical data example:
    system:cpu-usage~hourly@Fri, 01 Apr 2016 00:08:00 UTC[07](60, 60): 9 9 9 9 9 9 9 9 9 9 9 9 10 9 9 9 9 9 9 9 12 9 10 9 10 9 10 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 9 9 9 9 9 9 9 12 10 9 18 12 9 10 10 10 9 9
The above shows CPU usage over one hour.
The name of the statistic follows PDM syntax: system:cpu-usage~hourly
The date and time are included.
The current/newest sample is on the right side.
(x, y), or (60, 60) above: x = number of samples and y = time in seconds for each sample. 60 samples * 60 seconds = 3600 seconds, or 1 hour.
Each sample is the value at that time, not an average.
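The PDM syntax described above is regular enough to parse mechanically. The following is a minimal sketch, not an official parser: `parse_pdm` is a hypothetical helper, the meaning of the bracketed number (`[07]`) is not explained in the slides so it is kept as an uninterpreted field, and the code assumes that the header timestamp belongs to the newest (right-most) sample.

```python
import re
from datetime import datetime, timedelta, timezone

LINE = (
    "system:cpu-usage~hourly@Fri, 01 Apr 2016 00:08:00 UTC[07](60, 60): "
    "9 9 9 9 9 9 9 9 9 9 9 9 10 9 9 9 9 9 9 9 12 9 10 9 10 9 10 9 9 9 "
    "9 9 9 9 9 9 9 9 9 9 9 10 9 9 9 9 9 9 9 12 10 9 18 12 9 10 10 10 9 9"
)

PDM = re.compile(
    r"^(?P<name>[^~]+)~(?P<period>[^@]+)@(?P<stamp>.+?) UTC"
    r"\[(?P<slot>\d+)\]\((?P<count>\d+),\s*(?P<step>\d+)\):\s*(?P<values>.*)$"
)

def parse_pdm(line):
    """Split one PDM statistic line into its header fields and samples."""
    match = PDM.match(line.strip())
    if match is None:
        raise ValueError("not a PDM statistic line")
    stamp = datetime.strptime(match["stamp"], "%a, %d %b %Y %H:%M:%S")
    stamp = stamp.replace(tzinfo=timezone.utc)
    step = int(match["step"])
    values = [int(v) for v in match["values"].split()]
    # Assumption: the header timestamp is the time of the newest sample,
    # so older samples are spaced `step` seconds apart going backwards.
    newest = len(values) - 1
    times = [stamp - timedelta(seconds=step * (newest - i)) for i in range(len(values))]
    return {
        "name": match["name"],
        "period": match["period"],
        "slot": match["slot"],          # meaning not given in the slides
        "count": int(match["count"]),   # x: number of samples
        "step": step,                   # y: seconds per sample
        "values": values,
        "times": times,
    }

stat = parse_pdm(LINE)
peak = max(stat["values"])
peak_time = stat["times"][stat["values"].index(peak)]
print(stat["name"], stat["period"], "covers", stat["count"] * stat["step"], "seconds")
print("highest sample:", peak, "at about", peak_time.isoformat())
```

Because each sample is a point-in-time value rather than an average, the highest sample marks a moment worth comparing against other statistics from the same snapshot.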
26
Identifying and investigating a past CPU issue
All CPU related data available:
    system:cpu-usage~hourly@Fri, 01 Apr 2016 00:08:00 UTC[07](60, 60): 9 9 9 9 9 9 9 9 9 9 9 9 10 9 9 9 9 9 9 9 12 9 10 9 10 9 10 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 9 9 9 9 9 9 9 12 10 9 18 12 9 10 10 10 9 9
    system:cpu-usage~daily15minute@Fri, 01 Apr 2016 00:00:00 UTC[95](96, 900): 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 5 5 5 5 5 5 5 5 5 11 9 9 9 9 9 9 9 9 9
    system:cpu-usage~daily@Fri, 01 Apr 2016 00:00:00 UTC[23](24, 3600): 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 4 5 7 9 9
    system:cpu-usage~weekly@Fri, 01 Apr 2016 00:00:00 UTC[19](28, 21600): 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 5
    system:cpu-usage~monthly@Fri, 01 Apr 2016 00:00:00 UTC[17](31, 86400): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
    system:cpu-usage~yearly@Sun, 27 Mar 2016 00:00:00 UTC[15](52, 604800): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Default snapshots contain all of the above.
Heartbeats only contain ‘daily15minute’.
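The (x, y) pair in each header determines how much time a statistic covers and at what granularity. As a quick check of that arithmetic, this sketch multiplies the pairs copied from the headers on this slide; the dictionary and function names are illustrative only.

```python
# (number of samples, seconds per sample) copied from the PDM headers.
PERIODS = {
    "hourly": (60, 60),
    "daily15minute": (96, 900),
    "daily": (24, 3600),
    "weekly": (28, 21600),
    "monthly": (31, 86400),
    "yearly": (52, 604800),
}

def covered_seconds(samples, seconds_per_sample):
    """Total time span covered by one PDM statistic."""
    return samples * seconds_per_sample

spans = {name: covered_seconds(*xy) for name, xy in PERIODS.items()}

for name, seconds in spans.items():
    per_sample = PERIODS[name][1]
    print(f"{name:14s} {seconds:>9d} s total ({seconds / 86400:g} days), "
          f"one sample every {per_sample} s")
```

Note that ‘daily15minute’ and ‘daily’ cover the same 24 hours; the former simply has four times the resolution, which makes it the more useful of the two when looking for a short event.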
27
Identifying and investigating a past CPU issue
Once a CPU issue is found in the PDM data, other types of data can be analysed to find correlations. An example:
    system:cpu-usage~hourly@Fri, 01 Apr 2016 00:08:00 UTC[07](60, 60): 9 9 9 9 9 9 9 9 9 9 9 9 10 9 9 9 9 9 9 9 12 9 10 9 10 9 10 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 9 9 9 9 9 9 9 12 10 9 50 90 100 46 10 10 9 9
    users:current~hourly@Fri, 01 Apr 2016 00:08:00 UTC[07](60, 60): 1 0 1 1 1 0 0 0 1 0 1 0 0 1 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 3057 5932 6348 2680 1 0 0 0
In this example we can see that a spike in user counts correlates with the spike in CPU usage.
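The visual match between the two series can also be quantified. The sketch below computes a Pearson correlation coefficient over the last 12 samples of each series from the slide. Using Pearson correlation is a choice made here for illustration; the slides only say to look for matching trends.

```python
from math import sqrt

# Last 12 one-minute samples of each hourly series (newest on the right).
cpu = [9, 12, 10, 9, 50, 90, 100, 46, 10, 10, 9, 9]
users = [0, 0, 1, 0, 3057, 5932, 6348, 2680, 1, 0, 0, 0]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(cpu, users)
print(f"correlation between CPU usage and user count: {r:.3f}")
```

A coefficient close to 1 supports the reading that the two spikes belong to the same event. It does not show which one caused the other, so the user spike still needs its own investigation.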
28
Identifying and investigating a past CPU issue
There are many PDM statistics that are helpful in investigating CPU issues. Most of the names are meaningful, so you can determine what they track.
PDM data is space separated for ease of graphing in the spreadsheet tool of your choice.
If the CPU issue has a predictable pattern, use Health Monitoring's ‘Warning’ state to alert you early enough, before the issue occurs, so that it can be investigated live.
Enable CPU Monitor and leave it enabled if future occurrences of the CPU issue are expected.
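Since PDM values are space separated, preparing them for a spreadsheet takes only a few lines. This is a minimal sketch that turns PDM lines into CSV text with one column per statistic; `pdm_to_csv` is a hypothetical helper, and it assumes all supplied statistics have the same number of samples.

```python
import csv
import io

# The 'daily' CPU statistic from the earlier slide: 19 samples of 1, then 4 5 7 9 9.
DAILY = (
    "system:cpu-usage~daily@Fri, 01 Apr 2016 00:00:00 UTC[23](24, 3600): "
    + "1 " * 19 + "4 5 7 9 9"
)

def pdm_to_csv(lines):
    """Return CSV text with a sample index column and one column per PDM line."""
    columns = {}
    for line in lines:
        header, _, data = line.rpartition("): ")
        name = header.split("@", 1)[0]   # e.g. "system:cpu-usage~daily"
        columns[name] = data.split()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    names = list(columns)
    writer.writerow(["sample"] + names)
    # zip stops at the shortest series, hence the equal-length assumption.
    for index, row in enumerate(zip(*(columns[n] for n in names))):
        writer.writerow([index, *row])
    return out.getvalue()

csv_text = pdm_to_csv([DAILY])
print(csv_text)
```

Passing several lines with matching sample counts (for example CPU usage and user counts for the same hour) produces side-by-side columns that can be graphed together.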
29
Memory monitoring via the management console and CLI
30
Memory statistics in the management console UI
Like CPU, memory percentages are available in many places within the management console UI:
    Statistics->Summary->Device (only historical view)
    Statistics->Health Monitoring->General
    Statistics->System->Resources->Memory Use
31
Memory statistics in the management console UI
Statistics->System->Resources->Memory Use includes a number of data points. For memory issues look at the following:
    Committed and available memory at the top
    Committed and free application memory at the bottom
Issues are usually with application memory.
32
Memory statistics in the CLI
Two CLI commands show available memory:
    ‘show status’
    ‘show resources’, which contains disk and memory information
33
Memory statistics in the advanced URLs
Advanced URLs include two memory statistics pages:
    System Memory Statistics
    Threshold Monitor
System Memory Statistics information:
    Advanced URL: /System/memory
    Similar information to the management console UI
    Committed memory / Application data
34
Threshold monitor general information
Threshold Monitor is a set of statistics that track the bytes of memory allocated per ProxySG component over 3 different time intervals.
Threshold Monitor lets an administrator identify suspect ProxySG components when investigating high memory issues. This information is key to resolving high memory issues more quickly.
Threshold Monitor is available in an advanced URL, and the advanced URL data can be shown in the CLI.
The memory allocation statistics are cleared on a reboot.
35
Threshold monitor general information
Threshold Monitor statistics are included in the default snapshots and sysinfo.
The Threshold Monitor statistics in the advanced URL display component names, whereas the same statistics in the default snapshots and sysinfo do not. Example:
    Advanced URL (from hourly, Linear memory stats)    Snapshot/Sysinfo
    1, Miscellaneous                                   TM004.1.0
    1, Authentication                                  TM004.1.1
Comparing the advanced URL of a ProxySG to the snapshot/sysinfo data is recommended.
36
Threshold monitor advanced URL
Threshold Monitor advanced URL: /TM/Statistics
Statistics contain:
    Each ProxySG component and its memory usage in bytes
    Current/newest entries on the right
    ‘-’ indicates no data recorded (reboot or short uptime)
    Three groupings of sample intervals:
        60 minutes total; 30 samples at 2 minutes each
        1 day total; 24 samples at 1 hour each
        1 month total; 30 samples at 1 day each
    Two groupings of the above: Linear and Physical memory
    CSV values for easy graphing in a preferred spreadsheet application
37
Threshold monitor advanced URL
TCPIP's memory usage over the last hour, static (this is OK):
    1, TCPIP: 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720, 3358720,
SSL's memory usage over the last hour, changes drastically:
    1, SSL and Cryptography: 2195826688, 2200390451, 2131156718, 2001598054, 1963358617, 1908186180, 1840209783, 1707371861, 1525337702, 1403926664, 1203994077, 989983402, 978771285, 1045835776, 1109430408, 1493174681, 1624545416, 1783512541, 1830913092, 1929578222, 1929619046, 1950044706, 2146023833, 2117209838,
The ProxySG had 4 GB of RAM in the above examples.
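Saved Threshold Monitor output can be screened for components whose memory usage moves a lot. The following is a minimal sketch that parses lines in the comma-separated form shown above and computes how far each series swings relative to its peak. `parse_tm_line` and `relative_spread` are hypothetical helpers, and the leading number on each line is kept as an uninterpreted index because the slides do not define it.

```python
# The two series from the slide. TCPIP repeats the same value 30 times.
TCPIP = "1, TCPIP: " + "3358720, " * 30
SSL = (
    "1, SSL and Cryptography: 2195826688, 2200390451, 2131156718, 2001598054, "
    "1963358617, 1908186180, 1840209783, 1707371861, 1525337702, 1403926664, "
    "1203994077, 989983402, 978771285, 1045835776, 1109430408, 1493174681, "
    "1624545416, 1783512541, 1830913092, 1929578222, 1929619046, 1950044706, "
    "2146023833, 2117209838,"
)

def parse_tm_line(line):
    """Return (leading_index, component_name, samples); '-' becomes None."""
    head, _, tail = line.partition(":")
    index, _, component = head.partition(",")
    samples = []
    for field in tail.split(","):
        field = field.strip()
        if not field:
            continue  # skip the empty field after the trailing comma
        samples.append(None if field == "-" else int(field))
    return int(index), component.strip(), samples

def relative_spread(samples):
    """(max - min) / max over recorded samples; 0.0 means perfectly static."""
    values = [s for s in samples if s is not None]
    if not values or max(values) == 0:
        return 0.0
    return (max(values) - min(values)) / max(values)

for line in (TCPIP, SSL):
    _, name, samples = parse_tm_line(line)
    print(f"{name}: {len(samples)} samples, spread {relative_spread(samples):.1%}")
```

A large spread is not a fault on its own, since memory rises and falls with load. It only points at the components worth comparing against traffic and connection statistics.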
38
Memory Health check alerting
Similar to CPU alerting, Health Monitoring also allows for memory usage alerts.
Statistics->Health Monitoring->General shows:
    Memory utilization
    Current state
Health shows states in prioritized order: Critical, Warning, OK
39
Memory Health check alerting
Health Monitor states are controlled by configurable thresholds. Changes in state trigger alerts to all of the configured alerting mechanisms.
The memory warning threshold is 90% and the critical threshold is 95%.
The interval times are the same for both: 120 seconds.
TCP regulation (a memory protection function that delays new requests, covered on the next slide) occurs at 80% memory usage. Changing the warning threshold to 75% produces an alert before regulation begins.
Use the warning threshold to proactively alert administrators of upcoming memory issues that need to be resolved.
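The relationship between the default thresholds and the TCP regulation point can be shown with a small sketch. It only illustrates the percentages named on the slide: it ignores the 120-second interval, and whether the ProxySG compares with "greater than" or "greater than or equal" is an assumption made here. `memory_state` is a hypothetical function, not a ProxySG API.

```python
TCP_REGULATION_PERCENT = 80  # memory usage at which TCP regulation occurs

def memory_state(percent, warning=90, critical=95):
    """Classify a memory usage percentage against warning/critical thresholds."""
    if percent >= critical:
        return "Critical"
    if percent >= warning:
        return "Warning"
    return "OK"

usage = 82  # hypothetical reading, above the regulation point

default_state = memory_state(usage)            # default thresholds 90/95
tuned_state = memory_state(usage, warning=75)  # warning lowered to 75

print(f"{usage}% with default thresholds: {default_state}")
print(f"{usage}% with warning at 75%: {tuned_state}")
print("TCP regulation active:", usage >= TCP_REGULATION_PERCENT)
```

With the default thresholds the appliance would already be regulating connections at this reading while Health Monitoring still reports OK, which is the gap the lowered warning threshold closes.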
40
Identifying memory issues
Understanding TCP Acceptance Regulation
At 80% memory pressure the ProxySG goes into TCP Regulation.
When this occurs, the proxy will STOP accepting new TCP connections until memory drops below the threshold (lower limit).
The event is recorded in the event log and in the Threshold Monitor statistics.
41
Identifying memory issues
Understand memory usage patterns, as a climb in memory usage is not always an indication of an issue.
Memory will rise and fall during operational times.
A leak is an ever-increasing value over time.
A small, slow leak can be hidden within normal usage.
42
Identifying and investigating an active memory issue
When investigating an active memory issue, the main goal is to first identify the component(s) using the most memory.
Check the health status in the management console UI. Is it in Warning or Critical state?
Observe memory usage trends in Statistics->Summary->Device.
Check memory statistics in Statistics->System->Resources->Memory Use.
Access the Threshold Monitor advanced URL, save the output and analyze the data to find the suspect component.
If the management console UI is unresponsive, use the CLI (SSH or serial console) to enter enable mode and save the output of the ‘show advanced-url /TM/Statistics’ command.
43
Identifying and investigating an active memory issue
Once a suspect component is identified, analysis can begin.
    1, Configuration: 2689495040, 2756556526, 3015291153, 3166623744, 3238112597, 3353121177, 3356315648, 3426601642, 3452166144, 3385105203, 3356364800, 3407458850, 3452166144, 3452166144, 3452165597, 3452149760, 3452149760, 3452149760, 3448963481, 3547983872, 3598999278, 3643637760, 3621263906, 3643473920,
Suspect: Configuration
Investigation:
    Configuration change?
    Director involvement?
    Configuration failures in event logs
Configuration is an unusual component to be leaking memory. This is a bug.
More examples like this are in the troubleshooting section.
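One way to put a number on "ever increasing" is to fit a straight line through the samples and look at its slope. The sketch below does this with an ordinary least-squares fit over the Configuration series from the slide. Treating a positive slope as a leak hint is an illustrative heuristic chosen here, not the method used by Blue Coat support, and a positive slope can also come from ordinary load growth.

```python
# Configuration component samples from the slide (bytes, newest on the right).
configuration = [
    2689495040, 2756556526, 3015291153, 3166623744, 3238112597, 3353121177,
    3356315648, 3426601642, 3452166144, 3385105203, 3356364800, 3407458850,
    3452166144, 3452166144, 3452165597, 3452149760, 3452149760, 3452149760,
    3448963481, 3547983872, 3598999278, 3643637760, 3621263906, 3643473920,
]

def slope_per_sample(values):
    """Least-squares slope of the series, in bytes per sample interval."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator

slope = slope_per_sample(configuration)
growth = configuration[-1] - configuration[0]

print(f"trend: {slope / 2**20:.1f} MiB per sample")
print(f"net growth over the window: {growth / 2**20:.0f} MiB")
```

The same calculation on a static series, such as the TCPIP example earlier in the deck, returns a slope of zero. Longer windows (the daily or monthly groupings) are better suited to spotting a small, slow leak.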
44
Identifying and investigating a past memory issue
Investigating a past memory issue is difficult, as this is usually done after a reboot. A reboot clears all of the memory related statistics.
Aspects of investigating CPU issues apply to memory issues.
Analysis of the graphs in the management console UI can be helpful:
    Statistics->Protocol details: most proxy types include client count graphs. A spike in clients can cause memory issues.
    Statistics->Traffic mix: look for large spikes in load.
SNMP monitoring can greatly assist in this type of investigation.
45
Identifying and investigating a past memory issue
If nothing is found in the graph data, statistical analysis using PDM data is an option, as previously discussed for CPU issue investigation.
Memory statistics are included with CPU statistics in the same snapshot and heartbeat historical data.
PDM statistical data example:
    system:memory-usage~hourly@Tue, 21 Oct 2014 15:34:00 UTC[33](60, 60): 78 79 78 78 78 78 79 79 78 79 79 79 79 78 78 78 78 78 78 79 79 78 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 78 79 78 79 79 79 79 79 79 79 79 76 76 76 76 76 76 76
46
Identifying and investigating a past memory issue
All memory related data available:
    system:memory-usage~hourly@Tue, 21 Oct 2014 15:34:00 UTC[33](60, 60): 78 79 78 78 78 78 79 79 78 79 79 79 79 78 78 78 78 78 78 79 79 78 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 79 78 79 78 79 79 79 79 79 79 79 79 76 76 76 76 76 76 76
    system:memory-usage~daily15minute@Tue, 21 Oct 2014 15:30:00 UTC[61](96, 900): 70 70 70 70 70 71 71 72 73 73 73 74 74 75 74 74 74 74 74 74 74 74 72 72 71 71 71 70 70 70 70 70 69 69 68 69 68 68 67 66 66 67 66 65 64 66 66 66 66 66 66 66 66 66 66 66 66 66 66 65 65 66 66 66 66 66 65 64 63 62 62 62 62 62 63 62 63 64 65 66 67 69 72 74 75 76 76 77 77 77 77 78 78 78 79 78
    system:memory-usage~daily@Tue, 21 Oct 2014 15:00:00 UTC[14](24, 3600): 70 70 72 74 74 74 71 70 69 68 67 65 66 66 66 65 66 63 62 63 67 74 77 78
    system:memory-usage~weekly@Tue, 21 Oct 2014 12:00:00 UTC[09](28, 21600): 0 0 0 0 0 0 0 0 0 0 0 0 0 21 60 55 56 56 55 53 54 55 54 55 68 72 66 64
    system:memory-usage~monthly@Tue, 21 Oct 2014 00:00:00 UTC[16](31, 86400): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 57 54 62
    system:memory-usage~yearly@Sun, 19 Oct 2014 00:00:00 UTC[44](52, 604800): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8
47
Identifying and investigating a past memory issue
Correlate memory usage trends to trends found in the Threshold Monitor statistics. Example:
    system:memory-usage~daily@Tue, 21 Oct 2014 15:00:00 UTC[14](24, 3600): 70 70 72 74 74 74 71 70 69 68 67 65 66 66 66 65 66 63 62 63 67 74 77 78
    2, SSL and Cryptography: 2915475456, 2923571063, 2963431150, 2999666824, 2990583125, 2951294020, 2793216955, 2659983633, 2433017173, 2283298542, 2088039628, 1958240529, 1905309832, 1818618265, 1810595976, 1804682854, 1752943001, 1280302557, 1158922376, 1299194675, 1788627217, 2681711684, 2884126446, 2936206677,
In this example we can see how the drops and increases in SSL's memory usage match the memory usage percentage.
Investigating SSL load, worker counts, connections, etc. is the next step.
48
Bandwidth monitoring via the management console
49
Bandwidth impact on ProxySG
Client and server side bandwidth processing directly affects the ProxySG's resources: CPU and memory.
Each ProxySG platform is sized to process a certain amount of bandwidth while maintaining an appropriate level of resource usage.
Bandwidth increases beyond what the platform was sized for can cause high CPU and/or memory usage.
Changes in the types of traffic being processed, even at the same bandwidth, can also cause high CPU and/or memory usage.
Understanding bandwidth processing is needed for normal operations and problem investigations.
50
Bandwidth statistics in the management console UI
Statistics->Traffic Details->Traffic Mix
    Total processed bandwidth for client side, server side and bypassed traffic
    Multiple durations available
    Reporting by service name or proxy type
    Total bytes counted and savings calculated
51
Bandwidth statistics in the management console UI
Statistics->Traffic Details->Traffic History
    Processed bandwidth for client side, server side and bypassed traffic per service name or proxy type
    Multiple durations available
    Separate graphs showing client and server bandwidth together or individually
    Bytes counted and savings calculated
52
Bandwidth statistics in the management console UI
Both Traffic Mix and Traffic History use the same service names and proxy types.
A service name tracks bandwidth matching IPs or ports. Within that traffic, different proxies can process the traffic.
    Explicit HTTP can contain HTTPS traffic within it.
    Proxy reports give better visibility into the types of traffic processed.
53
Bandwidth statistics in the management console UI
Custom service names allow for granular reporting:
    Service name based on a new application
    Service name based on client location, client function, etc.
Custom reports are helpful in resource investigations.
54
Troubleshooting CPU and memory issues
55
Troubleshooting CPU and memory issues
Identifying the suspect component using CPU Monitor or Threshold Monitor speeds up the investigation dramatically.
Without these, graph data from the management console UI or statistical data from snapshots or heartbeats must be analyzed for correlations.
For memory issues, understand memory usage for normal operations versus a leak.
If a reboot is planned to resolve either issue, a full core should be dumped first.
    KB: http://bluecoat.force.com/knowledgebase/articles/Solution/How-do-I-enable-a-full-core-dump-on-the-ProxySG
56
High CPU in TCPIP component
CPU Monitor shows high CPU in TCPIP:
    CPU 0 99%
        TCPIP 90%
        SSL and Cryptography 4%
        HTTP and FTP 3%
        Policy evaluation - HTTP 1%
        Object Store 1%
        Access Logging 1%
        Miscellaneous 1%
Check the connection table
    Advanced URL: /TCP/connections
    Time-wait entries (2MSL)
    Attacks
Check interface statistics
    Statistics->Network-Interface
    Errors?
    Bypass data in transparent deployments
    Extremely high packets per second
57
TCP connections advanced URL
TCP connection table
    Advanced URL: /TCP/connections
    Lists all incoming and outgoing connections
    Can be very large
    Shows problematic clients that open too many connections
Connection states are listed
    Large time_wait lists can consume CPU
    Half-opened connections could be a sign of an attack
58
High CPU in HTTP or Policy components
CPU Monitor shows high CPU in HTTP or Policy:
    CPU 0 99%
        Policy evaluation - HTTP 64%
        SSL and Cryptography 16%
        HTTP and FTP 10%
        TCPIP 10%
        Object Store 1%
        Access Logging 1%
        Miscellaneous 1%
View the access log for suspicious activity live on the ProxySG
    Statistics->Access Logging, Tail main access log
Check active sessions
    Many connections from a single client?
    Suspicious destinations?
User counts and connections
    Connection table
    TCP user counts in advanced URL: /TCP/users
59
TCP users advanced URL
User information is tracked in the TCPIP stack.
    Advanced URL: /TCP/users
    The active users list shows IP addresses and the number of connections each has opened.
    High connection counts on a single IP can be a sign of an issue.
60
High CPU in SSL component
CPU Monitor shows high CPU in SSL and Cryptography:
    CPU 0 99%
        SSL and Cryptography 62%
        HTTP and FTP 20%
        Policy evaluation - HTTP 13%
        TCPIP 7%
        Object Store 1%
        Access Logging 1%
        Miscellaneous 1%
SSL interception consumes CPU.
Configurations that lower CPU:
    Disable DHE support
    Increase certificate timeout
    Add splash text to policy
KB Article: http://bluecoat.force.com/knowledgebase/articles/Solution/000024136
    SSL interception on exception is the default mode in SGOS 6.2+
    Add splash text to the SSL interception on exception policy rule
61
CPU issues caused by general load
Proxy operations involve many components: HTTP, SSL, TCPIP, Policy, DNS, Object Store.
High CPU usage divided amongst these components typically points to sizing issues (high bandwidth, high user counts).
    Busiest: HTTP, SSL, Policy
    Middle: Object Store, TCPIP
    Lowest: DNS, Access Logging
Example:
    CPU 0 99%
        SSL and Cryptography 33%
        HTTP and FTP 25%
        Policy evaluation - HTTP 20%
        TCPIP 15%
        Object Store 7%
        Access Logging 1%
        Miscellaneous 1%
62
CPU issues caused by general load
Check the following when the CPU pattern looks like general load:
Did something in the environment change that triggered the high CPU?
    More traffic moving from HTTP to HTTPS?
    Cloud services being adopted?
Check the bandwidth being processed in the Traffic Mix graphs. Is the proxy sizing correct?
Check user counts and connections. Are the values expected?
    Statistics->System->Resources->Concurrent Users
    TCP Users advanced URL: /TCP/users
Check active sessions. Are clients accessing data that should be controlled?
63
High memory in HTTP/TCP/SSL/ADN components
Analyze load in Traffic Mix. Is the bandwidth too high?
Examine connection counts, users, and worker counts
    Connection table: /TCP/connections
    Users' connections: /TCP/users
    Management console UI Statistics->Protocol details
    Each protocol has a worker or client count. Are the values the same as the baseline?
If the load looks good, verify dependency health
    Statistics->Health checks: response times for DNS, Auth, and ICAP
    Statistics->ICAP: queued connections? (a sign of an ICAP server issue)
64
Blue Coat Customer Forums
A community where you can learn from and share your valuable knowledge and experience with other Blue Coat customers
Research, post and reply to topics relevant to you at your own convenience
Blue Coat Moderator Team ready to offer guidance, answer questions, and help get you on the right track
Access at forums.bluecoat.com and register for an account today!
65
Thank you for Joining Today!
Please provide feedback on this webcast and suggestions for future webcasts to: john.dyer@bluecoat.com
Webcast replay and slide deck found here within 48 hours: https://bto.bluecoat.com/training/customer-support-technical-webcasts (Requires BTO log-in)
66
Quick Survey
We are truly committed to continuous improvement for these Technical Webcasts. At the end of the event you will be redirected to a very short survey about satisfaction with this Program. Please help us out by taking two minutes to complete it. Thank you!
Questions for Rob?