Clarity Educational Community
Root Causes and Quick Solutions
Enhancing Application Performance
Presented by: Steve Seaney & Chris Shaffer

Agenda
– Survey
– Why Performance is Important
– Improving Performance
– Monitoring Performance
– Real-world Cases

Survey
– How many of you have users who complain about CA PPM performance?
– How many of you feel performance negatively impacts the overall perception of CA PPM?
– Any great “performance improvement” stories? What did you do?

Why Performance is Critical
– Performance issues impact the usability, dependability, and scalability of a system
– Poor application performance can burden a CA PPM instance and discourage users
Usability, Scalability, Dependability

Where Performance Data Exists
There are a variety of sources providing key information on performance, but most are only available for On Premise hosting.
Rego has portlets that help monitor jobs, processes, and database (Oracle) performance.
Log & Config Files: App Logs, BG Logs, App Access Logs, Process engine
Database Reports: Oracle advantages, Deadlocks, Waits, Physical IO, Cache Memory, Long sessions, Top SQL by CPU, Top SQL by IO, Table fragmentation, Table configuration
Infrastructure Diagrams: Confirm Cores, Confirm Memory, Confirm Storage, Confirm IO, Confirm Network, GC Performance

Key Levers to Improve Performance
Performance issues are typically resolved by adjusting the base infrastructure, the application deployment, and the application configuration.
Base Infrastructure: VM Architecture, Database Cores, Database Memory, Shared usage
Application Configuration: Jobs, Pages, Security, Settings, Code Optimization, Data Limits
Application Deployment: JVM Architecture, JVM Memory, App Server Maintenance, Database Settings
On Premise Only

Base Infrastructure
With an on premise instance, one of the first things to validate is that the overall sizing and setup of the infrastructure is correct.
– These are not settings, they are changes that require purchases and/or changes to the physical/virtual setup
VM Architecture
– Virtual machines can be used for most of the infrastructure, but we recommend the production database NOT be virtual
– 1-2 Cores per APP or BG
Database Memory
– Most common problem
– CA PPM LOVES memory
Database Cores
– 1 core for every concurrent users (minimum 2 cores)
– Additional cores can be needed if heavy reporting is done during business hours
Shared usage
– Applications sharing the base infrastructure need to ‘play nice’

Application Deployment
Application deployment is how the application and database settings are configured for optimal use. This is a common area to find issues with the application technical setup.
JVM Architecture
– 1 or 2 BGs (never more than 2 for the process engine)
– 50 to 150 concurrent users per app JVM
– Leverage a XOG / Admin / Scheduler app JVM to offload non-UI users
JVM Memory
– 6 GB for the APP JVM is great
– Never less than 2 GB for the BG; typically 2.25 GB is good
Application Server Maintenance
– Weekly restarts may be necessary, but can usually be avoided if architecture is sized correctly
– Monitor heap usage for excessive CPU and garbage collecting
– Parallel garbage collection if more than 1 core
Database Settings
– CA PPM demands a high speed SAN; memory compensates for IOPs
– Do not run standard Oracle database stats job
– Contact Rego for environment specific database parameters
– Monitor and plan for Crystal and Webi reporting

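The JVM rules of thumb on the Application Deployment slide can be turned into a quick sanity check. The sketch below is only an illustration: the `Deployment` class, its field names, and `check_deployment` are hypothetical and not part of CA PPM; the thresholds are the ones quoted on the slide, treated here as advisory rather than hard limits.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical description of one environment (not a CA PPM API)."""
    app_jvms: int
    bg_jvms: int
    app_heap_gb: float
    bg_heap_gb: float
    peak_concurrent_users: int


def check_deployment(d: Deployment) -> list[str]:
    """Return advisory findings based on the slide's rules of thumb."""
    if d.app_jvms < 1 or d.bg_jvms < 1:
        raise ValueError("expected at least one app JVM and one BG JVM")

    findings = []
    # Slide: 1 or 2 BGs, never more than 2 for the process engine.
    if d.bg_jvms > 2:
        findings.append(f"{d.bg_jvms} BG JVMs configured; the guidance is 1 or 2")
    # Slide: never less than 2 GB for the BG.
    if d.bg_heap_gb < 2.0:
        findings.append(f"BG heap is {d.bg_heap_gb} GB; the guidance is at least 2 GB")
    # Slide: 6 GB for the app JVM "is great" (advisory, not a stated minimum).
    if d.app_heap_gb < 6.0:
        findings.append(f"app heap is {d.app_heap_gb} GB; 6 GB is the suggested size")
    # Slide: 50 to 150 concurrent users per app JVM.
    users_per_app = d.peak_concurrent_users / d.app_jvms
    if users_per_app > 150:
        findings.append(
            f"{users_per_app:.0f} concurrent users per app JVM; the guidance is 50 to 150"
        )
    return findings
```

For example, `check_deployment(Deployment(1, 1, 1.5, 1.0, 300))` would flag the app heap, the BG heap, and the users-per-JVM ratio, which is roughly the starting point described in the Application Sizing case later in the deck.
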
Application Configuration
Application configuration issues typically require a mix of both functional and technical changes. These changes are usually needed because of suboptimal configurations.
Jobs
– Leverage evening downtime window but coordinate with DBA
– Run database and system maintenance jobs after CA PPM jobs complete
– Avoid large data changes (e.g. post timesheets regularly rather than in large batches)
– Avoid conflicts between slicing, posting, etc.
Pages
– Pages load Portlet queries together when the page is loaded; limit pages to 1 Portlet if possible
Security
– Use global rights if possible
– Minimize the use of instance rights
Settings
– Minimize slice time ranges
– Close time and fiscal periods
– Set portlets to wait to display results until after filter
– Turn off display conditions in the project and other object list views
Code Optimization
– Suboptimal SQL and GEL is the number one issue we see
– Understand that OOTB does not always mean optimal code; you may want to re-write some OOTB portlets
Data Limits
– Minimize your set of roles
– Purge old investments, shrinking key tables like investments, tasks, assignments, time
– CA PPM contains governors for aggregation rows, Export to Excel, and XOG usage

Application Configuration – Best Practice
– Set realistic session timeouts.
– Set Attribute Value Protection settings.
– Run the Delete Process Instance job on a regular basis.
– Run the Remove Job Logs & Report Library Entries job on a regular basis.
– Ensure proper data is being audited. Do not audit every attribute.
– Run database heavy jobs off hours.

Performance Improvement Process
Always start performance work with the facts. Without understanding exactly what you are trying to fix, you will never know when to declare success. After you have the facts, addressing performance issues is an iterative process: you start with the low hanging fruit and move through additional layers over time.
1. Identify the issue. Understand the exact behavior that is causing the issue. Key questions: All users? All times of the day? Certain pages?
2. Gather baseline data. Capture as much information as you can from user timings, logs, AWR reports, etc. Make sure you have something to compare to once changes are made.
3. Decide on an Action. Review the data to find your first course of action (may be a few things): a theory about what the issue may be and how to solve it.
4. Make the Change. Test the theory by making changes based on your initial thoughts. Go after low hanging fruit or larger changes, trying to remove “big” things early on.
5. Analyze and Regroup. Analyze results compared with the baseline, maybe collect more data, and determine next steps.

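Comparing results against the baseline (gathered before the change, analyzed after it) can be as simple as comparing median page timings. The sketch below is one possible way to do that; `compare_to_baseline` and the idea of using hand-collected timings in seconds are assumptions for illustration, not something the presenters prescribe.

```python
from statistics import median


def compare_to_baseline(baseline_secs, after_secs):
    """Compare median response times before and after a change.

    Both arguments are lists of timings in seconds for the same page or
    portlet (hypothetical, e.g. collected by hand or from access logs).
    A negative change_pct means the page got faster.
    """
    if not baseline_secs or not after_secs:
        raise ValueError("need at least one timing before and after the change")

    base = median(baseline_secs)
    after = median(after_secs)
    if base <= 0:
        raise ValueError("baseline median must be positive")

    return {
        "baseline_median": base,
        "after_median": after,
        "change_pct": 100.0 * (after - base) / base,
    }


if __name__ == "__main__":
    result = compare_to_baseline([8.0, 10.0, 12.0], [4.0, 5.0, 6.0])
    print(result)
```

The median is used instead of the mean so that a single unusually slow page load does not dominate the comparison; with enough samples, a high percentile could be tracked the same way.
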
Monitoring Performance
Proactively monitor CA PPM utilization, user response, and performance.
– Remote monitoring On-Premise and On-Demand
– Insights BEFORE end user concerns
– When and how end users are using CA PPM

DISCUSSION
Any questions before we continue?

Real-world Case – Application Sizing
Symptom
– Application heap dumps, out of memory warnings, and periodic slowness
– Database appears to be healthy
Clues
– CA PPM 13 requires additional capacity (memory & cores)
– JVM heaps were 1.5 GB for the app and 1.0 GB for the BG
– Application JVMs spending significant time garbage collecting
– Symptoms appear during peak usage time
– Database is healthy
Resolution
– Increase memory for JVMs to 6 GB for the apps and 2 GB for the BG
– Increase database cores
– Add JVM for XOGs / Admin / Scheduler
Result
– Stable Dev environment
– Pushing to production
– If cores are limited, try memory increase first and watch GC performance

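One way to put a number on "significant time garbage collecting" is the share of wall-clock time spent in GC pauses over an observation window. The sketch below assumes the pause durations have already been extracted from the JVM's GC log by some other means; the function name and the inputs are hypothetical, and the deck does not state a threshold for what counts as excessive.

```python
def gc_overhead_pct(pause_seconds, window_seconds):
    """Percentage of an observation window spent in GC pauses.

    pause_seconds: iterable of individual GC pause durations, in seconds
                   (hypothetical input, e.g. parsed from a GC log).
    window_seconds: length of the observation window, in seconds.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return 100.0 * sum(pause_seconds) / window_seconds


if __name__ == "__main__":
    # Three pauses totalling 3 seconds in a one-minute window.
    print(gc_overhead_pct([0.5, 1.0, 1.5], 60))
```

Tracking this figure before and after a heap increase gives a concrete way to "watch GC performance" during peak usage.
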
Real-world Case – Database Tuning
Symptom
– Frequent & persistent application slowness
– Database CPU and application CPU are inconsistent
Clues
– Long running sessions on the database (more than a few minutes)
– Blocking sessions
– High physical I/O
Resolution
– Moved to the CA PPM stats job (Oracle) nightly & disabled the Oracle cron job
– Increased the size of the database cache (PGA and SGA)
– Rebuilt indexes associated with long running SQL
Result
– Stable and consistent production environment
– Exposed additional queries that needed to be tuned
– Exposed need to establish tuning and maintenance process

Real-world Case – User Portlets and Configuration
Symptom
– Select users experience long waits with select portlets and pages
– Periodic system slowness
– Oddly high database CPU
Clues
– Long running sessions on the database (more than a few minutes)
– Top SQL includes portlet queries with poor SQL
Resolution
– Tuned the queries causing the issue
– Changed the page configurations to render on filter only
– Set default filter for the portlets
Result
– Improved portlet speed
– Eliminated long running sessions (which had led to system issues)
– This can also apply to Webi and reports

Real-world Case – Security Rights
Symptom
– Most users experiencing long wait times throughout the application
– Periodic system slowness
Clues
– Admins did not experience slowness
– Tested key portlets by removing the OOTB calls to see difference in timing
Resolution
– Tuned the queries causing the issue by moving the security call into an inner join rather than leaving it at the end
– Removed 70% of the instance and OBS rights and moved to global “View” of resources and projects
Result
– Portlet response time cut in half, nearly equal to admins
– Login and general navigation faster
– CA PPM uses OOTB security calls throughout

Questions
We hope that you found this session informative and worthwhile. Our primary goal was to increase your understanding of the topic and CA PPM in general. There were many concepts covered during the session. If you would like to contact any presenter with questions, please reach out to us.
Thank you for attending regoUniversity 2015!