Problem Management Practitioners Forum Thursday January 19, 2012 Jon Dowell Jorge A. Wong.

1 Problem Management Practitioners Forum Thursday January 19, 2012 Jon Dowell Jorge A. Wong

2 Agenda o Housekeeping & Introductions o Define a successful investigation o Makeup of a successful Problem Manager o Proactive monitoring of automated alerts for trends/patterns o Impact of Change Management on PbM o Feedback & next steps

3 Housekeeping & Introductions Fire & Washrooms Name, Company, & Experience Jon Dowell

4 Jon Dowell Senior Consultant with KSLD Consulting. 15 years of experience solving I.T. mysteries. Facilitation and critical thinking during: o Major Incidents o Problem investigations o Project quality assessments prior to go-live o Project warranty periods Training and mentoring o Critical thinking o Root cause analysis o Impact assessments o Potential risks associated with requests for change KSLD Consulting specializes in I.T. Problem Management and problem solving for today’s busy world.

5 Jorge Wong Over 13 years in IT with Enmax and Accenture o Senior Systems Analyst o Applications Support Team Lead o Contact Center Technology Team Lead o Service Delivery Lead o Relationship Manager o Problem Manager ITIL Background Focuses on reactive and proactive problem management Facilitates and conducts problem investigations using the cause mapping analysis method to capture the complete investigation in order to: o Assess impact and cost o Identify root cause(s) o Identify the best solution(s) to prevent recurrence Reviews and analyzes incident management data to pinpoint the problems whose resolution will give the best results.

6 Define a successful investigation Jorge A. Wong

7 Successful Problem Investigations Must first understand: Why do we have problem investigations? An investigation should be conducted to diagnose the root cause of the problem. How long should it take? The speed and nature of the investigation will vary depending upon the impact, severity, and urgency of the problem. What resources are required? The appropriate level of resources and expertise should be applied to finding a resolution, corresponding to the priority and service levels targeted. Then, use your problem investigation toolkit. There are many problem analysis, diagnosis, and solving techniques available, and much research has been done in this area.

8 Successful Problem Investigations Some of the most useful and frequently used techniques include:
o Chronological analysis: Timeline of events
o Pain Value Analysis: What level of pain has been caused to the organization/business by these problems
o Kepner and Tregoe: Deeper rooted problems
o Cause Mapping: Deeper rooted problems
o 5 Whys: Cause and effect
o Brainstorming: Gather together the relevant people and brainstorm the problem
o Ishikawa Diagrams: Document causes and effects which can be useful in helping identify where something may be going wrong, or be improved
o Pareto Analysis: Separate important potential causes from more trivial issues
Use what is appropriate and what you feel comfortable with.
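Of the techniques listed, Pareto Analysis is the easiest to illustrate with a few lines of code. The following is only a minimal sketch, assuming incidents have already been tagged with a category; the category names, the counts, and the 80% cutoff are hypothetical and not taken from the presentation.

```python
from collections import Counter

def pareto(categories, cutoff=0.8):
    """Return the most frequent categories that together reach `cutoff`
    of all incidents, as (category, count, cumulative_share) tuples."""
    counts = Counter(categories)
    total = sum(counts.values())
    vital_few = []
    cumulative = 0
    for category, count in counts.most_common():
        # Stop once the categories collected so far cover the cutoff share.
        if cumulative / total >= cutoff:
            break
        cumulative += count
        vital_few.append((category, count, cumulative / total))
    return vital_few

# Hypothetical incident categories pulled from an incident management tool.
incidents = (["password"] * 50 + ["network"] * 30 + ["printer"] * 12
             + ["email"] * 5 + ["other"] * 3)

vital_few = pareto(incidents)
for category, count, share in vital_few:
    print(f"{category}: {count} incidents, cumulative {share:.0%}")
```

With this sample data, the first two categories already account for 80% of the incidents, so those are the candidates worth investigating first.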

9 Successful Problem Investigations o End results o Expected and desired outcome realized o Root cause(s) identified and/or validated o Corrective measure(s) identified and/or implemented o Effective use of resources throughout the investigation o Which means o Increased benefits to the business and the IT organization: o Decreased downtime o Increased business satisfaction o Decreased amount of IT resources spent on incident management o Other benefits o Influences future cost avoidance o CMDB o Improved IT service quality o Incident volume reduction o Permanent solutions o Improved organizational learning o Better first time fix rate at the Service Desk o Improves existing processes and procedures o Happy Staff, including Problem Manager!

10 What makes a successful Problem Manager? Jon Dowell

11-19 Root Cause (diagram slides)

20 Kepner Tregoe has a process called Incident Mapping that works in a similar way.

21 ThinkReliability also has a process called "Cause Mapping"

22 Brainstorm traits for a good Problem Manager…

23 Are these good Problem Managers?

24

25 What about these individuals…?

26

27 What are the traits of a Problem Manager? Listening o Ability to listen o Attention to detail… while listening Questioning o Open questions… to allow the story to flow o Closed questions… to confirm facts/details o Ability to ask tough questions and not be sidetracked by misdirection. Leadership Ability to lead teams, resolve conflict, and drive resolution. Prioritization with a focus on business, not technical, impact. Strong organization & time management abilities. Business writing skills And… Understanding of business terminology and concepts. Understanding of basic technical concepts, architecture, and methodologies.

28 Helpful educational opportunities? o Dale Carnegie o Kepner Tregoe: Problem Solving & Decision Making, Incident Mapping o ThinkReliability: Cause Mapping o FranklinCovey: Focus o General: Business Writing

29 Proactive monitoring of automated alerts for trends/patterns Jorge Wong

30 Alerts and monitoring, why? Identify future problems. Prevent problems from happening. Manage technology infrastructure based on business. Anticipate and meet the needs of the business. Effectively manage an increasingly intricate and complex infrastructure. Predict and solve problems before they affect business. According to industry analyst reports, IT still discovers about 70% of problems through the service desk.

31 Alerts and monitoring, why? o Reactive to Proactive o End-user experience o Application performance and availability o Service level commitments o Outages o Cost avoidance o Resources o Productivity o Efficiency o Capacity o Predictive analytics o MTTR o MTBF
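MTTR and MTBF, the last two items in the list above, can be derived from outage records. This is a minimal sketch, assuming the common definitions (MTTR = total downtime / number of failures, MTBF = total uptime / number of failures); the function name, the observation window, and the outage timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def mttr_mtbf(outages, window_start, window_end):
    """Return (MTTR, MTBF) as timedeltas for outages given as
    (start, end) pairs inside the observation window."""
    failures = len(outages)
    if failures == 0:
        return None, None  # nothing failed, so neither metric is defined
    downtime = sum((end - start for start, end in outages), timedelta())
    uptime = (window_end - window_start) - downtime
    return downtime / failures, uptime / failures

# Hypothetical outage log for one service over a 30-day window.
window_start = datetime(2012, 1, 1)
window_end = datetime(2012, 1, 31)
outages = [
    (datetime(2012, 1, 3, 9, 0), datetime(2012, 1, 3, 11, 0)),    # 2 hours
    (datetime(2012, 1, 12, 14, 0), datetime(2012, 1, 12, 18, 0)), # 4 hours
    (datetime(2012, 1, 25, 1, 0), datetime(2012, 1, 25, 4, 0)),   # 3 hours
]

mttr, mtbf = mttr_mtbf(outages, window_start, window_end)
print("MTTR:", mttr)
print("MTBF:", mtbf)
```

Tracking both values over successive periods is one way to see whether a move from reactive to proactive work is paying off: MTTR should fall and MTBF should rise.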

32 Alerts and monitoring, what?
o Demand
o Capacity
o Availability
o KPIs
o Logs
o Services
o Network
o Servers
o User Defined Monitoring and Instant Alerts
  - Monitor the Windows Event log
  - Alert on hardware and software changes
  - Alert on specific file changes and protection violations
  - Know if disk space is running low on computers
  - Monitor computer online/offline status
  - Know if a server goes down
  - Know when traveling users with notebooks connect
  - Alert message and recipient configuration
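As one illustration of the "disk space is running low" item, the check below uses Python's standard `shutil.disk_usage` call. It is a sketch only: the helper names and the default 80% limit are assumptions made for the example, and a real monitoring tool would add scheduling and alert delivery.

```python
import shutil

def disk_percent_used(path="."):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total

def is_disk_low(percent_used, limit=80.0):
    """True when usage has reached the (assumed) alert limit."""
    return percent_used >= limit

percent = disk_percent_used(".")
if is_disk_low(percent):
    print(f"ALERT: disk is {percent:.1f}% full")
else:
    print(f"OK: disk is {percent:.1f}% full")
```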

33 Alerts and monitoring, what?
o Proactive approach
  - Server's utilization exceeds predefined percentage of total capacity available... raise alert!
  - Server CPU breaches 90% utilization, or disk becomes 80% full.
o Food For Thought
  - What happens when a server goes down?
  - Alarms, alerts, and notifications are triggered all over the place.
  - The application, database, and operating system may appear to be down.
  - However, this problem behavior may be due to a single point of failure elsewhere in the network.
  - What is the problem?
  - What is the impact?
  - What is or are the root causes?
  - What is or are the workarounds and resolutions?
  - Or... should we even be worried about it?
  - Problem Management Categories
  - Reactive
  - Proactive
  - Predictive Intelligence?
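The "food for thought" scenario, where many alerts trace back to a single point of failure, can be sketched as a simple dependency check. The example assumes a dependency map is available (for instance from a CMDB); the component names, the map, and the function name are all hypothetical.

```python
def probable_origins(alerting, depends_on):
    """Return the alerting components none of whose dependencies are
    also alerting. These are the likeliest places to start looking."""
    origins = []
    for component in sorted(alerting):
        upstream = depends_on.get(component, [])
        # If something this component relies on is also alerting, the
        # component is more likely a symptom than the origin.
        if not any(dep in alerting for dep in upstream):
            origins.append(component)
    return origins

# Hypothetical dependency map: each component lists what it relies on.
depends_on = {
    "application": ["database", "os"],
    "database": ["os"],
    "os": ["core-switch"],
    "core-switch": [],
}

# Everything appears to be down at once.
alerting = {"application", "database", "os", "core-switch"}

origins = probable_origins(alerting, depends_on)
print("Investigate first:", origins)
```

This only narrows the search. Confirming the root cause, the impact, and the workaround still needs the investigation questions listed above.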

34 Feedback & next steps Jorge A. Wong

35 Next Steps
o Future sessions
  - Problem Management Practitioner Forums 2012
    - January 19 (9am - Noon)
    - March 15 (9am - Noon)
    - June 7 (9am - Noon)
    - Followed by casual lunch
  - Change Management Practitioner Forum 2012
    - April 12 (9am - Noon)
  - Business Analyst World Conference 2012
    - May 7, 8, & 9
  - Practitioner Forums 2012
    - Looking for subject ideas
    - Configuration Management
    - Service Level Management
    - Looking for thought leaders and interested participants

36 15 Minute Break

37 Thank you!

38 Appendix

39 Problem Management: What is it? What is it not? Jorge A. Wong

40 IT Problem Management What is a Problem? A cause of one or more Incidents. The cause is not usually known at the time a Problem Record is created. What is Problem Management? The objective of Problem Management is to resolve the root cause of Incidents, and to prevent the recurrence of Incidents related to these errors. What does a Problem Manager do? The Problem Manager is responsible for managing the lifecycle of all Problems. The Problem Manager researches the root causes of Incidents and thus ensures the lasting elimination of interruptions. The primary objectives are to prevent Incidents from happening, and to minimize the impact of Incidents that cannot be prevented.

41 What is Root Cause Analysis? A standard process of:
o Identifying a problem: What happened?
o Containing and analyzing the problem: What were the root causes of the problem?
o Defining the root cause: What internal options are available to deal with the problem?
o Defining and implementing the actions required to eliminate the root cause: What is the cost of acting upon the available options?
o Validating that the corrective action prevented recurrence of problem: Which decision options will provide the most cost-effective solution?
(Diagram labels: Validate, Follow Up Plan, Complete Plan, Action Plan, Root Cause, Immediate Action, Identify Team, Identify Problem)

42 At a high level, problem investigation looks at: What were we doing? (Before Major Incident, Incident) What was the problem? Why did it happen? What should be done? What will we be doing now? (After Problem Investigation)

