1 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.1 Failure Prevention and Recovery Chapter coverage: System failure Failure detection and analysis Improving process reliability Recovery

2 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.2 Failure There is always a chance that things might go wrong; we must accept this, NOT ignore it. Critical failure: –Loss of customers –High downtime –High repair cost –Injury or loss of life (company reputation) Non-critical failure: lesser effect. Organizations must discriminate between failures and give priority to critical ones, asking “why things fail” and “how to measure the impact of failure”.

3 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.3 Failure as an Opportunity All failure can be traced back to some kind of human failure. –A machine failure might have been caused by someone’s poor design or maintenance. –A delivery failure might have been someone’s error in managing the supply schedule. Failures are rarely the result of random chance. –They can be controlled to a certain extent. –Organizations can learn from failure and change accordingly. Failure is an opportunity to examine and plan for elimination.

4 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.4 System Failure Why things fail: 1)Failure resulting from within the operation: Design failure Facilities failure People failure 2)Failure resulting from material or information input Supplier failure 3)Failure resulting from customer actions Customer failure

5 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.5 Why Things Fail Design failure: –Operations may look fine on paper but cannot cope with real circumstances. –Type 1: A characteristic of demand was overlooked or miscalculated. Example: a bearing factory is designed to produce 100 bearings per day but customers demand 125 bearings per day. –Type 2: The circumstances under which the operation has to work are not as expected. Example: a factory building designed to house stationary machinery fails when it is used to house a vibrating machine.

6 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.6 Why Things Fail Facilities failure: –All facilities (machines, equipment, buildings, fittings) are liable to ‘break down’. –Type 1: Partial breakdown. Examples: a worn-out carpet in a hotel; a machine that can only work at half its normal rate. –Type 2: Complete breakdown. Example: a sudden stop of operation. –It is the effect of the breakdown that is important: some breakdowns could paralyse the whole operation. –Some failures have a significant cumulative impact.

7 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.7 Why Things Fail People failure: –Type 1: ‘Errors’ are mistakes in judgement. Example: a manager’s decision to continue running the plant with a partially failed heat exchanger resulted in a more expensive complete breakdown. –Type 2: ‘Violations’ are acts which are contrary to defined operating procedures. Example: a machine operator’s failure to lubricate the bearings of the motor resulted in the bearings overheating and failing.

8 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.8 Why Things Fail Supplier failure: –A supplier’s failure to deliver, to deliver on time, or to deliver quality goods and services can lead to failure within an operation. Customer failure: –Customer failure can result when customers misuse products and services. Example: someone loading a 14kg washing machine with 18kg of clothes will cause the machine to fail.

9 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.9 Measuring Failure There are three main ways of measuring failure: –Failure rates: how often a failure occurs. –Reliability: the chances of failure occurring. –Availability: the amount of available useful operating time.

10 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.10 Measuring Failure Failure rate (FR): Example: If an engine fails 4 times while operating for 300 hours, it has a failure rate of 4/300 = 0.013 failures per hour. Example: If 5 out of 250 products tested for operability fail, the failure rate is 5/250 = 0.02 (2%).
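
The two examples on this slide use two different denominators: operating time for the engine and number of units tested for the products. A minimal sketch of both calculations follows; the function names are illustrative and do not come from the slides.

```python
def failure_rate_by_time(failures: int, operating_hours: float) -> float:
    """Failures per hour of operation."""
    if operating_hours <= 0:
        raise ValueError("operating_hours must be positive")
    return failures / operating_hours


def failure_rate_by_count(failures: int, units_tested: int) -> float:
    """Fraction of tested units that failed."""
    if units_tested <= 0:
        raise ValueError("units_tested must be positive")
    return failures / units_tested


engine_rate = failure_rate_by_time(4, 300)     # engine example from the slide
product_rate = failure_rate_by_count(5, 250)   # product test example from the slide

print(f"{engine_rate:.4f} failures per hour")
print(f"{product_rate:.1%} of tested units failed")
```

Only the count-based rate is naturally a percentage; the time-based rate carries the unit "failures per hour".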

11 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.11 Measuring Failure Failure over time: the ‘bath-tub’ curve. At different stages during the life of anything, the probability of it failing will be different. The failure pattern of most physical entities follows the bath-tub curve.

12 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.12 The ‘bath-tub curve’ comprises three stages: The ‘infant-mortality’ stage, where early failures occur, caused by defective parts or improper use. The ‘normal-life’ stage, when the failure rate is low and reasonably constant, and failures are caused by normal random factors. The ‘wear-out’ stage, when the failure rate increases as the part approaches the end of its working life and failure is caused by the ageing and deterioration of parts.

13 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.13 Bath-Tub Curve [Figure: failure rate plotted against time, showing the infant-mortality stage, the normal-life stage and the wear-out stage, with points x and y marked.]
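
The shape of the curve can be sketched numerically. The function below is an illustration only: it adds a decaying early-failure term, a constant baseline and a growing wear-out term. The functional form and every parameter value are assumptions chosen to produce a bath-tub shape; none of them come from the slides.

```python
import math


def bathtub_failure_rate(t, early=0.05, early_decay=50.0,
                         baseline=0.002, wear_start=1000.0, wear_scale=100.0):
    """Illustrative failure rate at time t (all parameters hypothetical)."""
    infant_mortality = early * math.exp(-t / early_decay)        # fades quickly
    wear_out = baseline * math.exp((t - wear_start) / wear_scale)  # grows late in life
    return infant_mortality + baseline + wear_out


for t in (0, 500, 1200):
    print(t, round(bathtub_failure_rate(t), 4))
```

The rate is high at the start, low and nearly flat in mid-life, and rises again towards the end, matching the three stages described above.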

14 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.14 Reliability –Measures the probability that a system, product or service performs as expected over time. –Takes values between 0 and 1 (0 to 100% reliability). –Used to relate the parts of a system to the system as a whole. If the components in a system are all interdependent, a failure in any individual component will cause the whole system to fail. Hence the reliability of the whole system, Rs, is: Rs = R1 × R2 × R3 × … × Rn, where R1 = reliability of component 1, R2 = reliability of component 2, R3 = reliability of component 3, etc.

15 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.15 Worked Example An automated pizza-making machine in a food manufacturer’s factory has five major components, with individual reliabilities (the probability of the component not failing) as follows:
Dough mixer: reliability = 0.95
Dough roller and cutter: reliability = 0.99
Tomato paste applicator: reliability = 0.97
Cheese applicator: reliability = 0.90
Oven: reliability = 0.98
If one of these parts of the production system fails, the whole system will stop working. Thus the reliability of the whole system is: Rs = 0.95 × 0.99 × 0.97 × 0.90 × 0.98 = 0.805

16 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.16 Worked Example Notes: –The reliability of the whole system is about 0.8 even though the reliability of each individual component is higher. –If the system had more components, its reliability would be lower. –E.g. for a system with 10 components having reliability of 0.99 each, the reliability of the system is about 0.9, BUT if the system has 50 components having reliability of 0.99 each, the reliability of the system reduces to about 0.6. Reliability chart given on page 687 of recommended text.
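
The worked example and the notes on component count can be checked with a few lines of code. This is a minimal sketch of the product rule for interdependent components; the function name `series_reliability` is illustrative.

```python
from math import prod


def series_reliability(component_reliabilities):
    """Reliability of a system that stops if any one component fails."""
    values = list(component_reliabilities)
    for r in values:
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"reliability must be between 0 and 1, got {r}")
    return prod(values)


# Component reliabilities from the pizza-making machine example.
pizza_machine = {
    "dough mixer": 0.95,
    "dough roller and cutter": 0.99,
    "tomato paste applicator": 0.97,
    "cheese applicator": 0.90,
    "oven": 0.98,
}

rs = series_reliability(pizza_machine.values())
print(round(rs, 3))

# Effect of component count when every component has reliability 0.99.
for n in (10, 50):
    print(n, round(series_reliability([0.99] * n), 3))
```

Because every factor is at most 1, adding a component can never raise the system reliability, which is why long chains of individually reliable parts can still be fragile.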

17 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.17 Availability –Availability is the degree to which the operation is ready to work. –An operation is not available if it has either failed or is being repaired following a failure.
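
The slide gives no formula for availability. A commonly used definition, assumed here rather than taken from the slide, is availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair. The figures in the sketch are hypothetical.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the operation is ready to work.

    mtbf_hours: mean time between failures
    mttr_hours: mean time to repair
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical machine: fails on average every 70 hours, takes 6 hours to repair.
a = availability(70, 6)
print(f"{a:.1%}")
```

Under this definition, availability can be improved either by failing less often (higher MTBF) or by repairing faster (lower MTTR).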

18 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.18 The three tasks of failure prevention and recovery Failure detection and analysis Finding out what is going wrong and why Improving system reliability Stopping things going wrong Recovery Coping when things do go wrong

19 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.19 Failure detection and analysis Mechanisms to detect failure: 1.In-process checks 2.Machine diagnostic checks 3.Point-of-departure interviews 4.Phone surveys 5.Focus groups 6.Complaint cards or feedback sheets 7.Questionnaires

20 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.20 Failure detection and analysis Mechanisms to detect failure: 1.In-process checks: employees check that the service is acceptable while it is being delivered. Example: “Is everything alright with your meal, madam?” 2.Machine diagnostic check: a machine is put through a prescribed sequence of activities to expose any failures or potential failures. Example: a heat exchanger tested for leaks, cracks and wear.

21 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.21 Failure detection and analysis Mechanisms to detect failure: 3.Point-of-departure interviews: at the end of a service, staff may check that the service has been satisfactory. 4.Focus groups: groups of customers are brought together to discuss some aspects of a product or service. 5.Phone surveys, complaint cards and questionnaires: these can be used to ask for opinions about products or services.

22 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.22 Failure analysis: 1.Accident investigation. Trained staff analyse the cause of the accident and make recommendations to minimize or eradicate the chance of the failure happening again. Specialized investigation techniques are suited to the type of accident. 2.Product liability. Ensures all products are traceable. Products can be traced back to the process, the components from which they were produced and the supplier who supplied them. Goods can be recalled if necessary.

23 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.23 3.Complaint analysis Complaints and compliments are recorded and taken seriously. Cheap and easily available source of information about errors. Involves tracking number of complaints over time. 4.Critical incident analysis Requires customers to identify the elements of products or services they found either satisfying or not satisfying. Especially used in service operations.

24 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.24 5.Failure mode and effect analysis (FMEA) Used to identify failures before they happen so that proactive measures can be taken. For each possible cause of failure the following types of question are asked: What is the likelihood that a failure will occur? What would the consequence of the failure be? How likely is such a failure to be detected before it affects the customer? A risk priority number (RPN) is calculated based on these questions. Corrective action is taken based on the RPN.
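
The slide does not say how the RPN is computed. A common convention, assumed here, rates each of the three questions on a 1 to 10 scale and multiplies the ratings, with a higher detection rating meaning the failure is harder to detect. The failure modes and ratings below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    occurrence: int  # 1 (very unlikely) to 10 (almost certain)
    severity: int    # 1 (negligible) to 10 (catastrophic)
    detection: int   # 1 (almost always detected) to 10 (almost never detected)

    def rpn(self) -> int:
        """Risk priority number as the product of the three ratings."""
        for rating in (self.occurrence, self.severity, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be between 1 and 10")
        return self.occurrence * self.severity * self.detection


# Hypothetical failure modes for illustration only.
modes = [
    FailureMode("oven thermostat drift", occurrence=4, severity=7, detection=6),
    FailureMode("dough mixer jam", occurrence=6, severity=3, detection=2),
    FailureMode("cheese applicator blockage", occurrence=3, severity=5, detection=8),
]

# Highest RPN first: these are the candidates for corrective action.
ranked = sorted(modes, key=lambda m: m.rpn(), reverse=True)
for mode in ranked:
    print(mode.rpn(), mode.name)
```

Note that a frequent but easily detected failure (the mixer jam) can rank below a rarer failure that tends to reach the customer unnoticed.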

25 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.25 6.Fault-tree analysis This is a logical procedure that starts with a failure or potential failure and works backwards to identify all the possible causes and therefore the origins of that failure. The tree is made up of branches connected by AND nodes and OR nodes. All the branches below an AND node need to occur for the event above the node to occur. Only one of the branches below an OR node needs to occur for the event above the node to occur.

26 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.26 Fault-tree analysis for below-temperature food being served to customers
Top event: Food served to customer is below temperature
Branch "Food is cold": Oven malfunction; Timing error by chef; Ingredients not defrosted
Branch "Plate is cold": Cold plate used; Plate taken too early from warmer; Plate warmer malfunction
Key: AND node, OR node
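
The AND/OR logic of a fault tree can be evaluated mechanically. The sketch below represents a tree as nested tuples and checks whether the top event occurs for a given set of basic events. The transcript does not preserve which nodes in the food example are AND and which are OR, so the OR gates used here are an assumption; the backup-power tree is a purely hypothetical example added to show an AND node.

```python
def occurs(node, basic_events):
    """Return True if the event represented by node occurs.

    node is either a basic event name (str) or a tuple (gate, children),
    where gate is "AND" or "OR" and children is a list of nodes.
    """
    if isinstance(node, str):
        return node in basic_events
    gate, children = node
    results = [occurs(child, basic_events) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate}")


# Gate types below are assumed, not taken from the slide.
food_is_cold = ("OR", ["oven malfunction", "timing error by chef",
                       "ingredients not defrosted"])
plate_is_cold = ("OR", ["cold plate used", "plate taken too early from warmer",
                        "plate warmer malfunction"])
food_below_temperature = ("OR", [food_is_cold, plate_is_cold])

# Hypothetical AND example: power is lost only if both sources fail.
power_lost = ("AND", ["mains failure", "generator failure"])

print(occurs(food_below_temperature, {"timing error by chef"}))
print(occurs(power_lost, {"mains failure"}))
```

Working backwards from the top event in this way makes it easy to see which single causes are sufficient (under OR nodes) and which must coincide (under AND nodes).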

27 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.27 To be continued…

28 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.28 Improving Process Reliability Once the cause and effect of a failure are known, the next course of action is to try to prevent the failure from taking place. This can be done in a number of ways: –Designing out fail points in the process –Building redundancy into the process –‘Fail-safeing’ some of the activities in the process –Maintenance of the physical facilities in the process

29 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.29 Designing out fail points: Identifying and then controlling process, product and service characteristics to try to prevent failures. Process maps are used to detect potential fail points in operations. Redundancy: Building redundancy into an operation means having back-up systems in case of failure. It increases the reliability of a component, but it is an expensive solution, so it is used for breakdowns with critical impact.
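
The gain from redundancy can be quantified. A standard result, assumed here since the slide gives no formula, is that a component with independent back-ups fails only if all of them fail, so the combined reliability is 1 minus the product of the individual failure probabilities. The sketch assumes independent failures and perfect switch-over to the back-up; the 0.90 figures are hypothetical.

```python
def reliability_with_backups(reliabilities):
    """Reliability of a component plus its back-ups, assuming independent
    failures and perfect switch-over (both are simplifying assumptions)."""
    chance_all_fail = 1.0
    for r in reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"reliability must be between 0 and 1, got {r}")
        chance_all_fail *= (1.0 - r)
    return 1.0 - chance_all_fail


# Hypothetical component with reliability 0.90 and an identical back-up.
single = reliability_with_backups([0.90])
with_backup = reliability_with_backups([0.90, 0.90])
print(round(single, 4), round(with_backup, 4))
```

For two components this reduces to Ra + Rb × (1 − Ra): the back-up only matters in the fraction of cases where the primary has failed.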

30 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.30 Fail-safeing Called poka-yoke in Japan. Based on the principle that human mistakes are to some extent inevitable. The objective is to prevent them from becoming defects. Poka-yokes are simple (preferably inexpensive) devices or systems which are incorporated into a process to prevent inadvertent operator mistakes resulting in a defect.

31 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.31 Maintenance Maintenance is the method used by organizations to avoid failure by taking care of their physical facilities. It is important to organizations whose physical facilities play a central role in creating their goods and services. Benefits of maintenance: Enhanced safety; Increased reliability; Higher quality; Lower operating costs; Longer life span; Higher end value.

32 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.32 Benefits of Maintenance Enhanced safety: Well maintained facilities are less likely to behave in an unpredictable or non-standard way, or fail outright, all of which would pose a hazard to staff. Increased reliability – This leads to less time lost while facilities are repaired, less disruption to the normal activities of the operation, and less variation in output rates. Higher quality – Badly maintained equipment is more likely to perform below standard and cause quality errors.

33 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.33 Benefits of Maintenance Lower operating costs – Many pieces of process technology run more efficiently when regularly serviced. Longer life span – Regular care prolongs the effective life of facilities by reducing the problems in operation whose cumulative effect causes deterioration. Higher end value – Well maintained facilities are generally easier to dispose of into the second-hand market.

34 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.34 Approaches to maintenance 1.Run to breakdown (RTB) Allowing the facilities to continue operating until they fail. Maintenance work is performed after failure has taken place. Used when the effect of the failure is neither catastrophic nor frequent, e.g. it does not paralyse the whole operation. Regular checks are sufficient.

35 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.35 Approaches to maintenance 2.Preventive maintenance (PM) Attempts to eliminate or reduce the chances of failure by servicing the facilities at pre-planned intervals. Used when the consequence of failure is considerably more serious. Can be used to detect impending failures. Remedial actions can be planned for, thus improving overall efficiency. The useful life of certain components can be increased beyond their recommended life span.

36 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.36 Approaches to maintenance 3.Condition-based maintenance (CBM) Attempts to perform maintenance only when the facilities require it. May involve continuously monitoring parameters (vibration, temperature, displacement) of the facility. The results of the monitored parameters are used to decide whether to stop the facility to conduct maintenance.
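
The decision rule in condition-based maintenance can be as simple as comparing each monitored reading with an alarm limit. The sketch below assumes a fixed upper limit per parameter; the parameter names, limits and readings are all hypothetical, and a real scheme might also look at trends rather than single readings.

```python
# Hypothetical alarm limits; real limits would come from the equipment
# manufacturer or the operation's own maintenance standards.
ALARM_LIMITS = {
    "vibration_mm_s": 7.0,
    "temperature_c": 85.0,
    "displacement_mm": 0.5,
}


def parameters_over_limit(readings, limits=ALARM_LIMITS):
    """Return the names of monitored parameters whose reading exceeds its limit."""
    return sorted(
        name for name, value in readings.items()
        if name in limits and value > limits[name]
    )


def should_stop_for_maintenance(readings, limits=ALARM_LIMITS):
    """Stop the facility if any monitored parameter is over its limit."""
    return len(parameters_over_limit(readings, limits)) > 0


latest = {"vibration_mm_s": 8.2, "temperature_c": 71.0, "displacement_mm": 0.3}
print(parameters_over_limit(latest))
print(should_stop_for_maintenance(latest))
```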

37 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.37 4.Mixed maintenance strategies Most operations adopt a mixture of these approaches because different elements of their facilities have different characteristics. Approaches to maintenance Use ???

38 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.38 Approaches to maintenance 5.Run to breakdown versus preventive maintenance The more frequently preventive maintenance is carried out, the lower the chance of the facility breaking down. The cost of preventive maintenance is often high. Infrequent preventive maintenance will cost less but will result in a higher chance of breaking down. The cost of an unplanned breakdown is often high.

39 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.39 Cost of Preventive Maintenance [Figure: costs of PM plotted against amount of preventive maintenance.]

40 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.40 Cost of Breakdown [Figure: costs of breakdown plotted against amount of preventive maintenance.]

41 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.41 Maintenance cost model 1: One model of the costs associated with preventive maintenance shows an optimum level of maintenance effort. [Figure: costs plotted against amount of preventive maintenance, with curves for the cost of providing preventive maintenance, the cost of breakdowns and total cost; the ‘optimum’ level of preventive maintenance is marked.]
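
The idea of an optimum can be illustrated numerically: total cost is the sum of a rising preventive maintenance cost and a falling breakdown cost, and the optimum is the level at which that sum is lowest. The two cost functions below are hypothetical shapes chosen for illustration; the slides give no equations or figures for them.

```python
def pm_cost(level: float) -> float:
    """Hypothetical cost of providing preventive maintenance (rises with level)."""
    return 40.0 * level


def breakdown_cost(level: float) -> float:
    """Hypothetical cost of breakdowns (falls as more PM is carried out)."""
    return 1000.0 / (1.0 + level)


def total_cost(level: float) -> float:
    return pm_cost(level) + breakdown_cost(level)


# Search a grid of maintenance levels from 0 to 20 in steps of 0.5.
levels = [i / 2 for i in range(41)]
optimum_level = min(levels, key=total_cost)
print(optimum_level, total_cost(optimum_level))
```

With different cost shapes the optimum moves. A flatter preventive maintenance cost curve or a steeper breakdown cost curve both push the lowest total cost towards more preventive maintenance.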

42 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.42 Maintenance cost model 2: an optimum level of maintenance effort. [Figure: costs plotted against amount of preventive maintenance, comparing the actual cost of providing preventive maintenance with the Model 1 cost of providing preventive maintenance.]

43 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.43 Maintenance cost model 2: an optimum level of maintenance effort. [Figure: costs plotted against amount of preventive maintenance, comparing the actual cost of breakdowns with the Model 1 cost of breakdowns.]

44 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.44 Maintenance cost model 2: an optimum level of maintenance effort. [Figure: costs plotted against amount of preventive maintenance, with curves for the cost of providing preventive maintenance, the cost of breakdowns and total cost.]

45 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.45 Notes: In actuality the cost of PM does not increase as steeply as indicated in Model 1. –Model 1 assumes that all maintenance jobs must be carried out by a specialist maintenance team but Model 2 recognizes that operators themselves can carry out simple, in process maintenance. Etc… The cost of breakdown could be higher than indicated in Model 1. –A breakdown may cost more than the cost of repair and the cost of the stoppage itself – a stoppage can take away the stability in the operation.

46 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.46 Run To Breakdown or Preventive Maintenance? Based on the arguments above, the shift is more towards the use of Preventive Maintenance.

47 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.47 6.Failure distributions The shape of the failure probability distribution of a facility can determine if it benefits from preventive maintenance. [Figure: probability of failure plotted against time for Machine A and Machine B, with times x and y marked.]

48 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.48 Notes: Machine A –The probability that it will break down before time x is relatively low. –It has high probability of breaking down between times x and y. –If preventive maintenance was carried out just before point x, the chances of breakdown can be reduced.

49 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.49 Notes: Machine B –It has a relatively high probability of breaking down at any time. –Its failure probability increases gradually as it passes through time x. –Carrying out preventive maintenance at point x or any other cannot dramatically reduce the probability of failure.

50 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.50 Total Productive Maintenance (TPM) Approach Total productive maintenance (TPM) is defined as: …the productive maintenance carried out by all employees through small group activities… Where productive maintenance is: …maintenance management which recognizes the importance of reliability, maintenance and economic efficiency in plant design…

51 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.51 The five goals of TPM: 1.Improve equipment effectiveness: Examine how the facilities contribute to the effectiveness of the operation by examining all the losses which occur. 2.Achieve autonomous maintenance: Allow the people who operate the equipment to take responsibility for some maintenance tasks, and maintenance staff to take responsibility for improving maintenance performance.

52 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.52 There are three levels at which maintenance staff can take responsibility for process reliability: Repair level – staff carry out instructions but do not predict the future, they simply react to problems. Prevention level – staff can predict the future by foreseeing problems, and take corrective action. Improvement level – staff can predict the future by foreseeing problems, they not only take corrective action but also propose improvements to prevent recurrence.

53 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.53 Example: Suppose the screws on a machine become loose. Each week it jams up and is passed to maintenance to be fixed. A ‘repair level’ maintenance engineer will simply repair it and hand it back to production. A ‘prevention level’ maintenance engineer will spot the weekly pattern to the problem and tighten the screws in advance of their loosening. An ‘improvement-level’ maintenance engineer will recognize that there is a design problem and modify the machine so that the problem cannot recur.

54 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.54 The five goals of TPM (cont): 3.Plan maintenance: To have a fully worked out approach to all maintenance activities. This includes: –the level of preventive maintenance required for each piece of equipment –the standard for condition-based maintenance –the respective responsibilities of operating staff and maintenance staff. See Slide 19.55. 4.Train all staff in relevant maintenance skills: TPM emphasises appropriate and continuous training to ensure staff have the skills to carry out their roles.

55 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.55 The roles and responsibilities of operating staff and maintenance staff in TPM
Roles. Maintenance staff: to develop preventive actions and breakdown services. Operating staff: to take on ownership of facilities and care of facilities.
Responsibilities. Maintenance staff: train operators; devise maintenance practice; problem-solving; assess operating practice. Operating staff: correct operation; routine preventive maintenance; routine condition-based maintenance; problem detection.

56 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.56 The five goals of TPM (cont): 5.Achieve early equipment management: This goal is directed at avoiding maintenance altogether by ‘maintenance prevention’ (MP). MP involves considering root causes of failure and maintainability of equipment during the design stage, manufacture, installation and its commissioning.

57 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.57 Reliability Centred Maintenance (RCM) Approach 1.TPM tends to recommend preventive maintenance even when it is not appropriate. 2.RCM uses the pattern of failure for each type of failure mode to dictate the approach to maintenance. 3.The approach of RCM is sometimes summarized as “If we cannot stop it from happening, we had better stop it from mattering”: efforts need to be directed at reducing the impact of the failure.

58 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.58 Example: Take the process illustrated in Slide 19.59. This is a simple shredding process which prepares vegetables prior to freezing. The part of the process which requires the most maintenance attention is the cutter sub-assembly. However, the cutters have several modes of failure: 1)They require changing because they have worn out through usage. 2)They have been damaged by small stones entering the process. 3)They have shaken loose because they were not fitted correctly.

59 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.59 One part in one process can have several different failure modes, each of which requires a different approach. [Figure: shredding process with cutters, showing failures plotted against time for three cutter failure patterns.]
Cutter ‘wear out’ failure pattern. Solution: preventive maintenance before end of useful life.
Cutter ‘damage’ failure pattern. Solution: prevent damage, fix stone screen.
Cutter ‘shake loose’ failure pattern. Solution: ensure correct fitting through training.

60 © Nigel Slack, Stuart Chambers & Robert Johnston, 2004 Operations Management, 4E: Chapter 19 19.60 The End