PEP-II Reliability and Uptime Roger Erickson 13 October 2003 With thanks to C.W. Allen, W. Colocho, P. Schuh, M. Stanek, and the Operations staff members.


1 PEP-II Reliability and Uptime Roger Erickson 13 October 2003 With thanks to C.W. Allen, W. Colocho, P. Schuh, M. Stanek, and the Operations staff members who collected the data.

2 Excludes “long” downtimes and holiday shut-downs.

3 Statistics: Causes of Unscheduled Down Time
– Three PEP-II running periods considered: January 2000 through June 2003.
– 22,936 total scheduled operating hours.
– 2994 hours unscheduled down time.
– 5469 reported malfunctions (“events”).
– 1317 events directly tied to lost hours.
We can sort the data by area of the machine (HER, linac, etc.), by system categories (RF, vacuum, etc.), by date, and by details of resolution.

4 Accelerator Performance Statistics
Definitions:
– Revealed failures: malfunctions resulting in lost beam time. Also called “events”.
– Unscheduled down time: hours lost from scheduled program due to malfunctions.
– Mean Time to Fail: MTTF = (Scheduled beam time) / (Events)
– Mean Time to Repair: MTTR = (Unscheduled down time) / (Events)
– Availability = 1 - (Unscheduled down time) / (Scheduled beam time)
NOTE: PEP-II aborts are not counted as downtime unless the event is reported; i.e., unless we stop to fix something and make a database entry.
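As a rough illustration of these definitions, the sketch below applies them to the combined totals quoted on slide 3 (22,936 scheduled hours, 2994 hours of unscheduled down time, 1317 events tied to lost hours). The function and variable names are illustrative and do not come from the presentation.

```python
def reliability_stats(scheduled_hours, downtime_hours, events):
    """Return (MTTF, MTTR, availability) using the slide 4 definitions."""
    mttf = scheduled_hours / events      # scheduled beam time per event, in hours
    mttr = downtime_hours / events       # unscheduled down time per event, in hours
    availability = 1 - downtime_hours / scheduled_hours
    return mttf, mttr, availability

# Combined totals for January 2000 through June 2003 (slide 3).
mttf, mttr, availability = reliability_stats(22936, 2994, 1317)
print(f"MTTF = {mttf:.2f} h, MTTR = {mttr:.2f} h, availability = {availability:.1%}")
```

With these inputs the combined figures come out at roughly 17.4 hours MTTF, 2.3 hours MTTR, and 87% availability. Note that MTTF as defined here divides scheduled time, not delivered beam time, by the event count.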


8 PEP-II Run Totals
Run 1: 1/12/00 – 10/31/00
Run 2: 2/4/01 – 6/30/02
Run 3: 11/15/02 – 6/30/03
Long annual downtimes and holiday shut-downs are not included.

9 Hardware Availability by Run
         MTTF (hours)   MTTR (hours)   Availability (percent)
Run 1    18.57          2.39           87.1
Run 2    17.88          2.02           88.7
Run 3    15.28          2.63           82.8
MTTF has been getting shorter (worse) each run. MTTR improved from Run 1 to Run 2, but got worse during Run 3.
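Because MTTF and MTTR share the same event count in their denominators, the slide 4 definitions imply Availability = 1 - MTTR/MTTF. The short check below, a sketch with illustrative names, recomputes the availability column from the MTTF and MTTR values on this slide.

```python
# Per-run values from slide 9: (MTTF hours, MTTR hours, quoted availability percent).
runs = {
    "Run 1": (18.57, 2.39, 87.1),
    "Run 2": (17.88, 2.02, 88.7),
    "Run 3": (15.28, 2.63, 82.8),
}

def availability_percent(mttf, mttr):
    """Availability implied by the definitions: 1 - MTTR/MTTF, as a percentage."""
    return 100 * (1 - mttr / mttf)

computed = {
    run: round(availability_percent(mttf, mttr), 1)
    for run, (mttf, mttr, _) in runs.items()
}

for run, (_, _, quoted) in runs.items():
    print(f"{run}: computed {computed[run]}%, quoted {quoted}%")
```

The recomputed values agree with the quoted column to one decimal place, which suggests the three columns were derived from the same event and downtime totals.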

10 Unscheduled Downtime by Major System
System         Run 1   Run 2   Run 3
Injection      5.6     5.0     4.2
PEP Rings      6.8     4.6     10.7
BaBar          0.3     1.2     0.8
PG&E           0.2     0.5     1.5
Availability   87.1    88.7    82.8
Total          100.0   100.0   100.0
Unscheduled down time (percentage), sorted by responsible system.


12 MTTR: PEP-II Rings
                  MTTR (hours)              Run 1             Run 2             Run 3
System            Run 1   Run 2   Run 3     Events  DT hrs    Events  DT hrs    Events  DT hrs
Power Supplies    2.37    1.52    1.50      61      144.7     97      147       83      124.9
Magnets           3.05    2.50    4.80      2       6.1       3       7.5       3       14.4
RF                2.47    1.80    2.71      55      135.8     58      104.2     47      127.6
Vacuum            10.58   3.82    28.68     5       52.9      26      99.4      6       172.1
Utilities         3.29    1.93    1.88      14      46        28      53.9      12      22.6
Controls          1.39    1.45    1.69      42      58.5      63      91.3      32      54.0
Safety            0.70                      1       0.7
Other             2.85    1.69    4.13      2       5.7       8       13.5      6       24.8
Totals                                      182     450.4     283     516.8     189     540.4
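The MTTR column for each system is its down-time hours divided by its event count. The sketch below recomputes the Run 3 values from the events and DT hours on this slide; the dictionary layout and names are illustrative only.

```python
# Run 3 PEP-II ring data from slide 12: system -> (events, down-time hours).
run3 = {
    "Power Supplies": (83, 124.9),
    "Magnets": (3, 14.4),
    "RF": (47, 127.6),
    "Vacuum": (6, 172.1),
    "Utilities": (12, 22.6),
    "Controls": (32, 54.0),
    "Other": (6, 24.8),
}

# MTTR per system = down-time hours / number of events.
mttr_by_system = {name: hours / events for name, (events, hours) in run3.items()}

total_events = sum(events for events, _ in run3.values())
total_hours = sum(hours for _, hours in run3.values())

# List systems from slowest to fastest repair.
for name, mttr in sorted(mttr_by_system.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {mttr:6.2f} h")
print(f"{'All systems':15s} {total_hours / total_events:6.2f} h")
```

Sorting by MTTR makes the Run 3 vacuum figure stand out: six events account for 172.1 of the 540.4 ring down-time hours.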

13 Time Required for Repairs
Beam time lost         Events   Percent of total events   Hours down   % of total DT
> 0 to 1.0 hours       641      48.7%                     383.4        12.8%
> 1.0 to 2.0 hours     286      21.7%                     463.6        15.5%
> 2.0 to 4.0 hours     241      18.3%                     723.0        24.1%
> 4.0 to 8.0 hours     85       6.5%                      485.8        16.2%
> 8.0 to 24.0 hours    56       4.3%                      686.0        22.9%
> 24.0 hours           8        0.6%                      252.7        8.4%
Total                  1317     100.0%                    2994.5       100.0%
Combined data set from all three runs.
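One way to read this distribution is to ask what share of events, and what share of down time, lies above a given repair duration. The sketch below does this for the combined data set on this slide; the bin representation and function name are illustrative.

```python
# Combined data from all three runs (slide 13):
# (lower bound of bin in hours, events, hours down).
bins = [
    (0.0, 641, 383.4),
    (1.0, 286, 463.6),
    (2.0, 241, 723.0),
    (4.0, 85, 485.8),
    (8.0, 56, 686.0),
    (24.0, 8, 252.7),
]

def share_above(threshold_hours):
    """Fraction of events and of down time in bins starting at or above the threshold."""
    total_events = sum(e for _, e, _ in bins)
    total_hours = sum(h for _, _, h in bins)
    events = sum(e for lo, e, _ in bins if lo >= threshold_hours)
    hours = sum(h for lo, _, h in bins if lo >= threshold_hours)
    return events / total_events, hours / total_hours

event_share, hour_share = share_above(2.0)
print(f"Repairs > 2 h: {event_share:.1%} of events, {hour_share:.1%} of down time")
```

For the whole machine over all three runs, repairs longer than 2 hours are about 30% of events but about 72% of down time. These figures cover all systems and runs, so they are not directly comparable to the Run 3 PEP ring numbers on the next slide.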

14 PEP Rings Events Requiring > 2 hours to Repair
Run 3 data: 33% of PEP ring events require > 2 hours to repair. These account for 81% of PEP ring down time.

15 Problems Requiring > 24 hours to Fix
January 2000 – June 2003:
– 5 vacuum chamber failures in PEP rings.
  – Some known vulnerabilities were already receiving attention.
  – Vacuum task force is studying options for upgrading some chambers.
– 2 site-wide electrical power outages.
  – These were outside SLAC’s control.
– SLTR quadrupoles overheated when cooling water pump stopped, but power remained on.

16 Recent Problems Requiring > 24 hours to Fix
– August 20, 2003: VVS transformer failure in linac.
  – Failure occurred during E158; no impact on PEP.
  – Two days for full recovery.
  – Failure was in the only dry-type transformer among 16 VVS’s.
  – Oil-filled, fixed-ratio replacement options being investigated.
– September 12, 2003: Site-wide power failure when tree grew too close to 230 kV line.
  – Time lost to PEP program >47 hours.
  – Tree trimming had not been done on established schedule.
  – SLAC now has new contract with tree-trimmer company, with option to renew for five years.

17 Underlying Problems Sometimes Cross Technical and Jurisdictional Boundaries
– Seasonal high ambient temperatures cause drift, jitter, timing shifts, spurious trips, and sometimes component failures in power supplies and sensitive electronics.
– Plan to air-condition the electronics alcove at Linac Sector 0, which houses the master oscillator and electronics critical to accelerator timing. A contract has been awarded.
– Several PEP support buildings have temperature control problems on hot days. More needs to be done to identify cost-effective improvements.
– An example of a problem not easily identified by counting malfunction reports.

18 Injection and Tuning
Normal top-off:
– Typically 4 to 5 minutes to fill at intervals of 40 to 50 min.
– Approx. 10% of scheduled run time.
Why is 21% spent injecting and tuning?
– Beam aborts require fill from scratch; typically 15 to 25 minutes each time.
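The 10% top-off figure follows from the fill time divided by the fill interval. The sketch below works through the quoted ranges, assuming (this is an interpretation, not stated on the slide) that the 40 to 50 minute interval is the full cycle from the start of one fill to the start of the next.

```python
def topoff_fraction(fill_minutes, interval_minutes):
    """Fraction of run time spent filling, assuming one fill per interval."""
    return fill_minutes / interval_minutes

# Midpoints of the quoted ranges: 4.5 minutes of filling every 45 minutes.
typical = topoff_fraction(4.5, 45)

# Extremes of the quoted ranges.
low = topoff_fraction(4, 50)    # shortest fill, longest interval
high = topoff_fraction(5, 40)   # longest fill, shortest interval

print(f"typical {typical:.1%}, range {low:.1%} to {high:.1%}")
```

The midpoint gives 10%, matching the slide, with the quoted ranges spanning 8% to 12.5%.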

19 Beware of double counting: An abort in one ring usually leads to an abort in the other.

20 HER RF Aborts
Aborts/day by station:
Station   Run 2   Run 3
12-1      0.33    1.1
12-3      0.50    0.34
8-1       0.22    0.57
8-3       0.50    0.68
8-5       0.51    0.66
12-6              1.65*
Total     2.1     5.0
– All stations were worse in 2003, except 12-3.
* 12-6 fault accounting only available since 10-May-2003.

21 LER RF Aborts
Run 3 rates by station:
– 4-3: 0.88 aborts/day*
– 4-4: 0.55 (was 0.56 in 2002)
– 4-5: 0.55 (was 0.53 in 2002)
Total = 2 aborts per day
* 4-3 fault accounting only available since 10-May-2003.

22 BaBar Radiation Aborts
3-year trend, based on data latched by accelerator control system:
– 2000: 5.6 aborts/day
– 2001: 4.1
– 2002: 3.6
– 2002/3: 2.8

23 Injection and Tuning Summary
Percentages of scheduled operating hours:
– Normal top-offs: 10%
– Fill from scratch following:
  – RF aborts: 6.3%
  – BaBar radiation aborts: 3.5%
– Approximate total: 20%
Trickle charging could have significant beneficial impact!
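The summary percentages can be cross-checked in two ways. The sketch below sums the three contributions and also estimates the cost of refilling after aborts as aborts per day times minutes per refill, divided by minutes per day. The refill formula and the assumption of round-the-clock scheduled operation (1440 minutes per day) are illustrative assumptions, not taken from the slides; the abort rate (2.8 per day, slide 22) and refill time (15 to 25 minutes, slide 18) are the quoted values.

```python
MINUTES_PER_DAY = 24 * 60  # assumes operation is scheduled around the clock

# Contributions quoted on slide 23, in percent of scheduled operating hours.
contributions = {
    "Normal top-offs": 10.0,
    "RF aborts": 6.3,
    "BaBar radiation aborts": 3.5,
}
total_percent = sum(contributions.values())

def refill_fraction(aborts_per_day, refill_minutes):
    """Hypothetical estimate: fraction of a day spent refilling after aborts."""
    return aborts_per_day * refill_minutes / MINUTES_PER_DAY

# BaBar radiation aborts at the 2002/3 rate, with the quoted refill-time range.
babar_low = refill_fraction(2.8, 15)
babar_high = refill_fraction(2.8, 25)

print(f"sum of contributions: {total_percent:.1f}%")
print(f"BaBar refill estimate: {babar_low:.1%} to {babar_high:.1%}")
```

The contributions sum to 19.8%, consistent with the approximate total of 20%, and the quoted 3.5% for BaBar radiation aborts falls inside the estimated range of roughly 2.9% to 4.9%.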

24 Scheduled Off Time
– No routine scheduled maintenance days.
– Repair Opportunity Days (“RODs”) are launched when needed for show-stoppers or upgrade projects (typically 1/month).
– As many ROD and SML jobs as possible are completed during program interruption (typically 50 to 100 identified jobs).

25 Personnel Protection System (PPS) Testing
– Formerly required approx. 3 months of beam-off, most of which was folded into long downtimes, but “verifications” were required at 6-month intervals.
– Net impact on PEP program depended on interval between long downtimes. Typically about 2 weeks/year.
– New policies and procedures have reduced testing to about 3 weeks once each year to coincide with long downtimes, plus operator interlock checks.

26 Opportunities for Further PPS Testing Improvements
– Add switches and indicators to further decouple zones/subsections/systems for testing purposes.
– Further streamline test procedures (much progress made last year).
– Train/authorize more staff members, so that testing can be done 24 hours/day when opportunities arise.
– Additional uptime to be gained? Possibly 1 week/year, depending on long downtime schedule and “opportunistic” down days.
– Long-range proposal: Replace linac and BSY PPS with modern system to facilitate testing and minimize downtime for diagnosing problems.

27 How to Increase PEP-II Up Time: Challenges to Ourselves
– Allocate resources among hardware projects to achieve optimal improvement in MTTF.
– Identify common-mode or infrastructure projects that will improve overall uptime and stability.
– Find ways to reduce frequency of aborts.
– Minimize scheduled off time through policy and procedure changes and aggressive scheduling.
– Reduce MTTR with improved procedures, diagnostic tools, and organizational efficiency.

