Operations/Failure Analysis
- Status of Equipment/Production Readiness
- Plans in Case of Part/Systems Failure for Each Stand Type
Inventory
- The US testing group has finished a complete inventory of all components used in hybrid/module/rod testing.
  - The inventory is available on the UCSB CMS website: http://hep.ucsb.edu/cms/cms.html
- We have identified potential failure modes for our stands that we had not previously considered.
  - DAQ equipment, cables, Vienna box interlock spares, chillers, CAEN power supplies, etc.
- We have contacted all the sources of these components and hope to receive all the spares we need before production fully starts.
  - Some can be purchased commercially.
  - Many come from within CMS.
DAQ Components
- We requested spare DAQ components roughly 5 months ago and have received many parts. However, we still need:
  - 2 TSC
  - 1 FED (replacement for broken UCSB board)
  - 3 TPO
  - 2 CCU25
  - 4 VUTRI
  - 6 PAACB
  - 11 hybrid-to-utri boards
- All requests have been acknowledged and accepted.
- No ETA has been given for any of the parts.
- What are they for, and what are our vulnerabilities if we do not get them?
  - TSC: master DAQ controller. If one of our units fails, we cannot run either a Vienna box, a single rod setup, or a multirod setup.
  - FED: data buffer. This is the spare for us. If a FED fails here, a stand would be down until the FNAL spare is mailed.
  - TPO: extends the number of command lines to the TSC; needed to MUX. Without one, UCR cannot run more than 4 modules at a time. If one fails, we cannot use the multiplexer for that stand, which severely cripples one stand (Vienna box, single rod, or multirod).
  - CCU25 + VUTRI + PAACB: needed to fully populate the UCR box.
  - Hybrid-to-utri adapters: spares for failures every 400 modules.
Cables
- We have 48 different cable types in the systems.
  - For 29 of these types no spares were planned.
  - 4 more have inadequate numbers of spares.
  - Most can cause complete system failures.
- We have requested spares for all cables that we cannot make ourselves.
  - Duccio, Wim, and Torino have already responded.
- We are making a number of the cables ourselves.
  - Ribbon cables for the TRHX box for the Module LT stands
  - TPO-to-VUTRI control cables
- How many of the types do Duccio and Wim make for us? How crucial are they? Do we have time estimates?
  - Duccio makes 10 cables. A failure of 7 of these cable types would leave the multi-rod stand non-functional. They will start making the cables in-house soon.
  - Wim makes 5 cables. We cannot use the UCR Vienna box without them. They should arrive soon.
  - Torino makes 1 cable. It should arrive soon.
  - We can make 10 cables.
- Which requests were not answered, and what are the vulnerabilities associated with them?
  - I did not hear about the DAQ cables.
  - Without spares, a cable failure is equivalent to a component failure. Most, if not all, would make the stand non-operational.
  - 5 have no spares.
  - 2 have inadequate spares.
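The cable counts quoted above can be cross-checked with a short tally. This is only a sketch: it assumes that the per-supplier figures (Duccio, Wim, Torino, in-house) count cable types rather than individual cables, and the variable and function names are illustrative, not taken from the inventory.

```python
# Counts quoted on the Cables slide.
TOTAL_TYPES = 48
NO_SPARES_PLANNED = 29
INADEQUATE_SPARES = 4

# Cable types now being supplied, by source.
# Assumption: each figure counts cable types, not individual cables.
covered = {"Duccio": 10, "Wim": 5, "Torino": 1, "in-house": 10}

# Types still uncovered (requests not yet answered).
still_no_spares = 5
still_inadequate = 2


def tally():
    """Return (types at risk, types covered, types outstanding)."""
    at_risk = NO_SPARES_PLANNED + INADEQUATE_SPARES
    covered_total = sum(covered.values())
    outstanding = still_no_spares + still_inadequate
    return at_risk, covered_total, outstanding


at_risk, covered_total, outstanding = tally()
print(f"at risk: {at_risk} of {TOTAL_TYPES} types")
print(f"covered: {covered_total}, outstanding: {outstanding}")
print("consistent:", covered_total + outstanding == at_risk)
```

Under that reading the numbers close: the covered and outstanding types together account for every type that lacked adequate spares.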
CAEN Power Supplies
- With a new crate recently rented from CERN, we have the exact number we need.
- CERN has recently ordered more CAEN crates (SYS127) and controller cards (A128HS and A1303).
  - We have requested one set as a spare for the US.
- When do we expect these to come in?
  - We expect the new crate to arrive some time this month.
  - We expect the crate + controllers we are renting from CERN to arrive in late January to early February.
Assorted Equipment Issues (I)
- UCSB is in the process of ordering/manufacturing a spare of each component used in the 4 hybrid test box.
  - One set of spares for FNAL/UCSB/Mexico City
- FNAL and UCSB have located a spare NIM and VME crate.
- Spare computers with hybrid/module software will be assembled and tested for UCSB/FNAL.
- Spare computers with module LT/single rod/multi-rod/interlock software will be assembled and tested for UCSB/FNAL.
- VME crates are needed for the Opto-Electrical Converters (OEC). If a crate failed and we did not have a spare, no rods could be tested.
- NIM crates are used for the pulsers for the 4 hybrid test stand and for the electrometers for the Vienna box and single rod stands. Without a spare, we would lose the ability to measure HV current in one of the stands.
Assorted Equipment Issues (II)
- Torino has agreed to supply spare Vienna box interlock equipment for each site.
- Vienna has agreed to supply spare Vienna box sensors (RH% and T) to both FNAL and UCSB.
- We have ordered one spare chiller for the hybrid thermal cyclers and another for the Vienna boxes.
Test Stand Failure Analysis
- We have a draft of the US testing operations/failure analysis document available at http://hep.ucsb.edu/cms/cms.html (see “Testing Operations and Maintenance” under “Documents”).
- The process was really useful; it got us to think of worst-case failure scenarios and how we could operate under such conditions.
4 Hybrid Thermal Cycler
- Large over-capacity in the US CMS group
  - We can test ~90 hybrids per day, with an expected peak rate of 45 per day if one assumes 400 per week delivered to us.
- If a stand has a major failure, move production to other sites.
  - This can have only a very short-term effect on production, i.e. the shipping time from other sites to the gantry site.
- The stands have three primary weaknesses: software, the Peltier element, and the NESLAB chiller.
  - A backup computer and spare Peltier elements at each site reduce these risks.
  - We have ordered one spare chiller, which can be shipped in case of a chiller failure.
- UCSB is obtaining spares of all other components.
  - They can be shipped overnight in case of failure at FNAL or Mexico.
ARCS module testing
- We are in the best shape in this area.
  - We can test ~17 modules/stand/day, with an expected peak rate of production of 30/day/site.
  - This rate was made possible by improvements to the ARCS software and automation of data handling.
  - FNAL has 4 stands, UCSB has 3 stands, UCR has 1 stand.
  - Both sites have a complete live spare.
  - We obtained spare cables to reduce stand down-time to a minimum.
  - Repairs from Aachen have only taken 2-4 weeks in the past.
- The biggest problem could be failure of the CMS database.
  - No wire bonding data, sensor data, or hybrid data
  - We would have to find all faults during testing. This may require each part to be tested twice (although with HPK we have seen few faults).
  - We would have to check testing results against known failures after the database is working.
  - Running more than two stands would remove the backlog of parts.
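The ARCS capacity figures above can be worked through as a quick calculation. This is a minimal sketch using only the rates and stand counts quoted on the slide; the function names are illustrative, and it assumes every stand runs at the quoted ~17 modules per day.

```python
import math

RATE_PER_STAND = 17   # modules per stand per day (approximate, from the slide)
PEAK_PER_SITE = 30    # expected peak production, modules per day per site

# Number of ARCS stands at each site, as quoted.
stands = {"FNAL": 4, "UCSB": 3, "UCR": 1}


def site_capacity(site):
    """Daily test capacity if every stand at the site is running."""
    return stands[site] * RATE_PER_STAND


def min_stands_for_peak():
    """Smallest number of stands that keeps up with peak production."""
    return math.ceil(PEAK_PER_SITE / RATE_PER_STAND)


for site in stands:
    print(site, site_capacity(site), "modules/day")
print("stands needed at peak:", min_stands_for_peak())
```

With these numbers two stands just cover the peak rate, which is why running more than two stands leaves spare capacity to work off a backlog, for example if parts need to be tested twice.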
Module LT systems
- Since the expected production rate exceeds testing capacity (30 vs 15 modules a day), any failure would decrease the fraction of sampled modules.
  - The only way to clear a backlog is weekend testing.
  - We are considering a reduction of tests to increase capacity to 20 per day.
- Spares acquired or ordered for almost all components:
  - Power supplies, DAQ components, cables, chillers, interlocks, etc.
- We also modified the Vienna box to be more stable/long-lived.
  - Brass plates and extender connectors
  - What do extenders do? They allow minimal connection/reconnection to the backplane.
- If the stand is completely non-functional, TOB production can still continue.
  - Produce what can be assembled/tested on rods.
  - This would reduce production capacity to ~20 modules a day unless we reduce the Rod LT period from 3 to 2 days.
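The sampling and capacity figures above can be illustrated with a short calculation. This is a sketch under stated assumptions: the sampled fraction is taken to be simply test capacity divided by production rate, and throughput is assumed to scale inversely with the length of the LT cycle. Neither rule is stated on the slide, and the function names are illustrative.

```python
PRODUCTION_RATE = 30   # modules per day, from the slide


def sampled_fraction(capacity, production=PRODUCTION_RATE):
    """Fraction of produced modules that can go through the LT test."""
    return min(1.0, capacity / production)


def scaled_capacity(capacity, old_period_days, new_period_days):
    """Capacity after changing the LT cycle length.

    Assumption: throughput is inversely proportional to the cycle length.
    """
    return capacity * old_period_days / new_period_days


print(f"current capacity (15/day): {sampled_fraction(15):.0%} sampled")
print(f"reduced tests (20/day):    {sampled_fraction(20):.0%} sampled")
print("rod LT cut from 3 to 2 days:", scaled_capacity(20, 3, 2), "modules/day")
```

Under the inverse-scaling assumption, shortening the Rod LT period from 3 to 2 days would bring the ~20 modules a day fallback back up to the expected production rate of 30 a day.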
Single Rod System
- Test systems have over-capacity.
  - 2-4 rods assembled a day with ~8 rod test capacity/stand/day
- Same issues with DAQ equipment/CAEN HV as module LT
  - Ordered and received 6 extra MUX
  - Cables ordered from Duccio: electrical and optical
- Two pieces of equipment with no spares in the foreseeable future: OEC and Delphi LV power supply
  - In both cases, we can take the OEC or PS from the multi-rod stand in an emergency (with a loss of multi-rod capacity of 7% and 12%).
  - How can we get them repaired? Need spares!
- If the single rod stand fails completely, production can still continue at a slower rate.
  - Test rods as they are loaded into the multi-rod stand.
  - This adds 1 day to the LT test cycle. Compensate with shorter LT operation?
  - We would have to reduce the rod assembly rate to match the testing rate.
  - UCSB could switch to mostly TEC production.
- MUX allows recabling to be avoided.
Multi-rod System
- Most complex system with the least amount of experience
- Same potential problems as the module LT or single rod systems, plus:
  - Chiller
  - Interlocks
  - Freezer infrastructure
- UR have thought about different operational scenarios for these components; these will have to be revisited after accumulation of more experience.
  - Spares in hand of all components that the company believes could likely fail
  - A plan for finding and removing leaks in the cooling system is needed.
  - 1 or 2 spare C6F14 loads ordered for both sites
Multi-rod System (2)
- Interlocks
  - Ordered spares of sensors, etc. that can be easily replaced
  - If interlock hardware fails:
    - The company has a 48 hour express repair plan.
    - Power supply interlocks would be used.
    - Control of the system is done by hand until the repair is made.
- In case of complete system failure, rod assembly at the site can still proceed at a lower rate (undesirable).
  - All rods will have to be tested with the single rod stand.
  - The single rod stand would have to reproduce as many of the multi-rod stand’s tests as possible until the multi-rod stand is available.
  - The only way to remove the backlog of assembled rods would be to reduce the LT cycle time once the stand is repaired.