ALICE DCS – Technical Challenges
Peter Chochula, for the ALICE DCS
DCS review, Geneva, April 3, 2006

Outline
– ALICE Front-End and Readout Electronics
– DCS Performance Studies
– ALICE DCS Computing

An ALICE-specific technical challenge – the Front-End and Readout Electronics (FERO)

ALICE Front-End and Readout Electronics (FERO)
Several architectures for FERO access are implemented in the ALICE sub-detectors:
– different buses (JTAG, CAN, DDL, Ethernet, I2C, custom…) and different operation modes
– the DAQ is in charge of controlling the architectures connected via the optical link (DDL)
– the DCS is in charge of controlling the rest
For some detectors both systems are involved.

FERO Access Architectures (diagram): four access classes are distinguished, covering DDL-based and non-DDL FERO
– Class A: control by DAQ, monitoring by DCS
– Class B: control by DAQ and/or DCS, monitoring by DCS
– Class C: control by DCS, monitoring by DCS
– Class D: control by DCS, monitoring by DCS
The monitoring and control tasks share the same bus and cannot operate in parallel.

The variety of FERO access mechanisms almost excludes the implementation of common solutions. The Front-End Device (FED) provides an API between PVSSII and the custom architectures:
– standard in ALICE
– API definition (commands, services, operational guidelines) available
– based on DIM
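To make the "based on DIM" point concrete, here is a minimal sketch of what a DIM-based server behind such an API could look like, using the standard DIM C++ server interface (dis.hxx). The service and command names and the hardware call are hypothetical illustrations, not the actual ALICE FED API definition.

    // Minimal sketch of a DIM-based FED-style server (assumptions noted above).
    #include <dis.hxx>
    #include <unistd.h>
    #include <cstdio>

    // Stub standing in for the hardware access layer (illustration only).
    static float readTemperatureFromHardware() { return 25.0f; }

    // Command channel on which the InterCom layer / PVSS side sends requests.
    class ConfigureCommand : public DimCommand {
    public:
        ConfigureCommand() : DimCommand("FED_EXAMPLE/Configure", "C") {}
        void commandHandler() override {
            const char* request = getString();            // payload of the received command
            std::printf("Configure request: %s\n", request);
            // ... here the hardware access layer would apply the configuration ...
        }
    };

    int main() {
        float temperature = 0.0f;
        DimService tempService("FED_EXAMPLE/Temperature", temperature);  // published service
        ConfigureCommand configureCmd;

        DimServer::start("FED_EXAMPLE");                  // register with the DIM name server

        for (;;) {
            temperature = readTemperatureFromHardware();  // poll the (stubbed) hardware
            tempService.updateService();                  // push the new value to subscribers
            sleep(1);
        }
        return 0;
    }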

Generic Architecture of the FED Server (diagram)
– Supervisory layer: PVSS with a DIM client, connected through the FED API (commands and services)
– Control layer: the InterCom layer, containing the detector control and monitoring code (agents CA1…CAi and MA1…MAi), and the FED server (a DIM server) providing access to the FED via the FED API
– Field layer: the hardware access layer, containing the device drivers, and the hardware itself

Example of a "simple" Hardware Access Layer (SPD) (diagram)
The generic FED server / InterCom layer structure is kept; the hardware access layer is built from the VISA libraries and a PCI-MXI link to a VME master (MXI) in the VME crate housing the routers, which provide the JTAG access to the front-end.

Yet Another Example of a Hardware Access Layer (TPC/TRD/PHOS) (diagram)
The generic FED server / InterCom layer structure is kept; the FED server contains a FEE client which communicates, through the FEE API, with FeeServers (each including the Control Engine, CE) running on the DCS boards and accessing the RCU/ROB electronics.

FERO Configuration (diagram)
Two kinds of configuration data flow through the chain (PVSS → FED API → InterCom → hardware access → FED/FERO):
– FED server configuration: configuration of the FED itself – alerts, monitoring parameters, services, etc. (needed for the DCS)
– FERO configuration: configuration of the electronics

What is stored in the FERO configuration?
The FERO configuration contains all settings needed for FERO operation, such as:
– DAC settings
– thresholds
– mask matrices …
Sometimes the configuration contains code for embedded processors:
– this code might be compiled on the fly by dedicated software (to avoid the repetition of huge data blocks)
The expected data size differs from detector to detector and ranges from a few bytes up to ~100 MB:
– since the data might be compiled on the fly, the amount of data written to the FERO might be considerably bigger than the amount of data read from the database
Some parameters written to the chips are not available for DCS monitoring:
– this data cannot easily be provided to the offline
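For illustration only, a sketch of what one such configuration record could contain (DACs, a threshold, a mask matrix, optional embedded code); the field names and sizes below are assumptions, not the ALICE data format.

    // Hypothetical layout of a FERO configuration record (illustration only).
    #include <array>
    #include <cstdint>
    #include <vector>

    struct FeroChipConfig {
        uint16_t chipId = 0;                     // addressed front-end chip
        std::array<uint8_t, 8> dacSettings{};    // assumed: 8 on-chip DACs
        uint16_t threshold = 0;                  // global discriminator threshold
        std::array<uint32_t, 256> maskMatrix{};  // assumed: 256 rows x 32 columns, 1 bit per channel
    };

    struct FeroConfigurationRecord {
        uint32_t version = 0;                    // configuration version tag
        std::vector<FeroChipConfig> chips;       // one entry per chip
        std::vector<uint8_t> embeddedCode;       // optional code for embedded processors,
                                                 // possibly compiled on the fly before download
    };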

Assembling a Configuration Record (diagram)
– TRAP assembly source (.ASM), compiled with ASM_MIMD, e.g.:
    add r13 c3 r13
    shl 1 r11 r11
    jmp cc_carry return1
    mov bitnum work
– configuration script source (.TCS), compiled with TCC, e.g.:
    const dut = 3
    reset dut
    write dut, NMOD, 0x1C
    expect dut, NMOD, 0x1C
    write NICLK, 1
– the compiled output (CODEM and DAT .C files) is assembled into the configuration record and downloaded via the SCSN master to the TRAPs

Configuration Data Flow (diagram)
– PVSS (FED client) communicates with the FED server / InterCom layer through commands and services
– the configuration data for the InterCom layer (ICL) is loaded from a file or from the configuration database
– a command coder broadcasts commands to the FeeServers (each including the Control Engine, CE); acknowledgements and services flow back
– the indicated data volumes grow from ~MB at the top of the chain to ~GBs towards the front-end
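A minimal sketch of the client side of this broadcast-command / acknowledge pattern, using the plain DIM C++ client API (dic.hxx); the service and command names and the command text are hypothetical, not the real FED/FEE naming.

    // Client-side sketch: send a command, listen for acknowledgements (assumptions noted above).
    #include <dic.hxx>
    #include <unistd.h>
    #include <cstdio>

    // Subscriber to an acknowledgement service published by a FeeServer.
    class AckInfo : public DimInfo {
    public:
        AckInfo() : DimInfo("FEE_EXAMPLE/Acknowledge", -1) {}  // -1 = value shown when no link
        void infoHandler() override {
            std::printf("acknowledge received: %d\n", getInt());
        }
    };

    int main() {
        AckInfo ack;                                   // acknowledgements arrive asynchronously

        char cmd[] = "CONFIGURE pedestal_run";         // hypothetical configuration command
        DimClient::sendCommand("FEE_EXAMPLE/Command", cmd);

        sleep(5);                                      // give the servers time to acknowledge
        return 0;
    }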

Status of FERO Developments
FERO access implementation is advancing well for 4 detectors: SPD, TPC, TRD and PHOS
– SPD: a full slice is being commissioned now (with ACC participation)
– TPC and TRD: the chain from the InterCom layer to the devices has been tested; PVSS integration is in progress and database tests have started
– PHOS: aims for the full chain in June (test beam)
Main worry: manpower is missing in the detector teams. ACC is providing help, but we will soon lose the key person (SK).

Obtaining the Configuration Data
(diagram: the flow involves the calibration procedure in the online systems, the analysis procedure, DAQ data, the DCS archive and the FERO configuration database)
– Calibration data is the result of a complex chain of steps
– Some detectors can execute several calibration procedures
– Offline and all online systems are involved
– We are facing an insufficient information flow between the different experts within the detector groups
– Coordination is difficult; regular workshops of the involved systems have been launched

The FERO and the DCS Network
Some FERO components rely on hardware controlled over Ethernet and installed in the magnetic field:
– the customized Ethernet interfaces require the installation of network switches close to the detector (in the cavern)
– there is no IT support for those switches
– the hardware has been tested, but its long-term stability remains a question

Networks in the experimental area (diagram; cavern and rack-area labels such as laser hut and cooling omitted)
DCS network in UX [power supplies, VME crates, etc.]:
– 248 ports, only in racks
DCS network in UX ["ALICE" switches for DCS boards, RCU]:
– 41 Gbit uplinks
GPN network [e.g. for commissioning/debugging]:
– wireless (enough to cover the whole cavern)
– ~50 ports 'strategically distributed across rack areas'
– exact locations being defined now

The DCS Performance Tests and Related Challenges

DCS Performance Tests
Results of the performance tests presented at the ALICE DCS Review.

Summary of the Test Campaign
Tests covered all aspects of the DCS. The results of several test campaigns were collected and evaluated:
– JCOP tests
– JCOP tests with ALICE contribution
– the ALICE test campaign
– results provided by colleagues from other experiments (special thanks to Clara)
No major problems were discovered; each PVSS system can digest its load.
The distribution of PVSS systems provides a very flexible tool for performance tuning, BUT:
– all systems will meet in a single point – the ORACLE configuration and archive. This is our major performance concern.

Dealing with Detector Performance
The DCS is configured according to the performance needs:
– number of sub-systems per PVSS system
– number of PCs per sub-system
Critical issues:
– switch-on of many channels
– configuration of many channels

Start-up Time
Switch-on of many channels:
– Test: total switch-on of 180 CAEN HV channels: 7 s
– SDD: 520 CAEN HV channels: ~15 s
– TRD: 1080 channels (180 Iseg channels × 6, via DCS boards): 7 + ? s
– TOF: 3600 channels (180 CAEN channels × 20 fan-out): 7 s
NOTE: to be compared with ramping times of minutes!
Configuration (normally done outside physics time):
– Test: configuration of a full CAEN crate (192 channels): 20 s
– SDD: 520 CAEN HV channels: <54 s
– Test: DB retrieval of FEE 150 MB BLOBs: 15–50 s
– SPD: 3 s
– TPC: 10 × 10 kB per DCS board: 25 s
– TRD: 10 × 10 kB per DCS board: 50 s
– if required, Oracle tuning and caching will improve these figures

Performance: Alert Avalanches
Tests have shown that PVSS copes with:
– an alert avalanche of at least … alerts per PVSS system (with ~60 PVSS systems in ALICE, the corresponding total is acceptable)
– a sustained alert rate of ~200 alerts/s per PVSS system (with ~60 PVSS systems in ALICE, the corresponding total rate is acceptable)
– all alerts from a full CAEN crate displayed within 2 s (max. 6 crates on one PVSS system: all alerts displayed within 12 s)
There are many means to limit alert avalanches:
– scattering of PVSS systems
– correct configuration of the alert limits for each channel
– summary alerts and filtering
– verified by ACC at installation time

The DCS Archival (diagram)
PVSSII, with its configuration database and archive, connects to the ECS, to the external services (electricity, ventilation, cooling, gas, magnets, safety, access control, LHC) and to the other online systems (DAQ, TRG, HLT) via DIM/DIP; the FERO and the other devices are configured via version tags and their data is archived.

What needs to be archived?
For offline use we need to archive at least the HV, LV and some (many) front-end parameters. This data will be produced by ~60 machines.
Number of archived parameters:
– LV: 3100 channels
– HV: … channels
– FERO: ~20000 parameters
In addition we need to archive the information provided by services, DCS states, environment, crate status, …

RDB Archival Status and Tests
The final release of the RDB-based archival mechanism has been repeatedly delayed. At present we participate in tests of the latest release:
– the setup procedure still requires expert knowledge and cannot be recommended to the detector teams at this stage (it is a sort of beta version)
– worrying problems on the server side: high CPU utilization (which limits the number of clients that a single server can handle to ~5)
– server overload causes loss of data
– big data volumes are created on the database servers (mainly redo logs), resulting in an unacceptable database size
We consider the RDB archival as still not ready for production.

Implications of the Missing Archival Mechanism
– Some detectors have already started pre-installation and their data needs to be archived
– This data will also be needed in the future: we need a mechanism for transporting the data produced today into the final archive to be implemented tomorrow
– We cannot provide recommendations to the detector teams for the archive setup. The only solution is to use the present file-based archival and re-parametrize the whole project in the future

Implications of Archival Performance
As shown, ALICE has ~40000 channels to be archived:
– the number of corresponding DPEs to be archived is higher
~60 computers will provide the data for archival.
The database server(s) must cope with the situation when all channels change at the same time (e.g. during ramp-up).
If the situation does not improve, we need to plan for 6–12 database servers.

Implications of the Archived Data Size
As we do not know the final data volumes which will be created on the server by the archival mechanism, we cannot refine our specifications:
– for example, the present mechanism creates ~2 GB of data per hour for a client archiving 5000 DPEs/s (tests done with 4 clients, each archiving 5000 DPEs/s). This is about 900% overhead compared to the raw information produced by the machines
The unclear situation concerning the archival complicates the development of the DCS-OFFLINE interface (see later).
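As a rough cross-check of the quoted overhead, a back-of-the-envelope calculation; only the ~2 GB/h and 5000 DPEs/s figures come from the slide, while the ~12 bytes per raw archived change is an assumption.

    // Back-of-the-envelope check of the archive overhead (assumptions noted above).
    #include <cstdio>

    int main() {
        const double storedBytesPerHour = 2.0e9;    // ~2 GB written per hour per client
        const double changesPerSecond   = 5000.0;   // 5000 DPEs/s archived per client
        const double rawBytesPerChange  = 12.0;     // assumed raw record size (timestamp + value)

        const double changesPerHour  = changesPerSecond * 3600.0;
        const double storedPerChange = storedBytesPerHour / changesPerHour;          // ~111 bytes
        const double overheadPercent = (storedPerChange / rawBytesPerChange - 1.0) * 100.0;

        std::printf("stored per change: %.0f bytes, overhead: ~%.0f%%\n",
                    storedPerChange, overheadPercent);  // comparable to the quoted ~900%
        return 0;
    }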

Final Archival Implementation
There is no caching mechanism to cover the period when the connectivity to the DB server is lost:
– implication: we will need to run a local database server at P2, which is not compliant with the IT policy on DB support
For example, the DAQ can run for ~20 hours in standalone mode; the DCS must be able to cover at least this period with fully working archival.

What are the next steps?
We need to set a deadline for the archival solution:
– this date cannot be moved beyond May 31st
If the RDB archival does not qualify for production, the only solution is file-based archival:
– implication: we did not foresee the extra disk space on the DCS computers. If we have to order the extra disks, it must be done now (the extra cost involved is ~9000 CHF; ordering starts now)

Conditions Data
– ALICE offline is not using COOL: the DCS needs to provide an API to its data
– The RDB archive structure has not yet settled down
– The file-based archival uses a proprietary format with no API
– The DCS and Offline teams have developed AMANDA:
  – a PVSS API manager
  – a data exchange protocol (over TCP/IP)
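Since the slides only state that AMANDA exchanges data over TCP/IP, the following is a purely hypothetical sketch of what a request from an offline client could look like; the address, port, request format and datapoint name are invented for illustration and do not reflect the real AMANDA protocol.

    // Hypothetical AMANDA-style request over TCP/IP (illustration only).
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstring>

    int main() {
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock < 0) { perror("socket"); return 1; }

        sockaddr_in server{};
        server.sin_family = AF_INET;
        server.sin_port   = htons(4242);                    // hypothetical server port
        inet_pton(AF_INET, "127.0.0.1", &server.sin_addr);  // hypothetical server address

        if (connect(sock, reinterpret_cast<sockaddr*>(&server), sizeof(server)) < 0) {
            perror("connect");
            return 1;
        }

        // Hypothetical ASCII request: datapoint element name plus time interval.
        const char* request = "GET hv_channel_001.actual.vMon 2006-04-01T00:00 2006-04-02T00:00\n";
        write(sock, request, std::strlen(request));

        char buffer[4096];
        ssize_t n = read(sock, buffer, sizeof(buffer) - 1); // read (part of) the reply
        if (n > 0) {
            buffer[n] = '\0';
            std::printf("reply: %s\n", buffer);
        }
        close(sock);
        return 0;
    }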

The DCS Archival and the Offline (diagram)
The same archival architecture as shown before, extended with AMANDA: the OFFLINE obtains the conditions data from the PVSSII archive through AMANDA.

PVSS Architecture: Implications for AMANDA
The RDB manager is not thread-safe; only one request can be handled at a time:
– AMANDA therefore needs to queue the requests, which causes severe performance limitations
– even in a distributed system, the RDB manager can retrieve data only from its own archive
– if data from a remote system is needed, that system's managers will be involved as well
Implications for AMANDA:
– extra load is added to the PVSS systems
– we will need to run at least one AMANDA per detector
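A minimal sketch of the kind of request serialization this forces on AMANDA, assuming a single worker thread guarding the non-thread-safe backend; the class and function names are illustrative, not the actual AMANDA implementation.

    // Serializing concurrent client requests in front of a non-thread-safe backend.
    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <utility>

    class SerializedArchiveAccess {
    public:
        SerializedArchiveAccess() : worker_([this] { run(); }) {}

        ~SerializedArchiveAccess() {
            { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
            cv_.notify_one();
            worker_.join();
        }

        // Called concurrently by many client connections; requests are queued
        // and executed strictly one at a time by the single worker thread.
        void submit(std::function<void()> request) {
            { std::lock_guard<std::mutex> lk(m_); queue_.push(std::move(request)); }
            cv_.notify_one();
        }

    private:
        void run() {
            for (;;) {
                std::function<void()> job;
                {
                    std::unique_lock<std::mutex> lk(m_);
                    cv_.wait(lk, [this] { return stop_ || !queue_.empty(); });
                    if (stop_ && queue_.empty()) return;
                    job = std::move(queue_.front());
                    queue_.pop();
                }
                job();   // only this thread ever touches the (non-thread-safe) archive access
            }
        }

        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::function<void()>> queue_;
        bool stop_ = false;
        std::thread worker_;   // declared last so the other members exist before it starts
    };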

DCS Conditions
The RDB archival can solve the performance problems which we see in AMANDA:
– data can be retrieved directly from the database, with no need to involve the PVSS API
Developments need some time, but can only be started once the situation is clear.
BUT:
– conditions data is needed now (TPC commissioning, SPD commissioning, upcoming data challenge …)

The DCS Software: Installation and Maintenance

Software Development, Installation and Maintenance
Rules and guidelines have been discussed in the DCS workshops; the procedures need to be tested and refined.
Software installation procedure:
– basic checks done in the lab
– software uploaded to the production network via the application gateway
– configuration, tuning and tests by the DCS team and detector experts
Worry: very often the full tests cannot be performed in advance because the hardware will not be available:
– the associated risk is that detectors rely on software development on the production network
The management of the software installation is a challenge.

Software Versions
The DCS is a big distributed system based on components provided by many parties:
– a policy on software versions is indispensable for successful integration
We are following the FW and PVSS developments.
A list of recommended software versions for the ALICE DCS has been released; all detectors are requested to keep up to date.

Version Freezing
Problem: we need to freeze the versions at some point
– some detectors are already pre-commissioning now and will not be able or willing to touch their software during the tests
– some detectors will be installed in June 06 and will not be fully accessible until April 07
– some detectors finished their DCS developments using the existing FW components; the upgrades are painful and require manpower
We are aware that the new developments are important and we rely on the new features. However, at some point we need to freeze the developments:
– we can make an internal decision in ALICE and compromise on functionality in favor of a working system, but
– we need to ensure that the recommended components will be supported at least during the next year

Example: PVSSII 3.5 – Concerns and Worries
3.5 should be ready by the end of June. Compatibility with older releases should be assured by a gateway functionality between 3.0 and 3.5, but:
– can we really profit from it? How much will the new software depend on Qt (and therefore not run on 3.0)?
The new version of the compiler for Windows has still not been decided:
– we are running FED servers on the same machines as PVSSII and we are forcing our colleagues to use compilers compatible with PVSS (to avoid problems with libraries): all FED servers need to be recompiled!
What will be the policy for parallel support of 3.5 and 3.0 (e.g. libraries, framework)?
What are the final deadlines?
– if 3.5 is really released in June, it will need some testing; when will the release for the sub-detectors be?
– how do we react if ETM delays the release?
PVSS 3.5 will contain many useful features, but some important improvements will not be implemented (changes in alert handling).
If we accept 3.5, it should really be the last version before the startup.

The DCS Computing:
– organization
– hardware
– management and supervision
– remote access

DCS computers fall into one of three categories:
– worker nodes (WN), performing the DCS tasks
– operator nodes (ON), running the UI
– back-end servers, providing services for the whole DCS (file servers, remote access servers, database servers, …)
The first batch of DCS computers is being delivered now:
– ON for all detectors
– WN for the DCS infrastructure
The second (and, so far, last) batch is being ordered now. The total number of DCS computers is ~100.

DCS Computers
All machines are based on Intel server boards equipped with dual-core CPUs.
Special emphasis was given to HW compatibility tests (the duration of the test cycle, involving the ordering of prototypes, tests, tendering and the purchasing procedure, is comparable with the mainboard production lifetime); the main technical problem is the 2U PCI risers.
Additional worry: 5V PCI has disappeared from the PIV server boards. The 3V version offers only a limited number of ports per computer (being replaced by PCI-E) and will probably also disappear very soon. The likely solution is USB.
The selected computer models solve our problems at least for the ALICE startup phase.

Accessing the DCS
Remote access to the DCS is based on the Windows Terminal Services (WTS).
Access from the ALICE control room (ACR):
– consoles will display the UI from the operator nodes
Access from outside:
– dedicated Windows terminal servers

Peter Chochula for ALICE DCS, DCS review, Geneva April 3, 2006 Detector 1 ON - WTS WN ACR CR3 RDP PVSS ON - WTS WN PVSS WN PVSS RemoteGPN RDP WTS cluster ON - WTS Remote access to the DCS network

WTS Performance
(table: number of clients vs. average CPU load [%] and memory [kB], for the terminal server and for the workstation running the DCS project; the values are not preserved in this transcript)
– WTS performance was studied; no problems were observed
– The master project generated datapoints and performed 3000 updates/s; the remote client displayed 50 values at a time

DCS Computers – Services and Back-End not Included (diagram of the per-detector node allocation)
– SPD: Operator Node; HV + LV; FED + Crate; FED
– SDD: Operator Node; HV; LV + FED + Crate; FED
– SSD: Operator Node; HV + LV; FED + Crate + ELMB; FED
– TPC: Operator Node; HV; LV + ELMB; FED; VHV; FED; Pulser, Laser, Drift velocity
– TRD: Operator Node; HV; LV; FED
– TOF: Operator Node; HV; LV; FED [18]; FED + Crate
– HMPID: Operator Node; HV + LV; Crate + PLC
– PHOS: Operator Node; HV + FED + LED; LV + ELMB + Crate; FED
– CPV: Operator Node; HV + LV + ELMB
– Muon Trk: Operator Node; HV; LV; Crate + ELMB + GMS
– Muon Trg: Operator Node; HV + LV; Crate + ELMB
– FMD: Operator Node; HV + LV + FED; FED
– T0: Operator Node; HV + LV; FED + Crate + Laser; FED
– V0: Operator Node; HV + LV + Crate
– PMD: Operator Node; HV + LV; Crate + ELMB
– ZDC: Operator Node; HV + Crate
– ACORDE: Operator Node; HV + LV
– EMC: Operator Node; HV + LV + FED; FED

Monitoring the DCS Farms – Intel Server Management (ISM)
ISM was selected as the temporary monitoring and supervision tool:
– IPMI-based
– Windows/Linux monitoring agent
– out-of-band monitoring (on supported mainboards; we are using Intel server boards everywhere in the DCS, but this might change in the future)
– admin console (subnet monitoring), web GUI
– alerts, logs, counters, graphs
– software monitoring (logs changes)

Peter Chochula for ALICE DCS, DCS review, Geneva April 3, 2006 Admin console Host error report Host system log

Peter Chochula for ALICE DCS, DCS review, Geneva April 3, 2006 Admin console – reboot/power support Host software inventory

Peter Chochula for ALICE DCS, DCS review, Geneva April 3, 2006 Admin console Host counter graph (CPU) Host counter settings

Peter Chochula for ALICE DCS, DCS review, Geneva April 3, 2006 Admin console Host HW report (Temperatures)

Peter Chochula for ALICE DCS, DCS review, Geneva April 3, 2006 OS Maintenance OS management is –Following CNIC architecture and –Based on NICEFC and LinuxFC Participating in evaluation of NICEFC –CMF –Remote system installation We appreciate the help of IT (Ivan), CMF will be used in the production cluster Minor concern is connected to application packaging – distribution of PVSS patches