DCS Databases
Peter Chochula and Svetozár Kapusta
ALICE DCS Workshop, October 6, 2005
Outline
- DCS data flow
- DCS Configuration
  - Framework devices
  - FERO Configuration
- DCS Archive
  - Implementation of the DCS archive and present status
  - DCS and the condition data
- AMANDA
- Additional developments of the DCS archive interface
The DCS Data Flow
[Diagram. Labels: Config., PVSSII, Archive, Devices, ECS, Electricity, Ventilation, Cooling, Gas, Magnets, Safety, Access Control, LHC, DAQ, TRI, HLT, DIM, DIP, FERO, Version]
The DCS Configuration
- FW tools provide methods for creating and using the configuration database for FW devices.
- The ALICE add-on is the FERO configuration.
- Ways to organize the FERO configuration database have been presented several times (for example, see the slides from the Utrecht Workshop, or contact Peter or Sveťo).
FERO Configuration - remarks
- There is a common misunderstanding concerning the FERO configuration: FERO configuration data and FED configuration are two different things.
- FERO configuration data is loaded directly by the FED into the front-end electronics (e.g. thresholds, mask matrices etc.).
  - FERO configuration should be organized as tables or BLOBs, as explained in previous meetings.
- FED configuration concerns the monitoring and operation of the FED (alert limits etc.).
  - The FED configuration should be based on the FW concepts and components.
  - The PVSSII FED interface should follow the FW device definition.
  - The FW configuration database tools can be used to create and manipulate the configuration records.
  - Tools for defining the FW device are available.
The DCS Archive implementation
- The concept has been presented several times (for example, see the slides from the Utrecht Workshop).
The DCS Archive and Conditions Data
- The DCS archive contains all measured values (tagged for archival). It is:
  - used by trending tools;
  - a means of access to historical values, e.g. for system debugging;
  - too big and too complex for offline data analysis (it contains information which is not directly related to physics data, such as safety, services, etc.).
- The main priority of the DCS is to assure reliable archival of its data.
- The conditions database contains a subset of the measured values, extracted from the archive.
DCS Archive implementation
- Standard PVSS archival:
  - Archives are stored in local files (one set of files per PVSSII system).
  - OFFLINE archives are stored on backup media (tools for backup and retrieval are part of PVSSII).
  - External access to the archive files is problematic because of the proprietary PVSS format.
- The PVSS RDB archival:
  - is replacing the previous method based on local files;
  - requires an Oracle database server;
  - has an architecture that resembles the previous file-based concept:
    - an ORACLE tablespace replaces the file;
    - a library handles the management of data on the Oracle server (create/close tables, backup tables…).
DCS Archival status
- The mechanism based on ORACLE was delivered by ETM and tested by the experiments.
  - All PVSS-based tools are compatible with the new approach (we can profit from the PVSS trending etc.).
  - Problems were discovered; ETM is working on an improved version.
- The main ALICE concern is the performance (~100 inserts/s per PVSS system), which is not enough to handle an alert avalanche in a reasonable time.
- Architectural changes were requested, e.g. the archive switching process should be replaced by ORACLE partitioning, etc.
- DCS will provide file-based archival during the pre-installation phase and replace it with ORACLE as soon as a new version qualifies for deployment.
- Tools for later conversion from files to the RDB will be provided.
AMANDA
- AMANDA is a PVSS-II manager which uses the PVSS API to access the archives.
  - The manager and a Win C++ client were developed by the DCS team (Vlado Fekete).
  - The ROOT client and the interface to the OFFLINE were developed by Boyko Yordanov.
- A powerful communication protocol handles the data transfer and error reporting.
- The archive architecture is transparent to AMANDA.
- AMANDA can be used as an interface between the PVSS archive and non-PVSS clients.
- AMANDA returns data for the requested time period.
AMANDA – Alice Manager for DCS Archives
[Diagram. Labels: PVSS-II system with User Interface Managers, Control Manager, API Manager, Event Manager, Data Manager, Driver, Archive Manager(s) and Archive(s); AMANDA Server; AMANDA Client]
Operation of AMANDA
1. After receiving a connection request, AMANDA creates a thread which will handle the client request.
2. Due to present limitations of the PVSS DM, the requests are served sequentially as they arrive (the DM is not multithreaded).
3. AMANDA checks the existence of the requested data and returns an error if it is not available.
4. AMANDA retrieves the data from the archive and sends it back to the client in formatted blocks.
- AMANDA adds additional load to the running PVSS system!
- Final qualification of AMANDA for use in the production system depends on the user requirements.
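The four steps of the AMANDA operation can be illustrated with a small, self-contained sketch. It is not the AMANDA protocol or the PVSS API: the in-memory `ARCHIVE`, the datapoint name, the message tuples and `BLOCK_SIZE` are all hypothetical, and a single lock merely stands in for the non-multithreaded data manager.

```python
import threading

# Hypothetical in-memory stand-in for an archive:
# datapoint name -> list of (timestamp, value) samples.
ARCHIVE = {"spd/temp01": [(t, 100 + t) for t in range(10)]}

# One lock stands in for the single-threaded data manager: requests are
# serialized (a lock does not guarantee strict arrival order; a queue would).
dm_lock = threading.Lock()

BLOCK_SIZE = 4  # samples per reply block, arbitrary for this sketch


def handle_request(datapoint, t_start, t_end, reply):
    """Serve one client request; `reply` collects the messages sent back."""
    with dm_lock:
        samples = ARCHIVE.get(datapoint)
        selected = [] if samples is None else [
            (t, v) for t, v in samples if t_start <= t <= t_end
        ]
    if not selected:
        # Step 3: report an error when the requested data is not available.
        reply.append(("ERROR", f"no data for {datapoint} in [{t_start}, {t_end}]"))
        return
    # Step 4: send the data back in fixed-size blocks.
    for i in range(0, len(selected), BLOCK_SIZE):
        reply.append(("DATA", selected[i:i + BLOCK_SIZE]))
    reply.append(("END", len(selected)))


def serve(requests):
    """Step 1: start one thread per incoming request and wait for all of them."""
    replies = [[] for _ in requests]
    threads = [
        threading.Thread(target=handle_request, args=(dp, t0, t1, rep))
        for (dp, t0, t1), rep in zip(requests, replies)
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return replies


replies = serve([("spd/temp01", 2, 7), ("missing/datapoint", 0, 1)])
for rep in replies:
    print(rep)
```

Each client gets its own thread, but only one thread at a time touches the archive, which is why adding clients does not increase throughput in this model.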
AMANDA in distributed system
- PVSS can directly access only the file-based data archives stored by its own data manager.
- In a distributed system, data produced by other PVSS systems can also be accessed if a connection via the DIST manager exists.
  - In the case of file-based archival, the DM of the remote system is always involved in the data transfer.
  - In the case of RDB archival, the DM can retrieve any data provided by other PVSS systems within the same distributed system without bothering other DMs.
- It is foreseen that each PVSS system will run its own AMANDA.
  - There will be at least one AMANDA per detector.
  - Using more AMANDA servers overcomes some API limitations: some requests can be parallelized (clients need to know which server has direct access to the requested data).
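The remark that clients need to know which server has direct access to the requested data amounts to a routing step on the client side. A minimal sketch follows, assuming a hypothetical naming scheme in which the datapoint name starts with the detector name, and hypothetical server addresses; neither comes from the slides.

```python
from collections import defaultdict

# Hypothetical routing table: detector prefix -> AMANDA server address.
SERVERS = {
    "spd": "amanda-spd:5000",
    "tpc": "amanda-tpc:5000",
}


def group_by_server(datapoints):
    """Group datapoint names by the server assumed to hold their archive.

    Requests in different groups can then be sent in parallel,
    while each server still handles its own requests one by one.
    """
    groups = defaultdict(list)
    for dp in datapoints:
        detector = dp.split("/", 1)[0]
        if detector not in SERVERS:
            raise KeyError(f"no AMANDA server known for datapoint {dp!r}")
        groups[SERVERS[detector]].append(dp)
    return dict(groups)


routing = group_by_server(["spd/temp01", "tpc/hv03", "spd/temp02"])
print(routing)
```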
AMANDA in the distributed environment (archiving to files)
[Diagram. Three PVSS-II systems, each labelled with UI, DM, DRV, EM, API, CTR and AMD, connected via DIS]
AMANDA in the distributed environment (archiving to ORACLE)
[Diagram. Three PVSS-II systems, each labelled with UI, DM, DRV, EM, API, CTR and AMD, connected via DIS, together with ORACLE]
Additional CondDB developments
- Development of a new mechanism for retrieving data from the RDB archive has been launched (Jim Cook, ATLAS).
- A separate process will access the RDB directly without involving PVSS:
  - this approach will overcome the PVSS API limitations;
  - the data gathering process can run outside of the online DCS network.
- The conditions data will be described in the same database (which datapoints should be retrieved, what processing should be applied, etc.).
- Configuration of the conditions will be done via PVSS panels by the DCS users, giving a unified interface for all detectors.
- Data will be written to the desired destination (ROOT files in the case of ALICE; COOL for ATLAS, CMS and LHCb).
- Parts of the AMANDA client could be re-used.
- A first prototype is available.
Performance issues – Database Performance
- The performance of the DB server was studied.
- The main question was: is the DB server's performance a bottleneck?
- Several methods of data insertion were tested.
Data Insertion Rate to the DB Server
- Data was inserted into a two-column table (number(5), varchar2(128)) with no index.
- The following results were obtained:

  Method                     Rate [inserts/s]
  OCCI autocommit            ~500
  PL/SQL (bind variables)    ~10000
  PL/SQL (varrays)           >50000

- Comparing the results with the estimated performance of the PVSS systems, we can conclude that the interface implementation could be a limitation, but the server is not.
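To put these rates side by side with the ~100 inserts/s per PVSS system quoted for the RDB archival, the following sketch computes the headroom of each method and the time needed to write a burst of rows. The burst size is a hypothetical figure chosen only for illustration, and the array-insert rate is the lower bound given on the slide.

```python
# Insertion rates from the slide, in rows per second.
# The array-insert figure is a lower bound (">50000/s").
SERVER_RATES = {
    "OCCI autocommit": 500,
    "PL/SQL bind variables": 10_000,
    "PL/SQL array insert": 50_000,
}

# ~100 inserts/s per PVSS system, as quoted for the RDB archival.
PVSS_RATE = 100

# Hypothetical burst size (e.g. an alert avalanche); not from the slides.
BURST_ROWS = 100_000


def seconds_to_insert(rows, rate):
    """Time needed to insert `rows` rows at a constant `rate` in rows/s."""
    return rows / rate


# How many PVSS systems writing at PVSS_RATE each method could absorb.
headroom = {name: rate / PVSS_RATE for name, rate in SERVER_RATES.items()}

# Time to write the whole burst with each method, and via one PVSS system.
drain_times = {
    name: seconds_to_insert(BURST_ROWS, rate)
    for name, rate in SERVER_RATES.items()
}
pvss_drain_time = seconds_to_insert(BURST_ROWS, PVSS_RATE)

for name in SERVER_RATES:
    print(f"{name}: headroom x{headroom[name]:.0f}, burst in {drain_times[name]:.0f} s")
print(f"one PVSS system: burst in {pvss_drain_time:.0f} s")
```

Under these assumptions even the slowest server-side method is several times faster than a single PVSS system, which is consistent with the conclusion that the server itself is not the bottleneck.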
Data Download from Oracle Server (BLOBs)
150 MB of configuration data was downloaded in two set-ups:
- Test A: 3 concurrent clients, DCS private network (1 Gbit), 100 Mbit/s client connection, 1 Gbit/s server connection.
- Test B: 1 client, CERN 10 Mbit/s network.

  Method           Test A [MB/s]   Test B [MB/s]
  Bfile            ~10.59          ~0.81
  Blob             ~10.40          ~0.78
  Blob, stream     ~10.93          ~0.81
  Blob, ADO.NET    ~10.10          ~0.77
Data Download Results (BLOBs)
- The obtained results are comparable with the raw network throughput (direct copy between two computers).
- The DB server does not add significant overhead, not even for concurrent client sessions.
- The main bottleneck could be the network:
  - non-blocking switches will be used on the DCS network;
  - further network optimization is possible according to the DB usage pattern (the DCS switches are expandable to higher-performance backbone technologies).
- If needed, additional DB servers could be installed.
- Some applications could profit from local data caching.
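As a rough cross-check of the network-bound conclusion, the measured BLOB download rates can be compared with the nominal capacity of the client links. This sketch uses a different baseline from the slides (nominal link speed rather than a measured direct copy), assumes that the quoted rates are per client, and takes 1 MB as 10^6 bytes.

```python
DATA_MB = 150.0  # size of the configuration data set, from the slides


def link_capacity_mb_s(link_mbit_s):
    """Nominal link capacity in MB/s (8 bits per byte, no protocol overhead)."""
    return link_mbit_s / 8.0


def utilisation(rate_mb_s, link_mbit_s):
    """Fraction of the nominal link capacity reached by a measured rate."""
    return rate_mb_s / link_capacity_mb_s(link_mbit_s)


def download_seconds(rate_mb_s):
    """Time to download the full data set at a constant rate."""
    return DATA_MB / rate_mb_s


# (nominal client link in Mbit/s, measured rates in MB/s) per set-up.
MEASURED = {
    "DCS private network, 100 Mbit/s client link": (
        100, {"Bfile": 10.59, "Blob": 10.40, "Blob, stream": 10.93, "Blob, ADO.NET": 10.10},
    ),
    "CERN 10 Mbit/s network": (
        10, {"Bfile": 0.81, "Blob": 0.78, "Blob, stream": 0.81, "Blob, ADO.NET": 0.77},
    ),
}

for setup, (link, rates) in MEASURED.items():
    print(setup)
    for method, rate in rates.items():
        print(f"  {method}: {utilisation(rate, link):.0%} of nominal, "
              f"{download_seconds(rate):.0f} s for {DATA_MB:.0f} MB")
```

The spread between access methods is small compared with the difference between the two networks, which is what one would expect if the link, not the server or the access method, sets the rate.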
Data download from tables
- Performance was studied using a realistic example: the SPD configuration.
- The FERO data of one SPD readout chip consists of 44 DAC settings per chip.
- In total, 1200 chips are used in the SPD.
- Mask matrix for 8192 pixels per chip.
- 15x120 front-end registers.
- The full SPD configuration can be loaded within ~3 s.
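The scale of the SPD example follows from simple multiplication. The sketch below counts the settings and adds a size estimate; the storage sizes (one byte per DAC setting, one mask bit per pixel) are assumptions made for illustration and are not given in the slides, and the 15x120 front-end registers are left out because their size is not stated.

```python
# Figures from the slide.
CHIPS = 1200
DACS_PER_CHIP = 44
PIXELS_PER_CHIP = 8192
LOAD_TIME_S = 3.0  # "~3 s" for the full configuration

# Assumptions (not from the slides): storage size of each item.
BYTES_PER_DAC = 1
MASK_BITS_PER_PIXEL = 1

total_dacs = CHIPS * DACS_PER_CHIP
total_pixels = CHIPS * PIXELS_PER_CHIP

dac_bytes = total_dacs * BYTES_PER_DAC
mask_bytes = total_pixels * MASK_BITS_PER_PIXEL // 8
total_bytes = dac_bytes + mask_bytes

# Average rate implied by the assumed size and the quoted load time.
implied_rate_bytes_s = total_bytes / LOAD_TIME_S

print(f"DAC settings: {total_dacs}")
print(f"Mask pixels:  {total_pixels}")
print(f"Estimated size: {total_bytes / 1e6:.2f} MB "
      f"(~{implied_rate_bytes_s / 1e6:.2f} MB/s over {LOAD_TIME_S:.0f} s)")
```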
FW Configuration Database Tests (1)
- Development and tests were performed by Piotr Golonka (results are still preliminary, but already convincing).
- The test configuration consists of a CAEN SY1527 with 16 A1511 boards (12 channels each).
- In the first test, all device properties were written to and retrieved from the database.
- In the second test, only typical properties were used:
  - Values: i0, i1, v0, v1 setpoints, switch on/off, ramp speeds, trip time.
  - Alerts: overcurrent, overvoltage, trip, undervoltage, hw alert, current current, channel on, status, current voltage.
FW Configuration Database Tests (2)
Test 1 results (all properties):
- 28 s for loading the recipe from the database
- 7 s to apply the recipe to PVSS
- 27 s needed to store the recipe in the cache
- 24 s to apply the recipe from the cache:
  - 17 s to extract the data from the cache
  - 7 s to apply the recipe to PVSS
FW Configuration Database Tests (3)
Test 2 results (selected properties):
- 14 s for loading the recipe from the database
- 5 s to apply the recipe to PVSS
- 10 s needed to store the recipe in the cache
- 10.5 s to apply the recipe from the cache:
  - 5 s to extract the data from the cache
  - 5.5 s to apply the recipe to PVSS
Tests performed with 26 systems showed that the download times are not affected by concurrent sessions.
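The timings of the two tests can be combined into end-to-end figures for the two paths, database and cache. Adding "load" and "apply" to get a database-path total is an interpretation of the slides, not a number stated on them; the cache-path totals (24 s and 10.5 s) are quoted directly. The dictionary keys are names chosen for this sketch.

```python
# Timings in seconds, from the two test slides.
TEST1 = {  # all properties
    "load_from_db": 28.0,
    "apply_to_pvss": 7.0,
    "store_in_cache": 27.0,
    "extract_from_cache": 17.0,
    "apply_after_cache": 7.0,
}
TEST2 = {  # selected properties
    "load_from_db": 14.0,
    "apply_to_pvss": 5.0,
    "store_in_cache": 10.0,
    "extract_from_cache": 5.0,
    "apply_after_cache": 5.5,
}


def totals(timings):
    """Return (database path, cache path) end-to-end times in seconds."""
    via_db = timings["load_from_db"] + timings["apply_to_pvss"]
    via_cache = timings["extract_from_cache"] + timings["apply_after_cache"]
    return via_db, via_cache


for name, timings in (("Test 1", TEST1), ("Test 2", TEST2)):
    via_db, via_cache = totals(timings)
    print(f"{name}: database path {via_db} s, cache path {via_cache} s, "
          f"ratio {via_db / via_cache:.2f}")
```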
The FW configuration DB - results
- The total time needed to download a configuration for a CAEN crate (using the caching mechanism) is ~10 s.
  - This is a significant speed improvement: the number should be compared with the previous time of ~180 s.
  - Further performance improvements are expected (using a prefetching mechanism etc.).
- Incremental recipes will bring both performance and functionality improvements.
  - An incremental recipe is loaded on top of an existing recipe.
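Loading an incremental recipe on top of an existing one can be pictured as an overlay of settings. The sketch below models a recipe as a flat mapping from a property name to a value; this representation and the property names are hypothetical and do not reflect the FW configuration database schema.

```python
def apply_incremental(base_recipe, increment):
    """Return a new recipe: the base with the incremental values laid on top.

    Properties present in `increment` override the base; all others are kept.
    The base recipe itself is left unchanged.
    """
    merged = dict(base_recipe)
    merged.update(increment)
    return merged


# Hypothetical property names and values, for illustration only.
base = {
    "board00/channel00.v0": 1500.0,
    "board00/channel00.i0": 1.0,
    "board00/channel01.v0": 1500.0,
}
increment = {"board00/channel01.v0": 1450.0}

merged = apply_incremental(base, increment)
print(merged)
```

Only the properties that differ need to be stored and transferred, which is where the expected performance gain would come from.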
Present status and Conclusions
- Configuration DB
  - The FW ConfDB concept was redesigned:
    - significant speed improvement;
    - new functionality is being added (e.g. recipe switching).
  - FED configuration should follow the FW ConfDB concepts.
  - FERO configuration performance did not reveal problems.
- Archival DB
  - Present RDB problems are being solved.
  - File-based archival will be used in the early stages of the pre-installation.
  - AMANDA provides an interface to external clients, independent of the archive technology.
  - Present ATLAS and future FW developments could provide an efficient method for accessing the RDB archive.