The ATLAS TAGs Database - Experiences and further developments
Elisabeth Vinek, CERN & University of Vienna, on behalf of the TAGs developers group
Outline
I. Overview: What are the TAGs?
- Physics content
- TAG formats
II. Components of the TAG system: files, databases and services
- The TAG database(s)
- Data distribution
- Cataloging of data and services
- ELSSI: the TAG browser
III. Experiences...
- from managing a very large database
- from maintaining a distributed architecture
IV. Further developments
- On-demand selection of TAG services
ATLAS TAG Database
I. ATLAS TAGs: event-level metadata
- TAGs are event-by-event metadata records containing:
  - key quantities that identify and describe the event, intended to be useful for event selection, and
  - sufficient navigational information to allow access to the event data at all prior processing stages: RAW, ESD, and AOD (and possibly more, e.g., for Monte Carlo data).
- TAG is not an acronym!
- Each event carries more than 200 variables to make event selection easy and efficient.
- The variables were decided by a 2006 task force and are maintained by the PAT (Physics Analysis Tools) group.
- TAG contents have been quite stable since then.
- The data representation evolves only for easier use (e.g. trigger decision at three levels).
I. TAG Content
~200 variables/event:
- Event identification (run number, event, lumi block, timestamp)
- Trigger decisions at all three levels (bit encoded)
- Numbers of electrons, muons, photons, taus, jets
- pT, eta, phi for highest-pT objects
- Global quantities (e.g. missing ET)
- Detector status and quality words
- For each Physics & Performance group, a 32-bit word is reserved to flag the interesting events for their analysis
- Sufficient information to point back to the previous processing stages (i.e. RAW/ESD/AOD)
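The bit-encoded trigger decisions and the 32-bit group words can be pictured with a small bitmask sketch. This is only an illustration: the trigger names and bit positions in `TRIGGER_BITS` are hypothetical, since the actual TAG bit layout is not given in these slides.

```python
# Hypothetical bit assignments, for illustration only; the real TAG
# trigger and group-word layouts are not specified in the slides.
TRIGGER_BITS = {"L1_MU10": 0, "L1_EM14": 1, "L1_J30": 2}
WORD_SIZE = 32  # each Physics & Performance group word is 32 bits wide


def encode(fired, bit_map):
    """Pack the names in `fired` into a single integer word."""
    word = 0
    for name in fired:
        bit = bit_map[name]
        if not 0 <= bit < WORD_SIZE:
            raise ValueError(f"bit {bit} does not fit in a {WORD_SIZE}-bit word")
        word |= 1 << bit
    return word


def passed(word, name, bit_map):
    """Return True if the bit assigned to `name` is set in `word`."""
    return bool(word & (1 << bit_map[name]))


def decode(word, bit_map):
    """List, in alphabetical order, the names whose bits are set in `word`."""
    return sorted(name for name, bit in bit_map.items() if word & (1 << bit))


word = encode(["L1_MU10", "L1_J30"], TRIGGER_BITS)
print(word, decode(word, TRIGGER_BITS))
```

Storing decisions this way keeps one integer column per word, and a selection becomes a single bitwise AND against a mask.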
I. TAG Formats
TAGs are produced in central ATLAS reconstruction. Processing chain and event sizes:
- RAW data
- ESD (Event Summary Data): ~500 kB/event
- AOD (Analysis Object Data): ~100 kB/event (Egamma)
- TAG: ~1 kB/event planned, in reality ~3-4 kB/event
TAG formats:
- File-based TAGs are built from AOD content and written into files when AOD files are merged.
  - TAGs are like other file-based data.
  - They are stored in POOL ROOT format: the files can be browsed with ROOT, but the format is actually a POOL standard, like the data files.
  - They are organized into datasets and distributed to the appropriate Tiers of ATLAS.
- Relational TAGs are uploaded to Oracle databases.
II. Components of the TAG system
[Architecture diagram. Labels: TAG ROOT files; TAG site 1, TAG site 2, ..., TAG site n; site m; TAG data & services catalog; data; services (skim, extract); ESD/AOD ROOT file; conditions metadata. Arrow labels: produces, upload, queries, lookup.]
II. The TAG databases, TAG uploads
- Relational databases (Oracle 10g/11g) at several sites
- Data organized by RAW streams within a project
- The physical "unit" is a (POOL) collection
- The upload of the TAGs to all sites is done by the Tier-0 Management System, as the last step in the reconstruction chain
- A "Posttagupload" step runs on each DB after the upload to manage the data efficiently: index creation, partitioning, monitoring, etc.
II. Data Distribution
- Relational TAGs are not distributed to all Tier-1s or Tier-2s as file-based TAGs are; sites (Tier-1s as well as Tier-2s) host them on a voluntary basis.
- Hosting requires providing an Oracle database service on a terabyte scale.
- Current TAG sites: CERN, DESY, PIC, BNL, TRIUMF
Current data distribution and volumes:
         CERN     DESY      PIC       BNL      TRIUMF
data     … TB     1.14 TB   2.24 TB   250 GB   670 GB
data09   320 GB   300 GB    -         360 GB   350 GB
mc       … TB     -         -         -
mc       … GB     -         -         -
II. Cataloging data and services
- A "replica catalog" is needed to keep track of relational data distribution.
  - Implemented as a database schema, in a DWH-like star design
  - Updated automatically by the Tier-0 management system via an API
  - Used:
    - by the TAG browser to show available data,
    - by all TAG services to establish the connection to the right site hosting the required data,
    - to mark data for deletion,
    - to gather performance statistics on the upload process.
- A "services catalog" is needed to keep track of all services, their deployments, status and metrics.
  - Implemented as a database schema with 2 layers:
    - Stats gathering
    - Aggregation: "offline" computation of metrics per service
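A replica catalog in a star-like design can be sketched as one fact table of replicas surrounded by dimension tables for sites and collections. The sketch below uses SQLite in place of Oracle, and all table names, column names and the project and stream values are assumptions made for illustration; they are not the actual ATLAS schema or API.

```python
import sqlite3


def build_catalog():
    """Create an in-memory catalog: two dimension tables and one fact table."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE site (site_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE collection (
            coll_id INTEGER PRIMARY KEY,
            project TEXT, stream TEXT, run INTEGER,
            UNIQUE (project, stream, run));
        -- fact table: one row per (collection, site) replica
        CREATE TABLE replica (
            coll_id INTEGER REFERENCES collection(coll_id),
            site_id INTEGER REFERENCES site(site_id),
            marked_for_deletion INTEGER DEFAULT 0,
            upload_seconds REAL,
            PRIMARY KEY (coll_id, site_id));
    """)
    return con


def register_upload(con, site, project, stream, run, upload_seconds):
    """Record that a collection was uploaded to a site (hypothetical API call)."""
    con.execute("INSERT OR IGNORE INTO site(name) VALUES (?)", (site,))
    con.execute(
        "INSERT OR IGNORE INTO collection(project, stream, run) VALUES (?, ?, ?)",
        (project, stream, run))
    con.execute(
        """INSERT OR REPLACE INTO replica(coll_id, site_id, upload_seconds)
           SELECT c.coll_id, s.site_id, ?
           FROM collection c, site s
           WHERE c.project = ? AND c.stream = ? AND c.run = ? AND s.name = ?""",
        (upload_seconds, project, stream, run, site))


def sites_hosting(con, project, stream, run):
    """Return the names of the sites holding a live replica of the collection."""
    rows = con.execute(
        """SELECT s.name
           FROM replica r
           JOIN site s ON s.site_id = r.site_id
           JOIN collection c ON c.coll_id = r.coll_id
           WHERE c.project = ? AND c.stream = ? AND c.run = ?
             AND r.marked_for_deletion = 0
           ORDER BY s.name""",
        (project, stream, run)).fetchall()
    return [name for (name,) in rows]


con = build_catalog()
register_upload(con, "CERN", "data10", "physics_Egamma", 1001, 42.0)
register_upload(con, "DESY", "data10", "physics_Egamma", 1001, 55.5)
print(sites_hosting(con, "data10", "physics_Egamma", 1001))
```

A service would call something like `sites_hosting` before opening a database connection, and the `upload_seconds` column gives the kind of per-upload statistic the catalog is said to collect.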
II. TAG catalogs
II. ELSSI: Event Level Selection Service Interface
- Interface for querying the TAG databases
- Selection based on runs, streams, trigger decisions and physics attributes
- Access with certificate (VO atlas)
[Screenshot labels: Table, Histograms, AOD GUIDs extract, skimming]
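A selection on runs, stream, trigger decision and a physics attribute, returning the AOD GUIDs of the matching events, might translate into a query like the following. This is a sketch against SQLite, not ELSSI's actual implementation; the `tag` table, its columns and the sample rows are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory TAG table with made-up rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tag (run INTEGER, event INTEGER, stream TEXT, "
        "trigger_word INTEGER, n_electron INTEGER, aod_guid TEXT)")
    con.executemany("INSERT INTO tag VALUES (?, ?, ?, ?, ?, ?)", [
        (1001, 1, "physics_Egamma", 0b011, 2, "GUID-A"),
        (1001, 2, "physics_Egamma", 0b100, 1, "GUID-A"),
        (1002, 7, "physics_Egamma", 0b001, 0, "GUID-B"),
        (1002, 8, "physics_Muons", 0b001, 3, "GUID-C"),
    ])
    return con


def select_aod_guids(con, runs, stream, trigger_mask, min_electrons=0):
    """Return the distinct AOD GUIDs of events passing all four criteria."""
    runs = list(runs)
    if not runs:
        return []
    placeholders = ",".join("?" for _ in runs)
    sql = (
        f"SELECT DISTINCT aod_guid FROM tag "
        f"WHERE run IN ({placeholders}) "
        f"AND stream = ? "
        f"AND (trigger_word & ?) != 0 "   # at least one requested trigger bit set
        f"AND n_electron >= ? "
        f"ORDER BY aod_guid")
    params = [*runs, stream, trigger_mask, min_electrons]
    return [guid for (guid,) in con.execute(sql, params)]


con = make_demo_db()
print(select_aod_guids(con, [1001, 1002], "physics_Egamma", 0b001, min_electrons=1))
```

All user-supplied values are passed as bound parameters, so only the number of run placeholders is interpolated into the SQL text.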
III. Experiences from managing very large databases
Much effort has been put into managing the huge amount of data efficiently at the database level:
- Schema and tablespace strategy to enable easy data deletion
- Indexing:
  - All-indexing strategy since the beginning
  - B-tree and bitmap indexes, depending on the variables (space issues!)
- Horizontal table partitioning to group runs in physical units and to allow read/write optimization
- Compression: saves a lot of space! However, limitations in compression led to tests on vertical partitioning (ongoing)
- Data deletion kept in sync with file deletion from sites
- Periodic cross-checks needed on various levels to ensure data consistency and synchronization with file-based TAGs
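The benefit of horizontal partitioning by run can be shown with a toy in-memory model: a run-range query touches only the partitions that overlap it, and deleting old data means dropping whole partitions instead of deleting row by row. The fixed partition width `RUNS_PER_PARTITION` is an assumption for illustration; the slides do not state how runs are grouped in the Oracle tables.

```python
from collections import defaultdict

RUNS_PER_PARTITION = 100  # assumed grouping; the real partitioning key is not given


def partition_of(run):
    """Map a run number to the partition that stores it."""
    return run // RUNS_PER_PARTITION


class PartitionedTags:
    """Toy model of a TAG table partitioned horizontally by run range."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, run, event):
        self.partitions[partition_of(run)].append((run, event))

    def scan(self, first_run, last_run):
        """Read only the partitions overlapping the requested run range."""
        hits = []
        for p in range(partition_of(first_run), partition_of(last_run) + 1):
            hits.extend(row for row in self.partitions.get(p, [])
                        if first_run <= row[0] <= last_run)
        return hits

    def drop_partition(self, p):
        """Delete a whole partition at once; return the number of rows removed."""
        return len(self.partitions.pop(p, []))


table = PartitionedTags()
for run, event in [(150, 1), (199, 2), (200, 1)]:
    table.insert(run, event)
print(table.scan(100, 199))
print(table.drop_partition(1), table.scan(100, 300))
```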
III. Experiences from managing a distributed architecture
- Coordination effort to keep software versions at several sites in sync
  - Automation at the DB level, but service deployments are managed locally.
  - Efforts to also deploy web services at sites other than CERN.
- Data distribution:
  - As of now, all data except MC is complete at CERN -> CERN can be used as a reliable data source.
  - In general, a reprocessing pass should always be complete at a site, but this can change.
  - Challenges will come when data is more scattered and no single site holds a full reprocessing pass -> a query as requested by the user cannot be answered by connecting to only one site -> query distribution?
- Load balancing between databases and services: until now the user chooses the site to query, but this will change -> automated site picking includes automated load balancing
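When no single site holds all requested runs, one possible way to distribute a query is a greedy cover: repeatedly pick the site that hosts the most still-uncovered runs. This is only one candidate strategy, not the approach adopted by the TAG team; the `holdings` mapping and the run numbers are made up.

```python
def plan_query(requested_runs, holdings):
    """Split a query over sites so that every requested run is covered.

    `holdings` maps a site name to the set of runs it hosts (hypothetical
    data, as it might be read from the replica catalog). Returns a dict
    mapping each chosen site to the runs it should be asked for.
    """
    remaining = set(requested_runs)
    plan = {}
    while remaining:
        # Greedy step: the site covering the most remaining runs wins;
        # ties are broken alphabetically for reproducibility.
        site = max(sorted(holdings),
                   key=lambda s: len(holdings[s] & remaining),
                   default=None)
        covered = holdings[site] & remaining if site is not None else set()
        if not covered:
            raise LookupError(f"no site hosts runs {sorted(remaining)}")
        plan[site] = covered
        remaining -= covered
    return plan


holdings = {"CERN": {1, 2, 3}, "DESY": {3, 4}, "PIC": {5}}
print(plan_query([1, 3, 4, 5], holdings))
```

The greedy choice keeps the number of contacted sites small but ignores load, so a production version would likely combine it with the service metrics discussed in the next section.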
IV. On-demand selection of (TAG) services
- In an automated load-balancing environment, several decisions have to be taken when a user makes a request:
  - Which DB site to connect to?
  - Which browser to connect to?
  - Which web services to use?
  in order to satisfy the user's request within a defined time (quality-of-service baseline) while ensuring that the whole distributed system is able to satisfy as many requests as possible (load-balanced system, avoiding bottlenecks as much as possible).
- Information needed to take these decisions:
  - Service deployments, status and metrics (-> services catalog)
  - Data distribution (-> data catalog)
  - User input
  - Typical usage patterns and distribution of requests
- Service selection is to take place on demand, at request time.
- Investigation of approaches and algorithms is underway. The model will have to adapt easily to:
  - New services
  - Evolving objective functions
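Since the selection algorithm is still under investigation, the following is only a sketch of what an on-demand choice with a pluggable objective function could look like. The `Deployment` fields, the default objective and the rule of applying the time limit to each service separately are all assumptions, not part of the TAG system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One deployed service, as it might be described in the services catalog."""
    name: str
    kind: str            # "db", "browser" or "webservice"
    up: bool             # current status
    load: float          # current load, 0.0 (idle) to 1.0 (saturated)
    est_seconds: float   # estimated time to serve the request
    hosts_data: bool = True  # relevant for DB sites: is the requested data there?


def default_objective(d):
    """Assumed cost: the expected time, penalized by the current load."""
    return d.est_seconds * (1.0 + d.load)


def select_services(deployments, max_seconds, objective=default_objective):
    """Pick one deployment per kind at request time.

    Candidates must be up, host the data, and meet the time limit (here
    applied per service, an assumption). The objective is a parameter so
    that it can evolve without changing the selection code.
    """
    choice = {}
    for kind in ("db", "browser", "webservice"):
        candidates = [d for d in deployments
                      if d.kind == kind and d.up and d.hosts_data
                      and d.est_seconds <= max_seconds]
        if not candidates:
            raise LookupError(f"no usable {kind} deployment")
        choice[kind] = min(candidates, key=objective)
    return choice


deployments = [
    Deployment("CERN-db", "db", True, 0.9, 2.0),
    Deployment("DESY-db", "db", True, 0.1, 3.0),
    Deployment("CERN-browser", "browser", True, 0.5, 1.0),
    Deployment("CERN-ws", "webservice", False, 0.0, 1.0),
    Deployment("BNL-ws", "webservice", True, 0.3, 2.0),
]
picked = select_services(deployments, max_seconds=5.0)
print({kind: d.name for kind, d in picked.items()})
```

New services enter the selection simply by appearing in the `deployments` list, and a different `objective` callable can be passed in as the cost model changes.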
Conclusions
- TAG content and operational processes are stable, but more sites might join to host TAG data and/or services -> evolving infrastructure.
- As in the file-based world, central catalogs are in place that summarize data and services distribution, as well as services metrics.
- Experiences since data taking:
  - TAGs are uploaded to the databases without much delay
  - Separate process for upload of reprocessed data
  - Importance of efficient space management (including data deletion)
- Strategies adopted by ATLAS DBAs to manage the large volume of TAG data proved to be efficient.
- New use cases arise (event selection)
- Effort is now on automating request distribution and load balancing.
- Hypernews for questions about TAGs: hn-atlas-physicsMetadata