Slide 1

STAR COMPUTING
ROOT in STAR
Torre Wenaus, STAR Computing and Software Leader
Brookhaven National Laboratory, USA
ROOT 2000 Workshop, CERN, February 3, 2000
Slide 2

Outline
- STAR and STAR computing
- ROOT in STAR
  - Framework and analysis environment (cf. Valery Fine's talk)
  - Event data store (cf. Victor Perevoztchikov's talk)
    - Complemented by MySQL
    - OO event persistency
- Current status
Slide 3

STAR at RHIC
- RHIC: Relativistic Heavy Ion Collider at Brookhaven National Laboratory
  - Collides Au-Au nuclei at 100 GeV/nucleon
  - Principal objective: discovery and characterization of the Quark Gluon Plasma (QGP)
  - First year physics run: April-August 2000
- STAR experiment
  - One of two large experiments at RHIC, each with >400 collaborators (PHENIX is the other)
  - Measures hadrons, jets, electrons and photons over a large solid angle
    - Principal detector: 4 m TPC drift chamber (operational)
    - Si tracker (year 2)
    - Pb/scintillator EM calorimeter (staged over years 1-3)
  - ~4000+ tracks/event recorded in the tracking detectors
  - High statistics per event permit event-by-event measurement and correlation of QGP signals
Slide 5

Summer '99 engineering run: beam-gas event
Slide 6

Computing at STAR
- Data recording rate of 20 MB/sec; 15-20 MB raw data per event (~1 Hz)
- 17M Au-Au events (equivalent) recorded in a nominal year
- Relatively few but highly complex events
- Requirements:
  - 200 TB raw data/year; 270 TB total for all processing stages
  - 10,000 Si95 of CPU per year
- Wide range of physics studies: ~100 concurrent analyses in ~7 physics working groups
- Principal facility: RHIC Computing Facility (RCF) at Brookhaven
  - All reconstruction, some analysis, no simulation
  - 20 kSi95 CPU, 50 TB disk, 270 TB robotic tape (HPSS) in '01
- Secondary STAR facility: NERSC (LBNL)
  - Scale similar to the STAR component of RCF
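The data-volume figures above can be sanity-checked with a back-of-envelope calculation. This sketch assumes a "nominal year" of 1e7 seconds of live running, a common heavy-ion convention that the slide itself does not state:

```cpp
#include <cmath>

// Back-of-envelope check of the STAR data-volume numbers. The 1e7-second
// "nominal year" is an assumed convention, not a figure from the slide.
constexpr double kRateMBPerSec   = 20.0;    // DAQ recording rate (slide)
constexpr double kNominalYearSec = 1.0e7;   // assumed live time per year
constexpr double kEventsPerYear  = 17.0e6;  // Au-Au events per nominal year (slide)

// Raw data volume per nominal year, in TB.
double rawTBPerYear() {
    return kRateMBPerSec * kNominalYearSec / 1.0e6;  // MB -> TB
}

// Implied average raw event size, in MB, for comparison with the
// quoted 15-20 MB per event (the slide's figures are approximate).
double avgEventSizeMB() {
    return rawTBPerYear() * 1.0e6 / kEventsPerYear;
}
```

At 20 MB/s over 1e7 s this gives 200 TB/year, matching the stated requirement; the implied ~12 MB average event size is in the same ballpark as the quoted per-event raw size.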
Slide 7

STAR Software Environment
- C++:Fortran ratio now ~65%:35% in offline, up from ~20%:80% in 9/98
  - ~5%:95% in application codes
- In Fortran: most simulation and reconstruction
- In C++:
  - All post-reconstruction physics analysis code
  - New/recent simulation and reconstruction codes
  - Virtually all infrastructure code (~75 packages)
  - Online system (with Java GUIs)
- A migratory Fortran => C++ software environment is central to the STAR offline design
Slide 8

STAR Offline Framework
- STAR's legacy Fortran was developed in a migration-friendly framework, StAF, enforcing IDL-based data structures and component interfaces
- But evolving StAF into a fully OO framework / analysis environment would have required major in-house development and support
- Instead, leverage a very capable tool from the community:
  - 11/98: adopted a new framework built over ROOT, with origins in the ROOT-based ATLFAST
  - Modular components ('Makers') instantiated in a processing chain progressively build (and 'own') event components
  - Supports legacy Fortran and IDL-based data structures developed in the previous StAF framework without change (automated wrapping)
- Deployed in offline production and analysis since RHIC's second 'Mock Data Challenge', 2-3/99
- cf. Valery's talk for details
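The 'Maker' chain pattern described above can be sketched in a few lines of plain C++. This is a minimal illustration of the idea only; the real STAR makers are ROOT classes with a much richer interface, and the names here (Maker, Chain, Init/Make/Finish) are simplified assumptions:

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Sketch of a 'Maker': a modular component that is initialized once,
// invoked once per event to build (and own) its event components, and
// finalized at the end of the job.
class Maker {
public:
    explicit Maker(std::string name) : name_(std::move(name)) {}
    virtual ~Maker() = default;
    virtual int Init() { return 0; }   // once, before the event loop
    virtual int Make() = 0;            // per event; builds/owns its data
    virtual void Finish() {}           // once, after the event loop
    const std::string& Name() const { return name_; }
private:
    std::string name_;
};

// The processing chain drives its child makers in order, so later makers
// can consume event components produced by earlier ones.
class Chain : public Maker {
public:
    Chain() : Maker("chain") {}
    void Add(std::unique_ptr<Maker> m) { makers_.push_back(std::move(m)); }
    int Init() override {
        for (auto& m : makers_) if (m->Init() != 0) return 1;
        return 0;
    }
    int Make() override {
        for (auto& m : makers_) if (m->Make() != 0) return 1;  // abort event on error
        return 0;
    }
    void Finish() override { for (auto& m : makers_) m->Finish(); }
private:
    std::vector<std::unique_ptr<Maker>> makers_;
};

// A trivial concrete maker for demonstration: counts the events it sees.
class CountingMaker : public Maker {
public:
    CountingMaker() : Maker("counter") {}
    int Make() override { ++count_; return 0; }
    int count() const { return count_; }
private:
    int count_ = 0;
};
```

The chain-owns-makers, makers-own-data ownership discipline is what lets each processing stage be added or replaced independently.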
Slide 9

Event Store Design/Implementation Requirements
- Flexible partitioning of event components into different streams based on access characteristics
- Support for both IDL-defined data structures and an OO event model, compatible with the Fortran => C++ migration
- Robust schema evolution (new code reading old data, and vice versa)
- Easy management and navigation; dynamic addition of event components
- Named collections of events, production- and user-defined
- Navigation from a run/event/component request to the data
- No requirement for on-demand access
  - Desired event components are specified at the start of the job, and optimized retrieval of these components is managed for the whole job (Grand Challenge)
  - On-demand access is not compatible with job-level optimization of event component retrieval
Slide 12

Objectivity or ROOT as Event Data Store?
- Original (1997 RHIC event store task force) STAR choice: Objectivity
  - Prototype Objectivity event store and conditions DBs deployed Fall '98
  - 50 GB DST event store tested in the first Mock Data Challenge
  - Basically worked OK, but many concerns surrounding Objectivity
- Decided to develop and deploy ROOT as the DST event store in Mock Data Challenge 2 (Feb-Mar '99) and then make a choice
  - Performed well; met requirements, with multiple data streams for different event components
  - Conclusion: a functional and viable solution for an event store
- Easy decision to adopt ROOT over Objectivity
  - Particularly with CDF's decision to use ROOT I/O in Run II
Slide 13

Features of the ROOT-based Event Store
- Data organized as named components resident in different files constituting a 'file family'
  - Successive processing stages add new components
- No separation between transient and persistent data structures
  - Transient object interfaces do not express the persistency mechanism, but the object implementations are directly persistent
  - Used for our OO/C++ event model StEvent, giving us a direct object store without ROOT appearing in the event model interface
- Automatic schema evolution implemented
  - An extension of ROOT's existing manual schema evolution
- cf. Victor's talk for details
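The 'file family' idea above can be illustrated with a one-line naming function: each named event component lives in its own file sharing a common basename, so a later processing stage can add a component without rewriting earlier files. The naming scheme here is an assumption for illustration, not STAR's actual convention:

```cpp
#include <string>

// Sketch of file-family naming: component files share the family basename.
// A job requesting the "dst" and "tags" components of a family touches only
// those files; a new stage adds a new component file alongside them.
std::string componentFile(const std::string& family, const std::string& component) {
    return family + "." + component + ".root";
}
```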
Slide 14

STAR Event Model StEvent
- Introduction of C++/OO into STAR came via physics analysis
  - Until a year ago, essentially no post-reconstruction analysis code intended for real data analysis existed
  - A community interested in and willing to move to C++/OO
  - Allowed us to break completely away from Fortran at the DST
- StEvent C++/OO event model developed
  - Targeted initially at the DST and post-reconstruction analysis
  - To be extended later, upstream to reconstruction and downstream to micro-DSTs; now in progress
- Requirements reflected in the StEvent design:
  - The event model seen by application code is "generic C++" and does not express implementation or persistency choices
  - Developed initially (deliberately) as a purely transient model, with no dependencies on ROOT or persistency mechanisms
  - Later implemented in ROOT to provide direct persistency
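The "generic C++" design point above can be sketched as follows: the interface that analysis code sees uses only plain C++ types and containers, with no ROOT or persistency types appearing. The class and member names here are illustrative, not the actual StEvent API:

```cpp
#include <cstddef>
#include <vector>

// Sketch of a purely transient event model. Analysis code depends only on
// this generic C++ interface; a persistency implementation (e.g. in ROOT)
// can be layered underneath without the interface expressing it.
struct Track {
    double pt;    // transverse momentum (GeV/c)
    int charge;
};

class Event {
public:
    void addTrack(const Track& t) { tracks_.push_back(t); }
    const std::vector<Track>& tracks() const { return tracks_; }
    std::size_t numberOfTracks() const { return tracks_.size(); }
private:
    std::vector<Track> tracks_;  // plain STL containers, no framework types
};
```

Because nothing in this interface names the storage technology, the same analysis code runs unchanged whether the event came from memory, a ROOT file, or a future store.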
Slide 16

ROOT + MySQL Based Data Management
- Adoption of ROOT I/O for the event store left Objectivity with one remaining role: the true 'database' functions of the event store
  - Navigation among event collections, runs/events, and event components
  - Data locality (now translates basically to file lookup)
  - Management of dynamic, asynchronous updating of the event store from one end of the processing chain to the other
    - From initiation of an event collection (run) in online, through addition of components in reconstruction and analysis and their iterations, to the conditions database
- But given the concerns about Objectivity and its weight, it is overkill for this
- So we went shopping, with particular attention to the rising wave of Internet-driven tools and open software, and came up with MySQL
Slide 17

Requirements: STAR 1/00 View (My Version)
Slide 18

MySQL as the STAR Database
- Relational DB, open software, very fast, widely used on the web
- Not a full-featured heavyweight like Oracle
  - Relatively simple and fast
  - No transactions, no unwinding based on journalling
- Development pace is very fast, with a wide range of tools to use
  - Good interfaces to Perl, C/C++, Java
  - Easy and powerful web interfacing
- Great experience so far
  - Like a quick prototyping tool that is also production-capable, for the right applications
    - Metadata and compact data
- Multiple servers and databases can be used as needed to address scalability, access, and locking characteristics
Slide 21

MySQL-based Applications in STAR
- File catalogs for simulated and real data
  - Catalogue ~22k files, ~10 TB of data
  - Being integrated with the Grand Challenge Architecture (GCA)
- Production run log used in data taking
- Event tag database
  - Good results with preliminary tests on a 10M-row table, 100 bytes/row
    - 140 sec for a full SQL query with no indexing (~70 kHz)
- Conditions database (constants, geometry, calibrations, configurations)
- Production database
  - Job configuration catalog, job logging, QA, I/O file management
- Distributed (LAN or WAN) processing monitoring system
  - Monitors STAR analysis facilities at BNL; planned extension to NERSC
- Distributed analysis job editing/management system
- Web-based browsers for all of the above
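The quoted event-tag scan figure can be checked directly: 10M rows read in 140 s corresponds to roughly 70k rows/s, the "~70 kHz" on the slide.

```cpp
// Check of the tag-database benchmark quoted above: a full unindexed SQL
// scan of a 10M-row table in 140 s is roughly a 70 kHz row-scan rate.
double scanRateRowsPerSec(double rows, double seconds) {
    return rows / seconds;
}
```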
Slide 22

STAR Databases and Navigation Between Them
Slide 23

Current Status
- Offline software infrastructure, data management, applications, and production operations are operational in production and 'ready' to receive year 1 physics data
  - 'Ready' in quotes because much essential work is still under way
    - Reconstruction, analysis software, and data mining/analysis operations infrastructure, including Grand Challenge deployment
- We are in production at year 1 throughput levels this week, in another 'mini Mock Data Challenge' exercise
- Our final Mock Data Challenge prior to physics data is in March
  - Stress testing analysis software and infrastructure, and uDSTs
  - Target for an operational Grand Challenge
- ROOT is a vital and robust component through much of STAR software
  - And it is a major reason we are as ready for data as we are
- Thank you to the core ROOT team and ROOT developers everywhere!