STAR COMPUTING
STAR Overview and OO Experience
Torre Wenaus, STAR Computing and Software Leader
Brookhaven National Laboratory, USA
CHEP 2000, Padova, February 7, 2000

Outline
- STAR and STAR Computing
- Framework and analysis environment
- Data storage and management
  - Technology choices
  - OO event model
  - Database
- Current status
- OO experience
Emphasis on offline; cf. Claude Pruneau's STAR Online talk

STAR at RHIC
RHIC: Relativistic Heavy Ion Collider at Brookhaven National Laboratory
- Colliding Au-Au nuclei at 100 GeV/nucleon
- Principal objective: discovery and characterization of the Quark Gluon Plasma (QGP)
- First year physics run April-August 2000
STAR experiment
- One of two large experiments at RHIC, >400 collaborators each
  - PHENIX is the other
- Hadrons, jets, electrons and photons over a large solid angle
  - Principal detector: 4 m TPC drift chamber
- ~4000 tracks/event recorded in the tracking detectors
- High statistics per event permit event-by-event measurement and correlation of QGP signals

Summer '99 engineering run: beam-gas event

Computing at STAR
Data recording rate of 20 MB/sec; 15-20 MB raw data per event (~1 Hz)
- 17M Au-Au events (equivalent) recorded in a nominal year
- Relatively few but highly complex events
Requirements:
- 200 TB raw data/year; 270 TB total for all processing stages
- 10,000 Si95 of CPU per year
- Wide range of physics studies: ~100 concurrent analyses in ~7 physics working groups
Principal facility: RHIC Computing Facility (RCF) at Brookhaven
- 20,000 Si95 CPU, 50 TB disk, 270 TB robotic tape (HPSS) in '01
Secondary STAR facility: NERSC (LBNL)
- Scale similar to the STAR component of RCF
Platforms: Red Hat Linux/Intel and Sun Solaris

STAR Software Environment
In Fortran:
- Simulation, reconstruction
In C++:
- All post-reconstruction physics analysis
- Recent simulation and reconstruction codes
- Infrastructure
- Online system (+ Java GUIs)
~75 packages; ~7 FTEs over 2 years in core offline; ~50 regular developers; ~70 regular users (140 total)
A migratory Fortran => C++ software environment is central to the STAR offline design
- C++:Fortran ratio now ~65%:35% in offline, up from ~20%:80% in 9/98

STAR Offline Framework
The STAR Offline Framework must support:
- a 7+ year investment and experience base in legacy Fortran
- an OO/C++ offline software environment for new code
- migration of legacy code: concurrent interoperability of old and new
Fortran development took place in a migration-friendly framework, StAF, enforcing IDL-based data structures and component interfaces
- Evolving StAF into a fully OO framework and analysis environment was judged too expensive in development and support
- Instead, leverage a very capable tool from the community...
11/98: adopted a new framework built over ROOT
- Modular components ('Makers') instantiated in a processing chain progressively build (and 'own') event components
- Automated wrapping supports Fortran- and IDL-based data structures without change
- The same environment supports reconstruction and physics analysis
- In production since RHIC's second 'Mock Data Challenge' (Feb-Mar '99) and used for all STAR offline software and physics analysis
cf. Valery Fine's "STAR Framework" talk for details

STAR Event Store Technology Choices
Original (1997 RHIC event store task force) STAR choice: Objectivity
- A prototype Objectivity event store and conditions DB was deployed in Fall '98
- It worked well, BUT there were growing concerns over Objectivity
Decided to develop and deploy ROOT as the DST event store in Mock Data Challenge 2 (Feb-Mar '99) and then make a choice
- ROOT I/O worked well, and the selection of ROOT over Objectivity was easy
- Other factors: good ROOT team support; CDF's decision to use ROOT I/O
Adoption of ROOT I/O left Objectivity with one event store role remaining: the true 'database' functions
- Navigation to run/collection, event, component, data locality
- Management of dynamic, asynchronous updating of the event store
But Objectivity is overkill for this, so we went shopping...
- with particular attention to Internet-driven tools and open software
...and came up with MySQL

Technology Requirements: My Version of the 1/00 View

Event Store Characteristics
Flexible partitioning of event components into different streams based on access characteristics
- Data are organized as named components resident in different files constituting a 'file family'
- Successive processing stages add new components
Automatic schema evolution
- New code can read old data and vice versa
No requirement for on-demand access
- Desired components are specified at the start of the job
  - permitting optimized retrieval for the whole job
  - using the Grand Challenge Architecture; cf. David Malon's talk
- If additional components are found to be needed, an event list is output and used as input to a new job
- Makes I/O management simpler and fully transparent to the user
cf. Victor Perevoztchikov's "STAR Event Data Storage" talk for details

STAR Event Model: StEvent
C++/OO was first introduced into STAR in physics analysis
- Essentially no legacy post-reconstruction analysis code
- Permitted a complete break away from Fortran at the DST
The StEvent C++/OO event model was developed
- Targeted initially at the DST; now being extended upstream to reconstruction and downstream to micro-DSTs
- The event model seen by application code is "generic C++" by design; it does not express implementation or persistency choices
  - Developed initially (deliberately) as a purely transient model, with no dependencies on ROOT or persistency mechanisms
- The implementation was later rewritten using ROOT to provide persistency
  - Gives us a direct object store (no separation of transient and persistent data structures) without ROOT appearing in the interface

MySQL as the STAR Database
A relational DB: open software, very fast, widely used on the web
- Not a full-featured heavyweight like Oracle
- No transactions, no unwinding based on journalling
- A good balance between feature set and performance for STAR
The development pace is very fast, with a wide range of tools to use
- Good interfaces to Perl, C/C++, Java
- Easy and powerful web interfacing
- Like a quick prototyping tool that is also production-capable, for appropriate applications
  - Metadata and compact data
Multiple hosts, servers and databases can be used (concurrently) as needed to address scalability, access and locking characteristics

MySQL-based DB Applications in STAR
File catalogs for simulated and real data
- Catalogue ~22k files, ~10 TB of data; being integrated with the Grand Challenge Architecture (GCA)
Production run log used in datataking
Event tag database
- Good results with preliminary tests of a 10M row table, 100 bytes/row
  - 140 sec for a full SQL query with no indexing (~70k rows/sec)
Conditions (constants, geometry, calibrations, configurations) database
Production database
- Job configuration catalog, job logging, QA, I/O file management
Distributed (LAN or WAN) processing monitoring system
- Monitors STAR analysis facilities at BNL; planned extension to NERSC
Distributed analysis job editing/management system
Web-based browsers for all of the above
cf. Sasha Vanyashin's NOVA talk

STAR Databases and Navigation Between Them

Current Status
Offline software infrastructure and applications are operational in production and 'ready' to receive year 1 physics data
- 'Ready' in quotes: much essential work is still under way
  - Tuning and ongoing development in reconstruction, physics analysis software, and database integration
  - Data mining and analysis operations infrastructure, including Grand Challenge deployment
- Successful production at year 1 throughput levels last week, in a 'mini Mock Data Challenge' exercise
The final Mock Data Challenge prior to physics data is in March
- Stress testing of analysis software and infrastructure, micro-DSTs
- Target for an operational Grand Challenge

OO 'and Related' Experience and Lessons
C++ is very well suited to a modular component architecture and to 'natural' data models that map well onto a physicist's view of physics analysis
- If the latter is true, it should 'sell itself' once people have such an analysis environment in their hands, and this we do find
  - Good response to an OO/C++ analysis environment in a heavily Fortran community
  - Evolution (vital for continuity and for preserving the experience base and productivity) is practical and effective
- C++ was the right choice... despite the still dismal compiler situation
  - Mainstream, designed for performance, close to maturity (we hope)
Training by practical example and hands-on mentoring is required first; formal training was found useful later (success with Object Mentor's advanced OOAD course)
STAR's initial pursuit of Objectivity was a mistake
- A monolithic solution, abandoned in favor of a more secure hybrid
Success in factorizing data management into distinct object store and database components
- Without compromising the OO environment seen by users, or maintainability

OO 'and Related' Experience and Lessons (2)
Great success with ROOT as the C++/OO tool set and analysis environment available today
- Seen as the long-term solution for STAR
- But the data model and user code can be shielded from specifics, preserving flexibility
Open software tools and technologies are of vital and growing importance
- STAR's two major commercial software components (Objectivity and Orbix) have both been replaced with open software/community tools (ROOT+MySQL and ORBacus)
  - Commercial product dependencies are a painful burden
- A long list of other open software tools is employed by STAR
  - Apache and add-ons, Perl, PHP, XML, LXR code documentation, cvsweb, HyperNews, Debian bug tracking, Samba, cons build management, gcc, Linux, and others

We Want You!
Taking blatant advantage of this opportunity... I am being replaced! (Moving to ATLAS.)
The Computing and Software Leader job is coming open
- The BNL job is posted; hire ASAP
- Talk to me for more info!