STAR COMPUTING
STAR Computing Status and Plans
Torre Wenaus, BNL
STAR Collaboration Meeting, BNL, Jan 31, 1999

Recent Arrivals
New members of the core BNL computing team:
- Jeff Porter, head of databases; online/offline developer
  - Started Nov '98; joined us from the LBNL group
  - Working in calibration database development and content definition, online DB development, GC-ROOT integration, Objectivity, Sun system management, ...
- Mei-Li Chen, online developer
  - Starts Feb 16, 1999; from U Maryland (Super-K, Milagro)
  - Very experienced online/DAQ expert; will be a strong contributor to online integration and commissioning
  - Has to come up to speed on C++
- No new hires foreseen!
- I hope (and expect!) the existing team will be in place for years to come…
- There’s lots of work to do
- Support is foreseen in operations budget

Recent Arrivals (2)
- Postdoc/student involvement is clearly improving
- Examples of key new core computing contributions:
  - Dan Russ, CMU: PSC simulation production
  - Gene Van Buren: ROOT tutorials, online help, migration
- And continuing contributions:
  - David Zimmerman, LBNL: GC, TagDB, simu production
  - Brian Lasiuk, Yale: TRS, C++ infrastructure
- … and many others working on offline software
  - We encourage regular BNL visits for close coupling to the software effort, face-to-face help, and greater visibility; come to the center of the action!
- But we’ve lost the postdoc targeted for MDC2 production help; we need a new source of help
- We look forward to a continuing influx!
  - E.g. very glad to have Dave Hardtke relocated to BNL from LBNL

Outstanding Personnel Issues
- Online software in need of additional personnel
- Online systems/infrastructure support; solution being sought at BNL
- New EMC rep: Wayne State
- New TPC rep (must be BNL-resident): LBNL is working on it
- Biggest staffing hole is dedicated database/event store effort
- Existing low level of effort is resulting in delay, compromise and greater risk
  - We are late in deploying and exercising a full event store solution; our Objectivity and ROOT usage modes need to be stress tested in real production/public use
- Essentially no prospect of getting a new hire; have to do the best with what we have
  - Greater and longer-term use of interim flat-file solutions
  - Expanding ROOT I/O capabilities means more viable ROOT-based solutions, leveraging the community
- We need SVT and EMC software coordinators!

Departures
- Nathan Stone left the BNL group for a software professional job at the Pittsburgh Supercomputing Center
- Mark Gilkes left Purdue for a new life in California
  - but continues to be productive from home in Santa Barbara, where he is still participating in online system development (at least until March; possibly longer)
- Mark Pollack left PHENIX for a software professional job in NYC; not a STAR departure, but an unfortunate loss for all of RHIC
- Best of luck to them all, and to us in trying to overcome their absence!

Major Events Since July
- Collaboration meeting (July): ROOT framework deployed; call for 3 months of user & MDC1 trials
- T3E production (late August): T3E GSTAR production and RCF-HPSS transfer operational
- MDC1 (Sep-Oct): Yr1 & Yr2 detector configurations; EMC, FTPC not in reco/DST
  - Production stable by second production release; exceeded ‘official’ goal of 100k Au-Au events through the production chain
  - Primary (~95%) production in STAF; remainder in ROOT
  - Simulated data: 1.7TB of 200GeV/n Au-Au HIJING (217k events)
  - Reconstructed DSTs: 600GB XDF files (168k events)
  - Objectivity event store: 50GB, stopped when disk space exhausted
  - Grand Challenge: managed HPSS access, queries, CAS integration
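
The MDC1 data volumes quoted above imply rough average event sizes, which can help when planning disk and tape needs. The following is a minimal back-of-envelope sketch, not an official STAR figure. It assumes decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes) and that the quoted totals correspond exactly to the quoted event counts; the function and variable names are illustrative only. Under those assumptions the averages come out to roughly 7.8 MB per simulated event and 3.6 MB per reconstructed DST event.

```python
# Back-of-envelope per-event sizes from the MDC1 totals quoted on the slide.
# Assumption: decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes).

def mb_per_event(total_bytes, n_events):
    """Average size per event in MB, given a total volume in bytes."""
    return total_bytes / n_events / 1e6

SIM_BYTES = 1.7e12     # 1.7 TB of simulated HIJING Au-Au data
SIM_EVENTS = 217_000   # 217k simulated events
DST_BYTES = 600e9      # 600 GB of reconstructed DST (XDF) files
DST_EVENTS = 168_000   # 168k reconstructed events

sim_mb = mb_per_event(SIM_BYTES, SIM_EVENTS)
dst_mb = mb_per_event(DST_BYTES, DST_EVENTS)
reco_fraction = DST_EVENTS / SIM_EVENTS  # share of simulated events with DSTs

print(f"simulated data:    {sim_mb:.1f} MB/event")
print(f"reconstructed DST: {dst_mb:.1f} MB/event")
print(f"fraction of simulated events reconstructed: {reco_fraction:.0%}")
```

The same function can be reused with other totals, for example to estimate how many events fit in a given disk allocation.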

Major Events (2)
- Software workshops and PWG/Computing meetings (Nov)
  - SW decisions: scope of ROOT use; C++ data model
- Online review (Dec 8)
  - Rapid progress and effective design commended, but loss of personnel must be addressed by STAR, and system integration/performance needs to be demonstrated (no report issued yet)
- DAQ/offline interface API for TPC ADC data established (mid Dec)
  - First application as interface between TRS and TPC clustering code
    - TRS interface exists; TPC code interface in progress
  - Still to be exercised, completed and finalized
- Transient data model design milestone (Dec 15)
  - Thomas’ design led to much interesting discussion!
- Start of simulation production for MDC2 (early Jan)
  - Field map, RICH & other geometry updates; production already well advanced

Major Events (3)
- Calibration database definition and implementation effort begun in earnest (Dec-Jan)
- Transient data model implementation milestone (Jan 15)
  - Basic StEvent coded up by the 15th; then extended to load content from MDC1 DST tables
  - StEventReaderMaker now loading StEvent from MDC1 DSTs via standard ROOT chain
  - Usable now in ‘guinea pig’ mode; should be ‘end user’ available in at most a week
- Full production chain migration to ROOT before MDC2
- … and now MDC2 is almost upon us.

MDC2: Feb 22 - Mar 8
Principal MDC2 goals:
- Full production exercise with ROOT-based chain
- TPC production chain including detailed response simu (TRS?) & clustering
- Better subsystem coverage on DST: EMC, FTPC, trigger, ...
- Testing and evaluating CRS production systems to finalize effective operations scheme
- Integration of QA evaluation into reconstruction production
- ‘Exercising our options’ for event store in production to make an informed Year 1 decision
- Greater analysis activity; definition and generation of analysis tags; uDSTs
- OO transient data model in place and in use
- GC architecture in production use for managed HPSS retrieval in DST analysis, query index selection, multiple & non-Objy event components

After MDC2
- Goal was to have infrastructure stabilized & turn to other things
- Realizable, I hope, in some areas…
  - ROOT framework, Maker/Chain infrastructure, transient data model infrastructure
- … but much work still to do in others
  - Event store, conditions database, real data handling, online integration
- With lessening of infrastructure load, ramp-up of activity in other areas (particularly in the BNL group)
  - Reconstruction, physics analysis
- Reconstruction meeting Feb 21 is to
  - review subsystem software status -- particularly reconstruction, including global reconstruction -- relative to year 1 requirements
  - establish priority areas for new work and improvement
    - establish strategies in these areas: new vs. upgraded software; parallel vs. fully unified efforts
  - at least begin to define new/expanded efforts and deliverables

Will There Be an MDC3?
- No! (I hope not, at least.)
- Focus after MDC2 shifts to real data and detector-driven milestones
- But what about a ‘Mock Physics Challenge’?
- Original idea of Mock Data Challenges included hiding physics signal in the simulation and testing analysis by trying to extract it blind
- Not done in MDC1 and probably not done in MDC2
- An important test of computing+analysis capability
- Possible milestone for Thomas and the PWGs?
  - Circa September?
  - Drawing on computing of course, but not a computing milestone
- Just a suggestion the PWGs may want to consider

After MDC2: Get Real!
- Real data is coming soon...
- Online integration with DAQ/trigger/FEE will deliver (starting March?) pseudo-real (simu data in DAQ) and real data to offline in real software environment
- Many items to complete to make this work:
  - Complete and finalize DAQ-offline interface
  - Integration of DAQ raw data in event store
  - Initiation of event store (event collections, online/DAQ/trigger tags) in online, and its propagation to offline
  - Integration of offline event store effort with online event pool; online event delivery to offline-environment monitors
  - Database integration: online DB including slow control; offline conditions/calibrations
  - Define DAQ/online/offline/RCF communication/data paths; security configuration
  - Subsystem definition/creation of online monitors

Documentation and Help
- Immediate call after Nov ROOT decisions… ROOT documentation!
- Expand existing documentation/tutorial work by Valery, Yuri to include contributions with an ‘end user’ perspective
  - Akio Ogawa’s ROOT diary
  - Gene Van Buren developing and accumulating tutorial, how-to, FAQ material in a ‘ROOT help desk’ function
  - Herb Ward and other ROOT users also contributing how-tos and examples
- Tutorial page is the clearing-house for all this
- Many offline tutorials updated to reflect ROOT environment
- SOFI postings and/or bug reports remain the preferred question/help mechanism
  - Please do not use personal email to experts for questions or help; give us the ability to spread the response load around and monitor what sort of questions are being asked
- Monthly (1st Friday) tutorials, tutorials/C++ next week, OOAD course in Mar

Software Usability
- Environment has been relatively quiet, from which we conclude things are reasonably functional? (Real work does seem to be getting done)
- New ‘new’ coming out imminently after a period of intensive development
- Should provide a stable and capable ‘MDC2 prep’ release
- If you have problems or concerns, make them public, or at least make them known; mumbling and rumbling privately (especially to people other than computing principals) is less likely to help

Offsite Computing
- RCF is foreseeing a need for offsite data export via tape, of at least two flavors:
  - Bulk export to large remote centers (e.g. LBNL)
  - Self-serve tape export (a physicist carrying a uDST sample home)
- But we need to understand the plans, needs, constraints of the institutes better:
  - Local computer, disk, tape resources and how you expect to use them
  - What central services (especially any beyond those already available) you need from STAR and/or RCF
- Still have to get my survey out!
- Offsite computing meeting, chaired by Peter Jacobs, tomorrow 7pm at Brookhaven Center South Room
- Will be able to carry in pitchers of beer from the pub

Conclusions
- Basically good progress along the expected lines, with no nasty surprises
- MDC1 experience good
- Migration of reconstruction production chain to ROOT proceeded well; good participation
- Simulation
- Some MDC2 elements coming in late but hopefully ‘just in time’: DST available via StEvent ROOT, Objectivity event store for MDC2
- Other elements need a lot of work, and have felt the effect of personnel losses and shortfalls:
  - Databases: conditions, calibrations, online
  - Online integration and real data readiness
- We rely on the wider involvement of the collaboration -- which has been growing -- to continue to grow