Grand Challenge and PHENIX

Grand Challenge and PHENIX

Report: post-MDC2 studies of GC software
– feasibility for day-1 expectations of the data model
– simple robustness tests
– comparisons to the data carousel … not yet
GC meeting highlights, plans
Possible additions/enhancements of GC software
– HPSS savvy
– distributed processing???

Cache Use Studies

– single query with a long processing time, consuming ~0.2 MB/s
– cache kept near 1 GB, processing continuous
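As a rough consistency check, a back-of-envelope sketch using only the rates quoted in these slides (the 4-8 MB/s pftp figure is from the next slide); staging comfortably outpaces consumption, so a ~1 GB cache stays full:

    # Back-of-envelope check with the rates quoted in these slides (assumed values).
    consume_rate_mb_s = 0.2     # query processing drains the cache at ~0.2 MB/s
    stage_rate_mb_s = 4.0       # pftp staging from HPSS, low end of the quoted 4-8 MB/s
    cache_size_mb = 1000.0      # disk cache kept near 1 GB

    # Time to drain the cache if staging stopped entirely:
    drain_time_s = cache_size_mb / consume_rate_mb_s
    print("cache drains in ~%.1f h without staging" % (drain_time_s / 3600))   # ~1.4 h

    # Staging outpaces consumption by this factor, so the cache stays near full:
    print("staging/consumption ratio: %.0fx" % (stage_rate_mb_s / consume_rate_mb_s))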

GC Test Conditions: Singles

Single queries
– 100 events/file
– MB/file
– varying cache size, processing time, query size
HPSS status
– 1 tape drive
– fast purge policy (5 minutes) to isolate GC capabilities
– pftp of files generally 4-8 MB/s, though the total rate is closer to 1-3 MB/s
– HPSS savvy should improve this by a factor of 2-3!

Multiple Queries

Single query
– 0.0 < rndm < 0.2, time ~ 2000 s
Overlapping queries (coordination sketched below)
– 0.0 < rndm < 0.2
– 0.1 < rndm < 0.3
– time ~ 3000 s
– overlapping files staged to disk first
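A minimal sketch of the "overlapping files staged first" behaviour (hypothetical file names and function, not the actual GC code):

    # Sketch of overlap-aware staging order (hypothetical; not the GC implementation).
    def staging_order(running_query_files, new_query_files, already_cached):
        """Stage the new query's files: those shared with the running query first, then the rest."""
        needed = [f for f in new_query_files if f not in already_cached]
        shared = [f for f in needed if f in set(running_query_files)]
        rest = [f for f in needed if f not in set(running_query_files)]
        return shared + rest

    # Example with made-up file names:
    q1 = ["f00", "f01", "f02", "f03"]   # 0.0 < rndm < 0.2
    q2 = ["f02", "f03", "f04", "f05"]   # 0.1 < rndm < 0.3 (overlaps q1)
    print(staging_order(q1, q2, already_cached={"f00"}))
    # -> ['f02', 'f03', 'f04', 'f05']   (shared files f02, f03 staged first)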

Test Conditions: Doubles

Double queries
– identical queries: submitted at the same time, delayed, or with different processing times
– overlapping queries

Robustness

Start query
– normal: 8 GB / 37 files
– ctrl-C after the first 2 files: 3rd file staged to cache, stops cleanly
– start identical query
– ctrl-C at 14 files: 15th file staged to cache, stops cleanly
– different query
– etc., etc.
– Very robust!
– Troubles only when the Objectivity lockserver fails!!

GC Meeting Highlights

Post-MDC2 the GC commandeered HPSS and performed some tests:
– robustness, correctness, tape drive dependencies, 1 P.I.P. link to user code, etc.
CAS plans: how does this affect/change GC capabilities?
Interface of GC with physics analyses
Scalability issues -- tests to commence in July
– 1000's of files, 10,000 events/file, 5 components, 2 TB total (rough per-file sizes sketched below)
Generic interfaces
– quest to make the GCA as independent of our specific problem as possible -- usability for other HEP experiments, climate modeling, combustion studies, etc.
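The scalability-test parameters imply roughly the following sizes (a sketch assuming the 2 TB is spread evenly over ~1000 files; the slide only says "1000's"):

    # Rough sizes implied by the scalability-test parameters (assumed even split).
    n_files = 1000              # low end of "1000's of files"
    events_per_file = 10000
    n_components = 5
    total_kb = 2.0e9            # 2 TB in kB (decimal units)

    per_file_kb = total_kb / n_files                  # ~2 GB per file
    per_event_kb = per_file_kb / events_per_file      # ~200 kB per event (all components)
    per_component_kb = per_event_kb / n_components    # ~40 kB per component per event
    print(per_file_kb / 1e6, per_event_kb, per_component_kb)   # 2.0 GB, 200 kB, 40 kB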

First Year Offline Configuration

As best I can tell, a fundamentally different configuration than STAR:
– Day 1 (many sub-detectors):
  » ~10 detector sub-groups with their own files for calibration purposes
– Year 1 (many physics analyses): 60x the rate of STAR -> smaller DST events (~100 kB/event)
– no physical separation of events into components (maybe hits???)
  » single component, at least for year 1 (caveat on later slides)
– couple thousand events/file
Since any physics analysis will in general cut no tighter than 1%, unless we filter events into separate files according to physics cuts, every major physics analysis query will want every file (see the estimate after this slide).
– Prefiltering adds excessive complications and the need to correct for biases, etc.
  » possible exception: centrality presorting bins
  » according to trigger conditions, detector configuration, date, etc.
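A quick estimate of why a >=1% selection touches essentially every file, using the numbers quoted above:

    # Why a 1% physics cut still touches every file (numbers from this slide).
    events_per_file = 2000      # "couple thousand events/file"
    selection_frac = 0.01       # analysis cuts no tighter than 1%

    # Probability that a given file contains at least one selected event,
    # treating selections as independent and uniform across files:
    p_file_needed = 1.0 - (1.0 - selection_frac) ** events_per_file
    print("P(file is needed) = %.9f" % p_file_needed)          # ~0.999999998

    # Expected selected events per file -- still enough to force staging the whole file:
    print("expected selected events/file =", selection_frac * events_per_file)   # 20.0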

Projected Use of G.C.

Intimately related to expectations of CAS machines
– Day 1 (many sub-detectors):
  » separate calibration/special-run files
  » separate machines
  » different instances of G.C./D.C.
  » separate cache, etc.
– Year 1 (physics analyses):
  » some small/separate analyses
    – on CAS machines/server
    – but usually on micro-DSTs
  » a few large jobs: need all/most files
    – running on CAS/server?
    – possibility of cache over several disks?
      » distributed processing
      » 10's of GB each
      » data spread out -- send the analysis code to the machine
[diagram: HPSS feeding individual CAS machines vs. HPSS feeding a multi-CPU server with a big disk]

Partial Query Biases

Possible troubles with partial queries if they introduce a physics bias
However, only a problem if we presort our data into files according to physics signals
May presort according to centrality in day-1/year-1

Components?

A couple of possible ways to separate events into components:
– Event -> Hits, Tracks, Raw, Global
– Event -> Raw, mDST1, DST, mDST2
Problems?
– each component corresponds to a separate file on tape: too many tape mounts?
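A hypothetical illustration of the tape-mount concern (made-up counts; the real numbers depend on how component files are packed on tape):

    # Hypothetical illustration: components multiply tape mounts if each
    # component of each event collection sits in its own tape file.
    n_event_files = 37      # e.g. the 37-file query from the robustness slide
    n_components = 4        # Event -> Hits, Tracks, Raw, Global

    mounts_single_component = n_event_files                  # one file per event collection
    mounts_split_components = n_event_files * n_components   # worst case: one mount per component file
    print(mounts_single_component, mounts_split_components)  # 37 vs. 148 potential mounts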

Objectivity Woes

Surgically remove Objectivity
"We strongly recommend against using an ODBMS in those applications that are handled perfectly well with relational database(s)" -- Choosing an Object Database, Objectivity
– Lockserver problems: restarting often
– Movement of disk to rnfs01: rebuilding of the Objy tagDB
– Possible alternatives? ROOT?
– Robustness: multiple accesses
  » each node of the farm and the CAS machines accessing the same file
  » a layer between the reconstruction nodes and the DB?
– Scalability:
  » 100 GB tagDB? -- chain of files? (sketched below)
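The "chain of files" idea expressed in ROOT terms, as a present-day PyROOT sketch (the tree name "Tags", the file pattern, and the "centrality" tag are all made up for illustration):

    # Sketch of a ROOT-based tagDB as a chain of many modest files
    # instead of one monolithic 100 GB database (PyROOT; names are hypothetical).
    import ROOT

    chain = ROOT.TChain("Tags")           # assume each tag file holds a TTree named "Tags"
    chain.Add("tagdb/run*_tags.root")     # hypothetical file naming; one file per run/fileset

    print("total tag entries:", chain.GetEntries())

    # A tag-level selection spanning every file in the chain:
    n_selected = chain.Draw(">>elist", "centrality < 0.1", "entrylist")
    print("events passing the cut:", n_selected)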

Carousel

[diagram: the "ORNL" software moves files from HPSS tape to HPSS cache; a carousel server with a mySQL database holds the filelist; files are delivered by pftp to the carousel clients (rmds01, rmds02, rmds03, CAS), which are "stateless"]

Data Carousel: "Strip Mining"

Written and tested by Jerome Lauret and Morrison
– a layer in front of the ORNL batch transfer software, mostly in perl
  » administration, organization, throttling
  » integrates multiple user requests for specific files (not events) -- see the sketch after this slide
  » maintains a disk FIFO staging area
– 5-7 MB/s integrated rate!!
  » G.C. is near 1-3 MB/s
  » tape optimization should clean that up
Missing:
– robust disk cache
– interface to physics analyses
– error checking
– etc.
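A minimal sketch of the request coalescing and throttling described above (the real layer is in perl; every name here is a hypothetical stand-in, not the carousel code):

    # Coalesce multiple users' file requests into one throttled FIFO.
    from collections import OrderedDict

    class CarouselQueue:
        def __init__(self, max_staged_files=10):
            self.pending = OrderedDict()          # file -> set of requesting users, FIFO order
            self.max_staged_files = max_staged_files

        def request(self, user, files):
            """Merge a user's file list into the queue; duplicate files are coalesced."""
            for f in files:
                self.pending.setdefault(f, set()).add(user)

        def next_batch(self):
            """Throttle: hand the transfer software at most max_staged_files files."""
            batch = list(self.pending)[: self.max_staged_files]
            return [(f, self.pending.pop(f)) for f in batch]

    q = CarouselQueue(max_staged_files=3)
    q.request("alice", ["f01", "f02", "f03"])
    q.request("bob", ["f02", "f04"])              # f02 is coalesced, staged only once
    print(q.next_batch())
    # [('f01', {'alice'}), ('f02', {'alice', 'bob'}), ('f03', {'alice'})]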

Interface with Physics Analysis

Offline code:
– DSTs written in ROOT
– tagDB also in ROOT (probably) -- not OBJY, no better alternative
Interface to GC (a sketch of the implied analysis loop follows this slide):
– return file number / event number
– capability to run a query at the root prompt?
– possibility of returning the list of events when a bundle is cached, to keep continual interaction with the GC to a minimum
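A sketch of the analysis loop these bullets imply: the GC returns (file number, event number) pairs per cached bundle, so the analysis only talks to the GC between bundles. Every name here (the client, submit, events) is hypothetical:

    # Hypothetical analysis-side loop; not an actual GC interface.
    def run_query(gc_client, query, process_event):
        for bundle in gc_client.submit(query):         # assume one iteration per cached bundle
            for file_no, event_no in bundle.events():  # event list returned once the bundle
                process_event(file_no, event_no)       # is on disk, minimizing GC round trips

    # Usage with a made-up client and a trivial handler:
    # run_query(GCClient(), "centrality < 0.1",
    #           lambda f, e: print("file", f, "event", e))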