ATLAS Grid Activities: Preparing for Data Analysis. Jim Shank, DOSAR Workshop VII, LSU, 2 April 2009.

1 ATLAS Grid Activities: Preparing for Data Analysis. Jim Shank, DOSAR Workshop VII, LSU, 2 April 2009

2 Overview
- ATLAS Monte Carlo production in 2008
- Data (cosmic and single beam) in 2008
- Production and Distributed Analysis (PanDA) system
- Some features of the ATLAS Computing Model
  - Analysis model for the US
- Distributed analysis worldwide: Ganga/PanDA and HammerCloud + other readiness tests
- Tier 3 centers in the US

3 Beam Splash Event

4 First ATLAS Beam Events, 10 Sept. 2008
[Plots: data exports to T1s, throughput in MB/s; number of errors]
- Effect of concurrent data access from centralized transfers and user activity (overload of disk server).
- CERN storage system overload. DDM worked. Subsequently we limited user access to the storage system.

5 December 2008 Reprocessing

6 PanDA production (Monte Carlo Simulation/Reconstruction) 2008
Grouped by cloud = Tier 1 center + all its associated Tier 2 centers

7 U.S. Production in 2008
More than our share, which indicates that others are not delivering their expected levels

8 DDM: Data Replication
ATLAS beam and cosmics data replication from CERN to Tier-1s and calibration Tier-2s, Sep-Nov 2008.
[Plot labels: dataset subscription intervals; data replication to Tier-2s; US Tier-2s; BNL & AGLT2]

9 DDM: Data replication between Tier-1s
- Functional test: Tier-1 to Tier-1 data replication status. FZK experienced problems with dCache; data export is affected.
- Data reprocessing: Tier-1 to Tier-1 and prestaging data replication status. All Tier-1s operational. Red: data transfer completion at 95% (data staging at CNAF).

10 PanDA Overview (Torre Wenaus, BNL)
Workload management system for Production ANd Distributed Analysis
- Launched 8/05 by US ATLAS to achieve scalable data-driven WMS
- Designed for analysis as well as production
- Insulates users from distributed computing complexity
- Low entry threshold
- US ATLAS production since late '05
- US analysis since spring '06
- ATLAS-wide production since early '08
- ATLAS-wide analysis still rolling out
- OSG WMS program since 9/06

11 PanDA/pathena Users
- 4 million jobs in last 6 months
- 473 users in last 6 months
- 352 users in last 3 months
- 90 users in last month
- 271 users with >1000 jobs
- 96 users with >10000 jobs

12 ATLAS Analysis

13 ATLAS Data Types
- Still evolving…

14 ATLAS Analysis Data Flow

17 Analysis Readiness Tests (US T2 sites)

18 Ideas for a Stress Test (1) (Nurcan Ozturk)
- Initiated by Jim Cochran (US ATLAS Analysis Support Group Chair).
- Below is a summary of plans from Akira Shibata (March 10th).
- Goal: stress testing of the analysis queues at the Tier2 sites with analysis jobs that are as realistic as possible in both volume and quality. We would like to make sure that the Tier2 sites are ready to accept real data and that the analysis queues are ready to analyze them.
- Time scale: sometime near the end of May 2009.
- Outline of this exercise:
  - To make this exercise more useful and interesting, we will generate and simulate (Atlfast-II) a large mixed sample at the Tier2s.
  - We are currently trying to define the jobs for this exercise, and we expect this to be finalized after the BNL jamboree this week.
  - The mixed sample is a blind mix of all Standard Model processes, which we call "data" in this exercise.
  - For the one-day stress test, we will invite people with existing analyses to try to analyze the data using Tier2 resources only.
  - We will compile a list of people who are able to participate.

19 Ideas for a Stress Test (2) (Nurcan Ozturk)
- Estimate of data volume: a very rough estimate of the data volume is 100M-1B events. Assuming 100 kB/event (realistic considering no truth info and no trigger info), this sets an upper limit of 100 TB in total (split among 5 Tier2s). This is probably also the upper limit allowed by the current availability of USER/GROUP disk on the Tier2s (which is in addition to MC/DATA/PROD and CALIB disk).
- Estimate of computing capability: there are "plenty" of machines assigned to analysis, though the current load on the analysis queues is rather low. The computing nodes are usually shared between production and analysis and are typically configured with an upper limit and a priority. For example, MWT2 has 1200 cores and is set up to run analysis jobs with priority, up to a limit of 400 cores. If production jobs are not coming in, the number of running analysis jobs can exceed this limit.
- Site configuration: site configuration varies among the Tier2 sites. We will compile a table showing the configuration of each analysis queue: direct reading versus local copying, xrootd versus dCache, etc. We will compare the performance of the queues based on their configuration.
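The data volume estimate on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes decimal units (1 kB = 1000 bytes, 1 TB = 10^12 bytes) and an even split across the five Tier2s, neither of which is stated on the slide, and the function names are made up for this example.

```python
# Assumption: decimal (SI) units; the slide does not say which convention it uses.
BYTES_PER_KB = 1_000
BYTES_PER_TB = 1_000_000_000_000


def sample_volume_tb(n_events, kb_per_event=100):
    """Total sample size in TB for n_events at kb_per_event kB each."""
    return n_events * kb_per_event * BYTES_PER_KB / BYTES_PER_TB


def per_site_tb(total_tb, n_sites=5):
    """Share per site, assuming an even split (an assumption, not from the slide)."""
    return total_tb / n_sites


# Event counts and event size are the figures quoted on the slide.
low_tb = sample_volume_tb(100_000_000)     # 100M events
high_tb = sample_volume_tb(1_000_000_000)  # 1B events

print(f"total: {low_tb:.0f}-{high_tb:.0f} TB")
print(f"per Tier2: {per_site_tb(low_tb):.0f}-{per_site_tb(high_tb):.0f} TB")
```

Under these assumptions the 100 TB upper limit corresponds to the 1B-event end of the range, or about 20 TB per Tier2 if the sample is divided evenly.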

20 Four Types of Tier 3 Systems
- T3gs
  - T3 with Grid Services (details in next slides)
- T3g
  - T3 with Grid Connectivity (details in next slides)
- T3w
  - Tier 3 Workstation: unclustered workstations... OSG, DQ2 client, ROOT, etc.
- T3af
  - Tier 3 system built into lab or university analysis facility

23 Conclusions
- Monte Carlo simulation/reconstruction is working well worldwide with the PanDA submission system
- Data reprocessing with PanDA is working, but further tests of file staging from tape are needed
- Analysis model still evolving
  - In the U.S., big emphasis on getting T3s up and running
  - Analysis stress test coming in May-June
- Ready for collision data in late 2009

24 Backup

25 PanDA Operation (Torre Wenaus, BNL; T. Maeno)
[Diagram labels: data management; ATLAS production; analysis]

26 PanDA Production Dataflow/Workflow (Torre Wenaus, BNL)

27 Analysis with PanDA: pathena (Torre Wenaus, BNL; Tadashi Maeno)
Running the ATLAS software:
- Locally: athena
- PanDA: pathena --inDS --outDS
Outputs can be sent to an xrootd/PROOF farm, where they are directly accessible for PROOF analysis
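As a rough illustration of the pathena invocation named on this slide, the sketch below assembles such a command line in Python without running it. Only the `--inDS` and `--outDS` options come from the slide. The helper name, the dataset names, and the job options file name are hypothetical placeholders, and passing the job options file as a positional argument is an assumption about how the tool is normally called.

```python
import shlex


def build_pathena_command(job_options, in_ds, out_ds):
    """Build (but do not run) the argument list for a pathena submission."""
    fields = {"job_options": job_options, "in_ds": in_ds, "out_ds": out_ds}
    for name, value in fields.items():
        # Reject empty values and embedded whitespace before anything is submitted.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"{name} must be a non-empty string without whitespace")
    # Assumption: the job options file is given as a positional argument.
    return ["pathena", job_options, "--inDS", in_ds, "--outDS", out_ds]


# Hypothetical placeholder names, for illustration only.
cmd = build_pathena_command(
    "MyAnalysis_jobOptions.py",
    "hypothetical.input.dataset",
    "user.hypothetical.output.dataset",
)
print(shlex.join(cmd))
```

Building the command as a list, rather than as one string, keeps each dataset name a single argument if it is later handed to a process launcher.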

