Data Analysis with PROOF, PQ2, Condor
Neng Xu, Wen Guan, Sau Lan Wu
University of Wisconsin-Madison
ATLAS Tier 3 Meeting at ANL, 30 October 2009

Outline
Dataset management with the PQ2 tools.
Data analysis at Wisconsin.
Running PROOF and Condor at BNL.

Introduction to the PQ2 tools
It is a local data management system built on the PROOF dataset management functionality.
Users deal only with datasets instead of file lists.
File information is stored locally on the PROOF master; no database is needed.
It can be used for both PROOF and Condor data analysis job submission.
The files are pre-located and pre-validated, which saves a lot of overhead for PROOF sessions, especially for large datasets. (One of our users was running a PROOF session in which file validation alone took almost 15 minutes.)

How to set up the PQ2 tools
PQ2 is now part of ROOT (versions newer than 5.24); it lives in $ROOTSYS/etc/proof/util/pq2.
To set up the environment:
1. Set up the ROOT environment.
2. export PATH=$ROOTSYS/etc/proof/util/pq2:$PATH
3. export …
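As a minimal sketch, the setup might look like this in a bash shell; the ROOT setup script path is a placeholder for whatever your site uses, and the contents of the final export depend on your PROOF master, so treat it as an assumption rather than a recipe.

# 1. Set up the ROOT environment (path is a placeholder; use your site's ROOT setup script)
source /path/to/root/bin/thisroot.sh
# 2. Put the pq2 scripts on the PATH
export PATH=$ROOTSYS/etc/proof/util/pq2:$PATH
# 3. Export any additional PQ2-related variables your site requires,
#    for example where the pq2 scripts should find the PROOF master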

Basic Commands
For the system admin:
◦ pq2-put (register a dataset)
◦ pq2-verify (check file locations and pre-validate)
◦ pq2-info-server (list information for all storage nodes)
◦ pq2-ls-files-server (list all the files on one of the storage nodes)
◦ pq2-rm (remove datasets)
For users:
◦ pq2-ls (list the datasets)
◦ pq2-ls-files (list the file information of a dataset)
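The slides do not show the exact option syntax, so the bash walk-through below is only indicative: the dataset name and file list are hypothetical, and the precise arguments should be checked against the pq2 scripts themselves.

# Admin side: register a dataset from a prepared file list, then pre-validate it
pq2-put my_dataset_files.txt              # hypothetical text file describing the dataset
pq2-verify /default/nengxu/my_dataset     # locate and pre-validate the files
pq2-info-server                           # overview of all storage nodes

# User side: browse what is available
pq2-ls                                    # list all registered datasets
pq2-ls-files /default/nengxu/my_dataset   # per-file location and validation info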

Example of pq2-ls: screenshot of the command output, annotated with the datasets' information.

Example of pq2-ls-files: screenshot of the command output, annotated with the file location and validation information.

How do users use the PQ2 tools?
PROOF users just give the name of the dataset, like this:
p->Process("/default/nengxu/mc PythiaH190zz4l.merge.NTUP.e384_s462_r635_t53_tid056686", "CollectionTree.C+", options);
Condor-dagman users submit their analysis jobs like this:
python submit.py --initdir=/home/wguan/dag_pq2/test3 --parentScript=/home/wguan/dag_pq2/dump.sh --childScript=/home/wguan/dag_pq2/merge.sh --dataset=/default/nengxu/mc PythiaQCDbb2l.fullsim.v --transferInput=/home/wguan/dag_pq2/input.tgz --jobsPerMachine=2

Advantages of PQ2
Easy to maintain: no database is needed, and adding or removing datasets is very simple.
Fast access to the file information.
Avoids some of the overhead of PROOF jobs.
Easy to load-balance the PROOF master.
Works not only with Xrootd but also with NFS, dCache, CASTOR, or even local files.
…

Our Tier 3 Facility at Wisconsin
We use BeStMan SRM + Xrootd for data transfer and storage.
Hardware: 50 nodes, each with 8 cores, 16 GB of RAM, and 8 x 750 GB disks.
Network: 10 Gb uplink and 1 Gb to each node.
We use PATHENA, PROOF, and Condor-dagman for data analysis.
We run Xrootd and PROOF separately, but on the same storage pool.
We use the PQ2 tools for local file management; recently PQ2 has needed to work with XrootdFS to get the file lists.
All of our users work from CERN.

Using the PQ2 tools with DQ2/LFC at Wisconsin
Since we run the DQ2 site services there, we wrote a script that reads all the file information out of DQ2/LFC and converts the SRM URL of each PFN to an xrootd URL.
For each dataset we create a simple text file.
The pq2-put command takes those text files as input and registers them with PROOF.
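A rough, hypothetical sketch of such a conversion in bash: the dataset name, site token, storage host, and xrootd redirector are all placeholders, and the dq2-ls options that print SURLs differ between DQ2 client versions, so the invocation below is an assumption, not the script used at Wisconsin.

# Placeholder dataset name and site; check your dq2 client for the options that print SURLs
DS=user.nengxu.mydataset.NTUP
dq2-ls -f -p -L WISC "$DS" |
  grep -o 'srm://[^[:space:]]*' |
  sed 's|^srm://[^/]*|root://xrootd.hep.wisc.example:1094/|' > "$DS".txt
# Note: the sed mapping is site-specific; real SURLs (e.g. with ?SFN=) may need extra handling
pq2-put "$DS".txt      # register the converted file list with the PROOF master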

Analysis with Condor-dagman
The submit script talks to the PQ2 server to get the file locations and defines the Condor-dagman JDLs.
It checks the machine status before submitting the jobs and limits the number of jobs sent to each node.
The output files are collected and merged in multiple levels.
An example script is here: …
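As an illustration of the multi-level merging idea, the generated DAG might look roughly like the sketch below; the submit-file names dump.sub and merge.sub are hypothetical (in practice each job would get its own arguments via VARS or separate submit files).

# Four analysis jobs feed two intermediate merge jobs, which feed one final merge
cat > analysis.dag <<'EOF'
JOB ana0   dump.sub
JOB ana1   dump.sub
JOB ana2   dump.sub
JOB ana3   dump.sub
JOB merge0 merge.sub
JOB merge1 merge.sub
JOB final  merge.sub
PARENT ana0 ana1     CHILD merge0
PARENT ana2 ana3     CHILD merge1
PARENT merge0 merge1 CHILD final
EOF
condor_submit_dag analysis.dag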

Our Tier 3 experience at BNL
We are going to have some CPUs and storage at BNL; how to use them is our new project.
We are working on two ways to run analysis jobs at BNL:
◦ Using Condor to start a PROOF pool.
◦ Using Condor-dagman to run analysis jobs.
All the input files are in dCache. The PROOF jobs read via the Xrootd door, and the Condor-dagman jobs read directly from dCache.
We registered the datasets we need with PQ2.
For PROOF jobs, we use an SSH tunnel from CERN to BNL without logging in to BNL machines, which is very convenient for users.
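A minimal sketch of the SSH-tunnel approach: the hostnames, account, and port are placeholders (the standard xproofd port 1093 is assumed), so this shows the idea rather than the actual BNL configuration.

# Forward a local port to the PROOF master at BNL through an SSH gateway,
# then point the local ROOT session at localhost instead of the BNL host
ssh -f -N -L 1093:proofmaster.bnl.example:1093 user@gateway.bnl.example
root -l
# root [0] TProof *p = TProof::Open("localhost:1093");
# root [1] p->Process("/default/nengxu/my_dataset", "CollectionTree.C+");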

Using the PQ2 tools at BNL
We run dq2 commands to get the dataset information and the file SURLs.
We simply create a text file with the xrootd URLs converted from the SURLs.
pq2-put takes those text files and registers the files with the PROOF redirector.
Users can access the file information via the SSH tunnel between CERN and BNL.

Some performance problems at BNL
Since PROOF reads the dCache files via the Xrootd door, performance is limited by the door: the maximum speed is 8 MB/s. The PROOF developers are working on direct access to dCache files.
We also do not want to overload the dCache system at BNL, because PROOF is very aggressive on I/O; there should be a way to limit the maximum network traffic.

Summary
PQ2 works quite well for our local data management.
If your Tier 3 wants to put machines at BNL or SLAC, the way we set up PROOF over Condor could be a good example. It also works with other batch systems such as LSF and PBS.
Condor-dagman is good for users who do not want to modify their code.
Multi-level merging can reduce the load on the submit nodes.

Thanks to:
ROOT/PROOF team.
Condor team.
DDM team at BNL.
SLAC Xrootd team.