1 The German HEP-Grid initiative
HEP CG WP3: Distributed Data Analysis Tools for ALICE
K.Schwarz@gsi.de, April 27, 2006, HEPCG Workshop 2006, GSI

2 HEP CG WP3: Distributed interactive data analysis
Coordination: P. Malzacher, GSI (LMU, GSI; unfunded: LRZ, MPI M, RZ Garching, Uni Karlsruhe, MPI Heidelberg)
Optimize application-specific job scheduling: analyse and test the required software environment; job management and bookkeeping of distributed analysis; distribution of the analysis and merging of the results.
Interactive analysis: creation of dedicated analysis clusters; dynamic partitioning of Grid analysis clusters.

3 Motivation: ALICE Computing
The main aim of this study is to investigate whether we can run "ALICE-like" interactive analysis in the Grid environment without requiring AliEn everywhere. As a basis we took the gLite middleware, since it is going to be the common middleware for the LHC experiments in the near future (within the EGEE project). At the same time we try to prove that the same can be done with other types of Grid middleware, for instance Globus 4.

4 SC4 and PDC06
April/May: generate 1 M Pb-Pb and 100 M p-p events on 7500 kSI2k of CPU; 20% at CERN, 29% at T1s (11% GridKa), 51% at T2s. Network/reconstruction stress test.
July: repeat the network/reconstruction stress test.
September: scheduled analysis test, user analysis.

5 ALICE: A simple environment
[Diagram: the AliRoot framework built on ROOT — STEER with AliSimulation, AliReconstruction and AliAnalysis; the Virtual MC with G3, G4 and FLUKA; event generators (HIJING, MEVSIM, PYTHIA6, PDF, EVGEN, HBTP, HBTAN, ISAJET); detector modules (EMCAL, ZDC, ITS, PHOS, TRD, TOF, RICH, PMD, CRT, FMD, MUON, TPC, START, STRUCT); ESD output. AliEn connects AliRoot & Co to the Grid infrastructures WLCG, OSG and NDGF.]

6 ALICE Analysis concepts
Analysis models: prompt analysis at T0 using the PROOF (+ file catalogue) infrastructure; batch analysis using the Grid infrastructure; interactive analysis using the PROOF (+ Grid) infrastructure.
PROOF/ROOT: single-/multi-tier static and dynamic PROOF clusters; Grid API class TGrid (virtual) ---> TAliEn (implementation).
User interface: ALICE users access any Grid infrastructure via the AliEn or ROOT/PROOF UI.
AliEn: native and "Grid on a Grid" (LCG/EGEE, ARC, OSG); integrate as many common components as possible (LFC, FTS, WMS, MonALISA, ...).

7 Tasks
GAP analysis
Application-specific job scheduling
Interactive analysis
We decided to concentrate first on "Interactive Analysis".

8 Start with Gap Analysis
Analysis based on PROOF: investigating different versions of PROOF clusters.
Connect ROOT and gLite: TGLite, a ROOT interface for gLite, is under development (see poster session, K. Schwarz).

class TGrid : public TObject {
public:
   virtual TGridResult *Query( /* ... */ );
   static TGrid *Connect(const char *grid, const char *uid = 0,
                         const char *pw = 0 /* ... */);
   ClassDef(TGrid,0)
};

9 Parallel Analysis of Event Data
[Diagram: a local PC runs a ROOT session and connects to a remote PROOF cluster. One proofd acts as the master server; the other nodes (node1 ... node4, listed as slaves in #proof.conf) act as slave servers, each running ana.C over its local *.root files (TFile/TNetFile) and returning output objects to the master.]
Command sequence on the local PC:
$ root
root [0] tree.Process("ana.C")
root [1] gROOT->Proof("remote")
root [2] dset->Process("ana.C")
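In the diagram, ana.C stands for the user's analysis selector that each PROOF slave runs over its share of the events. As a purely illustrative reference, a minimal sketch of such a selector is shown here; the class and histogram names are hypothetical and not taken from the slides, and a real ALICE task would read ESD objects instead:

// ana.C -- hypothetical minimal selector skeleton for tree.Process()/dset->Process()
#include "TSelector.h"
#include "TTree.h"
#include "TH1F.h"

class AnaSelector : public TSelector {
public:
   TTree *fChain;   // tree or chain currently being processed
   TH1F  *fHist;    // example output histogram

   AnaSelector() : fChain(0), fHist(0) {}

   void Init(TTree *tree) { fChain = tree; }

   void SlaveBegin(TTree *) {
      // created on each slave; objects in fOutput are merged by the PROOF master
      fHist = new TH1F("hExample", "example distribution", 100, 0., 10.);
      fOutput->Add(fHist);
   }

   Bool_t Process(Long64_t entry) {
      fChain->GetEntry(entry);   // read one event
      // ... fill fHist from the event data here ...
      return kTRUE;
   }

   void Terminate() {
      // runs back on the client: draw or save the merged histogram
   }

   ClassDef(AnaSelector, 0)
};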

10 Various possibilities to set up a PROOF Cluster
Static: proofd started via xinetd on a dedicated PROOF cluster (e.g. the ALICE PROOF analysis cluster at CERN); a configuration sketch follows below.
Dynamic: proofd integrated into the local batch farm (GSI).
Dynamic 2: proofd sent out as Grid jobs to the sites where the data to be analysed can be found (query of the file catalogue); realized for PROOF/AliEn, soon to be for PROOF/gLite. Necessary for this: ROOT/Grid communication via the TGridXXXX interfaces provided by ROOT.
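For the static case, the following is a rough sketch of what a proofd entry under xinetd could look like; the installation path /usr/local/root, the server arguments and the port are assumptions for a default ROOT installation of that time and have to be adapted locally:

# /etc/xinetd.d/proofd -- hypothetical example, paths and flags are assumptions
service proofd
{
   disable     = no
   socket_type = stream
   wait        = no
   user        = root
   server      = /usr/local/root/bin/proofd
   server_args = -i /usr/local/root
   port        = 1093
}

After reloading xinetd, clients can reach the cluster with gROOT->Proof("master-host") as on the previous slide.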

11 Gap Analysis
To achieve our aim we need to:
get access to a gLite testbed or install it ourselves;
investigate the available gLite APIs and tools;
implement the TGridXXXX ROOT interfaces for gLite;
try to reproduce "ALICE-like" analysis on this new basis;
learn and understand the tools;
investigate possible gaps in TGrid and in gLite.

12 Project timeline
For testing purposes, a complete gLite 1.5 testbed including the central services has been installed on virtual Xen hosts at GSI. A first alpha release of the TGLiteXXXX implementation is expected at the end of March 2006.

13 Set of ROOT interfaces to the Grid
TGrid: abstract base class defining the interface to common Grid services.
TGridResult: abstract base class defining the interface to a Grid result; objects of this class are created by TGrid methods.
TGridJob: pure abstract base class defining the interface to a Grid job.
TGridJDL: pure abstract class used to generate JDL files for job submission to the Grid.
TGridJobStatus: pure abstract base class containing the status of a Grid job.
TGridCollection: class which manages collections of files on the Grid.
Concrete implementations: TAlienXXXX, TGLiteXXXX ("realized").
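To make the relation between the abstract interface and the plug-in classes concrete, here is a rough sketch of how a gLite plug-in could specialize TGrid; the class body and the Query signature are simplified illustrations, not the actual TGLite code:

// Sketch: a middleware plug-in specializing the abstract ROOT Grid interface
#include "TGrid.h"
#include "TGridResult.h"

class TGLite : public TGrid {
public:
   TGLite(const char *gridurl, const char *uid = 0, const char *pw = 0) {
      // open a session with the gLite services (catalogue, WMS, ...) here
   }

   virtual TGridResult *Query(const char *path, const char *pattern,
                              const char *conditions = "",
                              const char *options = "") {
      // translate the query into a gLite file-catalogue lookup and
      // wrap the matching files into a TGridResult
      return 0;  // placeholder in this sketch
   }

   ClassDef(TGLite, 0)
};

At run time TGrid::Connect("glite://...") would hand back such an object, so user code written against TGrid works unchanged for AliEn and gLite.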

14 Grid Access using ROOT facility
[Diagram: the abstract base class TGrid is the user interface to Grid services and files; Grid plug-in classes implement it per middleware — TAlien on top of the AliEn API service (TGrid::Connect("alien://...")), TGLite on top of the gLite C++ API (TGrid::Connect("glite://...")), and possibly TGlobus4 on top of the Globus 4 API (TGrid::Connect("globus4://...")) as a future development.]
TGrid example with AliEn/gLite:
// Connect
TGrid *grid = TGrid::Connect("alien://");   // or TGrid::Connect("glite://...")
// Query
TGridResult *res = grid->Query("/home/test_user/analysis/", "*.root");
// List of files
TList *listf = res->GetFileInfoList();
// Create chain
TChain chain("Events", "session");
chain.AddFileInfoList(listf);
// Start PROOF
TProof proof("remote");
// Process your query
chain.Process("selector.C");

15 Application specific job scheduling
Still at an early stage.
Need to gain more experience with applications first.
More interactive applications in the LHC environment need to be in a running state.
Applications need to be able to obtain monitoring information.

16 Application specific scheduling: basic ideas
Monitor resources (CPU, main memory, I/O and network traffic) on a per event type and analysis type basis.
Use this information for forecasting in scheduling decisions (a small sketch of the idea follows below).
The application needs to be able to access Grid monitoring information on the fly.
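As a purely illustrative sketch of this forecasting idea, the structure, function and inputs below are hypothetical and only show how monitored per-event figures could enter a scheduling decision (in practice the numbers would come from a Grid monitoring system such as MonALISA):

// Sketch: estimate the wall-clock time of an analysis job from per-event
// resource figures monitored for a given (event type, analysis type) pair.
struct ResourceProfile {
   double cpuPerEvent;   // seconds of CPU per event (monitored)
   double mbPerEvent;    // MB read per event (monitored)
};

double EstimateJobTime(const ResourceProfile &p, long nEvents,
                       double cpuPower, double ioBandwidth)
{
   // cpuPower: relative speed of the target worker; ioBandwidth: MB/s to storage
   double cpuTime = p.cpuPerEvent * nEvents / cpuPower;
   double ioTime  = p.mbPerEvent  * nEvents / ioBandwidth;
   // whichever resource is slower dominates; a scheduler could rank
   // candidate sites or workers by this estimate
   return (cpuTime > ioTime) ? cpuTime : ioTime;
}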

17 Application specific scheduling Potential synergy effects
WP1: scheduling -> HEP-CG Scheduling Architecture (Lars Schley)
WP2: monitoring -> talks of Ralph Müller-Pfefferkorn; Job Execution Monitor (Markus Mechtel); Online steering of HEP applications (Daniel Lorenz)

18 Status and Outlook
GAP analysis: advanced
Interactive analysis: advanced
Application-specific job scheduling: started
Application-specific job scheduling will come after interactive analysis, due to the lack of a standard monitoring interface in the middleware.

