
1 ATLAS Distributed Analysis
Dietrich Liko, IT/GD

2 Overview
- Some problems trying to analyze Rome data on the grid
  - Basics
  - Metadata
  - Data
- Activities
  - AMI
  - DIAL
  - Production system
  - GANGA
  - DIANE

3 Distributed Analysis
- Today some results on AOD analysis
  - Many small files at few places
- But there is also ...
  - ESD analysis: many larger files at many places
  - TAG data analysis: let's see ...
  - Detector analysis: will be a dominant activity when ATLAS starts up
    - Calibration database

4 What do we want to do for analysis?
- We want to run jobs where the data is
- We want to run a bunch of short jobs for relatively fast response
- We want to profit from Grid resources

5 Basics
- Three grids
  - No consistent UI installation
- Simple jobs to all grids (a minimal submission sketch follows below)
  - One can find examples and guides in various places ...
  - There should be a single set of examples and a high-level guide
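Since the slide asks for a high-level guide with examples, here is a minimal sketch of the "simple job" step on the LCG side: write a JDL description and hand it to edg-job-submit. The JDL attributes and the edg-job-submit command are the standard LCG ones of the time; the hello.jdl file name and the hostname.sh payload are hypothetical stand-ins.

```python
#!/usr/bin/env python
# Minimal LCG submission sketch: write a JDL file, then call edg-job-submit.
# The payload (hostname.sh) is a hypothetical hello-world stand-in.
import subprocess

JDL = """\
Executable    = "hostname.sh";
StdOutput     = "stdout.log";
StdError      = "stderr.log";
InputSandbox  = {"hostname.sh"};
OutputSandbox = {"stdout.log", "stderr.log"};
"""

def submit(jdl_path="hello.jdl"):
    # Write the job description ...
    with open(jdl_path, "w") as f:
        f.write(JDL)
    # ... and submit it through the LCG user interface.
    subprocess.call(["edg-job-submit", jdl_path])

if __name__ == "__main__":
    submit()
```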

6 Metadata
- Difficult to understand which data should be around
  - "rfdir on CASTOR"
  - AMI
  - Wiki
  - Prod sys DB
- Work has been done
  - It's less a technical issue
  - It's missing the final push

7 Data
- A lot of data around
  - Definitely distributed
  - Now many files at CERN and at BNL
- We would like to run jobs where the data is
  - We have to replicate it to more places
- Catalog consistency?
  - GUID mismatches
  - Has to be studied further ... (a consistency-check sketch follows below)
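To make the GUID-mismatch question concrete, here is a hedged sketch that compares the GUIDs recorded in a local POOL XML file catalog against a GUID dump from a replica catalog. The dump format (one "guid lfn" pair per line) and both file names are assumptions; the PoolFileCatalog.xml structure (File elements carrying an ID attribute) follows the POOL convention.

```python
# Hedged consistency check: GUIDs in a POOL XML file catalog vs. a
# replica-catalog dump.  The dump format ("<guid> <lfn>" per line) and the
# file names are illustrative assumptions, not a fixed ATLAS convention.
import xml.etree.ElementTree as ET

def pool_guids(catalog="PoolFileCatalog.xml"):
    # POOL catalogs hold one <File ID="<guid>"> element per file.
    tree = ET.parse(catalog)
    return set(f.get("ID") for f in tree.getroot().findall("File"))

def replica_guids(dump="replica_dump.txt"):
    with open(dump) as f:
        return set(line.split()[0] for line in f if line.strip())

def report_mismatches():
    local, remote = pool_guids(), replica_guids()
    for guid in sorted(local - remote):
        print("in POOL catalog only: %s" % guid)
    for guid in sorted(remote - local):
        print("in replica catalog only: %s" % guid)

if __name__ == "__main__":
    report_mismatches()
```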

8 Data cont.
- User data at random
  - Just do an rfdir on /castor/cern.ch/grid/atlas
  - We need at least some conventions ... (see the sketch below)
- In general
  - In my opinion, data management is the biggest problem for analysis
  - We have to follow data management closely
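To illustrate what "some conventions" could buy us, here is a sketch that lists a CASTOR directory with rfdir and flags entries that do not match a naming scheme. rfdir is the standard CASTOR listing command; the user.<name>.<dataset> convention itself is invented here purely for illustration.

```python
# List a CASTOR directory with rfdir and flag entries that do not follow a
# hypothetical "user.<name>.<dataset>" naming convention.
import re
import subprocess

PATTERN = re.compile(r"^user\.\w+\..+$")  # hypothetical convention

def list_castor(path="/castor/cern.ch/grid/atlas"):
    # rfdir prints ls -l style lines; the entry name is the last field.
    out = subprocess.run(["rfdir", path],
                         capture_output=True, text=True).stdout
    return [line.split()[-1] for line in out.splitlines() if line.split()]

def flag_unconventional(path="/castor/cern.ch/grid/atlas"):
    for name in list_castor(path):
        if not PATTERN.match(name):
            print("does not follow convention: %s" % name)

if __name__ == "__main__":
    flag_unconventional()
```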

9 Turnaround on the Grid
- The Grid has been set up to run production
  - Even short jobs can take some time to start
  - The theoretical minimum is maybe 5 minutes
  - Usually it is rather several hours (on LCG)
- Cannot at all be compared with the short queue on lxbatch
- We have to improve this situation
  - Better understanding of the workings of the grid
  - Ranking (see the sketch below)
  - Batch queues
  - Interaction with middleware/grid operations
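One concrete lever on turnaround is JDL ranking: prefer computing elements whose information system advertises a short estimated response time. A minimal sketch, in the same style as the submission example above; GlueCEStateEstimatedResponseTime is the standard Glue-schema attribute, while short.jdl and the payload are hypothetical.

```python
# Prefer computing elements that advertise a short estimated response time.
# The Rank expression uses the standard Glue attribute; higher rank wins,
# so we negate the estimated queueing time.
import subprocess

JDL = """\
Executable    = "short_job.sh";
StdOutput     = "stdout.log";
StdError      = "stderr.log";
InputSandbox  = {"short_job.sh"};
OutputSandbox = {"stdout.log", "stderr.log"};
// Higher rank is better, so negate the estimated queueing time.
Rank = -other.GlueCEStateEstimatedResponseTime;
"""

with open("short.jdl", "w") as f:
    f.write(JDL)
subprocess.call(["edg-job-submit", "short.jdl"])
```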

10 In the following
- I assume that the activities are known in general
- Some thoughts on the status
- How to proceed?

11 AMI
- It would be very useful to have lists of good data files, as some files are missing in datasets (a filtering sketch follows below)
  - Work has been done to synchronize with the production database
  - The information should be there, I think
- Questions:
  - Is there a communication problem?
  - Do we still want AMI to provide this information?
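To show what a "list of good data files" would enable, here is a purely illustrative sketch: intersect a dataset's file list with a good-file list before building a job's input. Both input files and their one-LFN-per-line format are assumptions; no real AMI or production-database interface is touched here.

```python
# Illustrative only: filter a dataset's file list against a "good files"
# list before building job input.  Both files and their one-LFN-per-line
# format are assumptions.

def read_lfns(path):
    with open(path) as f:
        return set(line.strip() for line in f if line.strip())

def good_input(dataset_list="dataset_files.txt", good_list="good_files.txt"):
    files, good = read_lfns(dataset_list), read_lfns(good_list)
    missing = files - good
    if missing:
        print("skipping %d files not flagged good" % len(missing))
    return sorted(files & good)

if __name__ == "__main__":
    for lfn in good_input():
        print(lfn)
```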

12 AMI – how to continue?
- We need something users like in a very short time
  - Could be a few PHP scripts
  - See Guido's pages ...
- Emphasis on the basic tasks
  - List of datasets
  - List of files in a dataset
- The long-term strategy is a question for the database team

13 DIAL – strong points
- True service architecture
  - Secure service, delegation, etc.
- Significant progress has been made in the last months
- Hard to beat the LSF-based DIAL service at BNL
  - O(100) PCs, fast queues
  - It's hard to imagine that a grid-based service could do better
  - Ideal place for user evaluations

14 DIAL – weak points
- Deployment on many grid sites
  - Today worker nodes need a full DIAL installation
- Access to log files
  - You need to know the person running the server
  - Similar to the production system
- Integration with other ATLAS activities
  - AMI, GANGA, Prodsys
  - Priorities were on delivering DIAL itself

15 Production System
- Run analysis jobs using Prodsys
  - Has just been presented
  - Data access is the crucial point
  - Important experience from CTB support
  - New Java GUI tools are definitely promising
- I am also happy with the progress of the DIAL-Prodsys interface
  - To some extent complementary activities

16 GANGA
- Combined project between LHCb and ATLAS
  - GAUDI & Grid Alliance
- A number of activities have been completed (a job sketch follows below)
  - Athena job handlers
  - AMI integration basics
  - DIAL integration
- In my opinion it has not been sufficiently exposed to the ATLAS community
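For readers who have not seen GANGA, here is a hedged sketch of its job-object style, as typed inside a ganga interactive session. Job, Executable and LCG are GANGA GPI classes; the exact attribute names are sketched from that style and should be checked against the actual release, and the /bin/hostname payload is a stand-in.

```python
# A hedged sketch of the GANGA job-object style, run inside a ganga session
# (Job, Executable and LCG are GPI classes predefined there; attribute names
# should be checked against the release).
j = Job()
j.application = Executable(exe="/bin/hostname")  # stand-in payload
j.backend = LCG()                                # submit to the LCG grid
j.submit()

# Later, inspect the job:
print(j.status)
```

The point of the design is that switching backend (local batch, LCG, DIAL) changes one attribute, not the job definition.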

17 GANGA Advantages
- Is backed by a quite large development team
- Very flexible architecture
- Connecting GANGA to DIAL is only one of the options
  - The main effort was in that direction and it should be delivered
- A strength of GANGA is the definition of a transformation (or task) using a GUI, with integration into the build environment
  - That is a feature that is currently missing elsewhere

18 GANGA Activities
- We have to understand how to continue within the GANGA project
  - Development of the GUI component
  - Requires some commitment from ATLAS
- For me the number one goal:
  - Run analysis jobs on one of the Rome datasets
  - Fast feedback to the ATLAS community
  - Choose the fastest approach

19 DIANE
- DIstributed ANalysis Environment
- Lightweight distributed framework for parallel scientific applications in the master-worker model (a generic sketch of the model follows below)
- The hope is that it could improve the reliability of job execution on grid sites
  - No problem if some workers do not start
  - No problem if some workers crash
  - No problem if a job lands on a slow node
  - On a batch system the gain in execution time was about 50%
- Interfaced to gLite/LCG
  - Still need to demonstrate large datasets
- Interest
  - More stable execution for analysis – could be useful for any project
  - Local clusters
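This is not DIANE's API, but a generic master-worker sketch in Python showing why the model tolerates slow or missing workers: tasks sit in a shared queue and each worker pulls the next one when free, so a worker that never starts simply never pulls work, and a slow node only reduces throughput instead of stalling a fixed slice of the job.

```python
# Generic master-worker sketch (not the DIANE API): the master fills a task
# queue, workers pull tasks as they become free, results come back on a
# second queue.  The AOD file names are stand-in input.
import multiprocessing

def worker(tasks, results):
    for task in iter(tasks.get, None):       # None is the shutdown signal
        results.put((task, "processed %s" % task))

if __name__ == "__main__":
    tasks, results = multiprocessing.Queue(), multiprocessing.Queue()
    pool = [multiprocessing.Process(target=worker, args=(tasks, results))
            for _ in range(4)]
    for p in pool:
        p.start()
    files = ["aod_%04d.pool.root" % i for i in range(100)]  # stand-in input
    for f in files:
        tasks.put(f)
    for _ in pool:
        tasks.put(None)                      # one shutdown signal per worker
    for _ in files:
        print(results.get())
    for p in pool:
        p.join()
```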

20 DIANE

21 DIANE on gLite running Athena

22 Running on the large dataset
- ~6000 AOD files
- 16 workers (8 on LSF, 8 on a gLite CE at CERN)
- Elapsed time > 15 hrs (a rough estimate follows below)
- Needs more investigation of the details:
  - Job scheduling
  - Data access
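As a rough back-of-the-envelope check: 6000 files over 16 workers is about 375 files per worker, so an elapsed time above 15 hours corresponds to roughly 15 h x 60 / 375 = 2.4 minutes per file. If the per-file processing itself is much faster than that, the gap points at scheduling and data-access overhead, consistent with the two items flagged above.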

23 Status Overview
- DIAL
  - Ready for users
  - Large-scale deployment on the grid is an open issue
- Production system
  - Runs analysis jobs on the grid
  - Not yet an end-user tool
- GANGA
  - We are waiting eagerly for the release of GANGA4
- DIANE
  - Runs analysis jobs on gLite/LCG
  - The large dataset has not yet been demonstrated

24 Do we have too many projects?
- Each project has some people with an interest in that particular project
  - Every project is strong in different aspects
  - Push for collaboration and discussion
  - Distributed Analysis meetings are the appropriate forum for such discussions
  - Increase the developer and user community
- Competition is good for the customer

25 Where are the customers?
- Let's be honest: not so many analysis users yet
  - Why should they move to the grid today?
  - We all know that they will have to in the future
- We need to push our projects to the level that non-experts can really use them
  - Example: CTB support
  - The ARDA team is prepared to provide (some) user support
- We have to establish a procedure for systematic and periodic user feedback
  - Run your jobs on the grid and you get a laptop for free!

