Published by Janis Arnold. Modified over 9 years ago.
1
ATLAS Distributed Analysis
Dietrich Liko, IT/GD
2
Overview
Some problems trying to analyze Rome data on the grid
  Basics
  Metadata
  Data
Activities
  AMI
  DIAL
  Production system
  GANGA
  DIANE
3
Distributed Analysis
Today: some results on AOD analysis
  Many small files at few places
But there is also …
  ESD analysis
    Many larger files at many places
  TAG data analysis
    Let's see …
  Detector analysis will be a dominant activity when ATLAS starts up
    Calibration database
4
What do we want to do for analysis?
We want to run jobs where the data is
We want to run a bunch of short jobs for relatively fast response
We want to profit from GRID resources
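The first two wishes on this slide (run jobs where the data is, and split the work into many short jobs) could be sketched roughly as follows. This is only an illustration under stated assumptions: the replica map, the file names, and the `files_per_job` parameter are hypothetical and do not come from any ATLAS catalog or tool.

```python
from collections import defaultdict


def split_by_site(replicas, files_per_job):
    """Group files by the site holding them, then cut each group into short jobs.

    replicas: dict mapping file name -> site name (hypothetical catalog result)
    files_per_job: maximum number of input files per job
    Returns a list of (site, [files]) tuples, one tuple per job.
    """
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")

    # Collect the files available at each site.
    by_site = defaultdict(list)
    for name, site in sorted(replicas.items()):
        by_site[site].append(name)

    # Cut each site's file list into chunks; each chunk is one short job
    # that would be sent to the site where its input files already are.
    jobs = []
    for site in sorted(by_site):
        files = by_site[site]
        for start in range(0, len(files), files_per_job):
            jobs.append((site, files[start:start + files_per_job]))
    return jobs


# Hypothetical replica map, for illustration only.
example_replicas = {
    "aod_001.root": "CERN",
    "aod_002.root": "CERN",
    "aod_003.root": "BNL",
    "aod_004.root": "CERN",
    "aod_005.root": "BNL",
}
example_jobs = split_by_site(example_replicas, files_per_job=2)
```

A smaller `files_per_job` gives shorter jobs and faster first results, at the price of more submissions, which matters when each submission carries a start-up delay.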
5
Basics
Three grids
  No consistent UI installation
Simple jobs to all grids
  One can find examples and guides at various places …
  There should be examples and a high-level guide
6
Metadata
Difficult to understand which data should be around
  “rfdir on CASTOR”
  AMI
  Wiki
  Prod sys DB
Work has been done
  It's less a technical issue
  It's missing the final push
7
Data
A lot of data around
  Definitely distributed
  Now many files at CERN and at BNL
We would like to run jobs where the data is
  We have to replicate it to more places
Catalog consistency?
  GUID mismatches
  Has to be studied further …
8
Data cont.
User data at random
  Just do an rfdir on /castor/cern.ch/grid/atlas
  We need at least some conventions …
In general
  In my opinion data management is the biggest problem for analysis
  We have to follow data management closely
9
Turnaround on the GRID
GRID has been set up to run production
Even short jobs can need some time to start
  Theoretical minimum is maybe 5 mins
  Usually it is rather several hours (on LCG)
  Cannot be compared at all with the short queue on lxbatch
We have to improve this situation
  Better understanding of the workings of the grid
    Ranking
    Batch queues
  Interaction with middleware/grid operations
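To see why start-up delay dominates for short jobs, a back-of-the-envelope calculation may help. The 5-minute minimum is from the slide; the 10-minute job length and the reading of "several hours" as 3 hours are assumptions made only for this example.

```python
def waiting_fraction(wait_minutes, run_minutes):
    """Fraction of the total turnaround time spent waiting for the job to start."""
    total = wait_minutes + run_minutes
    if total <= 0:
        raise ValueError("total turnaround must be positive")
    return wait_minutes / total


RUN_MINUTES = 10  # assumed length of a short analysis job

# 5-minute theoretical minimum start-up time, as quoted on the slide.
best_case = waiting_fraction(5, RUN_MINUTES)

# "Several hours" taken as 3 hours (an assumption, not a measured value).
typical = waiting_fraction(180, RUN_MINUTES)

print(f"best case: {best_case:.0%} of turnaround is waiting")
print(f"typical:   {typical:.0%} of turnaround is waiting")
```

Under these assumptions even the best case spends a third of the turnaround waiting, and the typical case spends almost all of it waiting.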
10
In the following
I assume that the activities are known in general
Some thoughts on the status
How to proceed?
11
AMI
It would be very useful to have lists of good data files, as some files are missing in datasets
Work has been done to synchronize with the Production database
  The information should be there – I think
Questions:
  Is there a communication problem?
  Do we still want AMI to provide this information?
12
AMI – how to continue?
We need something users like in a very short time
  Could be a few PHP scripts
  See Guido’s pages …
Emphasis on the basic tasks
  List of datasets
  List of files in a dataset
Long-term strategy is a question for the Database team
13
DIAL – strong points
True service architecture
  Secure service, delegation, etc.
  Significant progress has been made in the last months
Hard to beat LSF-based DIAL services at BNL
  O(100) PCs, fast queues
  It's hard to imagine that a grid-based service could do better
  Ideal place for user evaluations
14
DIAL – weak points
Deployment on many grid sites
  Today worker nodes need a full DIAL installation
Access to log files
  You need to know the person running the server
  Similar to production system
Integration with other ATLAS activities
  AMI, GANGA, Prodsys
  Priorities were on delivering DIAL
15
Production System
Run analysis jobs using Prodsys
  Has just been presented
Data access is the crucial point
Important experiences from CTB support
New Java GUI tools are definitely promising
I am also happy with the progress of the DIAL Prodsys interface
  To some extent complementary activities
16
GANGA
Combined project between LHCb & ATLAS
  GAUDI & Grid Alliance
A number of activities have been done
  Athena job handlers
  AMI Integration
    Basics
  DIAL Integration
In my opinion it has not been sufficiently exposed to the ATLAS community
17
GANGA Advantages
Is backed by a quite large development team
Very flexible architecture
  Connecting GANGA to DIAL is only one of the options
  Main effort was in that direction and it should be delivered
A strength of GANGA is the definition of a Transformation (or task) using a GUI with integration in the build environment
  That’s a feature that is currently missing
18
GANGA Activities
We have to understand how to continue within the GANGA project
  Development of the GUI component
  Requires some commitment of ATLAS
For me the number one goal
  Run analysis jobs on one of the Rome datasets
  Fast feedback to the ATLAS community
  Choose the fastest approach
19
DIANE
DIstributed ANalysis Environment
Lightweight distributed framework for parallel scientific applications in the master-worker model
Hope that it could improve reliability of job execution on grid sites
  No problem if some workers do not start
  No problem if some workers crash
  No problem if a job lands on a slow node
On a batch system the gain in execution time was about 50%
Interfaced to gLite/LCG
  Still need to demonstrate large datasets
Interest
  More stable execution for analysis – could be useful for any project
  Local clusters
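The reliability argument for a master-worker model can be illustrated with a small sequential simulation. This is a sketch of the general idea only, not DIANE's actual API: the function names, the worker names, and the crash behaviour are all hypothetical. It covers workers that never start (they are simply absent from the list) and workers that crash (their task is handed to someone else).

```python
from collections import deque


def run_master_worker(tasks, workers, process):
    """Hand out tasks one at a time and requeue the task of any worker that fails.

    tasks: iterable of task identifiers
    workers: names of the workers that actually started
    process: callable(worker, task) -> result; raises if the worker crashes
    Returns (results dict keyed by task, list of crashed workers).
    """
    pending = deque(tasks)
    alive = list(workers)
    crashed = []
    results = {}

    while pending:
        if not alive:
            raise RuntimeError("no workers left to finish the remaining tasks")
        # Simple round-robin; in a real system idle workers would ask for work.
        worker = alive.pop(0)
        task = pending.popleft()
        try:
            results[task] = process(worker, task)
        except Exception:
            # The worker is considered lost: requeue its task, stop using it.
            pending.append(task)
            crashed.append(worker)
            continue
        alive.append(worker)
    return results, crashed


def flaky_process(worker, task):
    """Hypothetical workload: worker 'w2' crashes, the others square the task id."""
    if worker == "w2":
        raise RuntimeError("worker crashed")
    return task * task


results, crashed = run_master_worker(range(6), ["w1", "w2", "w3"], flaky_process)
```

All six tasks complete even though one worker is lost, because no task is bound to a particular worker in advance. In a truly parallel pull-based setup the same property also limits the damage from a slow node, which just ends up taking fewer tasks.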
20
DIANE
21
DIANE on gLite running Athena
22
Running on the large dataset
~6000 AOD files
16 workers (8 on LSF, 8 on gLite CE at CERN)
The elapsed time was > 15 hrs
Need more investigation of the details of
  job scheduling
  data access
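The figures on this slide allow a rough throughput estimate. The sketch below assumes that the files were spread evenly over the 16 workers and takes the elapsed time as exactly 15 hours; since the slide says "> 15 hrs" and "~6000", the results are only approximate bounds.

```python
N_FILES = 6000        # "~6000 AOD files" from the slide
N_WORKERS = 16        # 8 on LSF + 8 on gLite CE
ELAPSED_HOURS = 15    # slide says "> 15 hrs", so this is a lower bound

# Assumption: files are distributed evenly across the workers.
files_per_worker = N_FILES / N_WORKERS

# Wall-clock seconds each worker spent per file, including any scheduling
# and data-access overhead (at least this much, since elapsed time is > 15 h).
seconds_per_file = ELAPSED_HOURS * 3600 / files_per_worker

# Aggregate throughput over all workers (at most this much).
files_per_hour = N_FILES / ELAPSED_HOURS

print(f"{files_per_worker:.0f} files per worker")
print(f"at least {seconds_per_file:.0f} s per file per worker")
print(f"at most {files_per_hour:.0f} files per hour overall")
```

Whether that per-file time is dominated by Athena itself, by job scheduling, or by data access is exactly what the slide says still needs investigation.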
23
Status Overview
DIAL
  Ready for users
  Large-scale deployment on the grid is an open issue
Production system
  Runs analysis jobs on the grid
  Not yet an end-user tool
GANGA
  We are waiting eagerly for the release of GANGA4
DIANE
  Runs analysis jobs on gLite/LCG
  Running on a large dataset has not yet been demonstrated
24
Do we have too many projects?
Each project has some persons with interest in that particular project
Every project is strong in different aspects
Push for collaboration and discussion
  Distributed Analysis meetings are the appropriate forum for such discussions
Increase the developer and user community
Competition is good for the customer
25
Where are the customers?
Let's be honest: not so many analysis users yet
  Why should they move to the grid today?
  We all know that they will have to in the future
We need to push our projects to the level that non-experts can really use them
  Example: CTB support
ARDA team is prepared to provide (some) user support
We have to establish a procedure for systematic and periodic user feedback
Run your jobs on the grid and you get a laptop for free!