Slide 1: Report on CHEP 2007
Raja Nandakumar

Slide 2: Synopsis
- Two classes of talks and posters
  - Computer hardware
    - Dominated by cooling / power consumption
    - Mostly in the plenary sessions
  - Software
    - Grid job workload management systems
      - Job submission by the experiments
      - Site job handling and monitoring
      - Grid operations (Monte Carlo production, glexec, interoperability, ...)
      - Data integrity checking
      - ...
    - Storage systems
      - Primarily concerning dCache and DPM
      - Distributed storage systems
    - Parallel session: Grid middleware and tools

Slide 3: Computing hardware
- Power requirements of LHC computing
  - Important for running costs
    - ~330 W must be provisioned for every 100 W of electronics
  - Some sites run with air- or water-cooled racks

  Electronics                100 W
  Server fans                 13 W
  Voltage regulation          22 W
  Case power supply           48 W
  Room power distribution      4 W
  UPS                         18 W
  Room cooling               125 W
  Total                      330 W
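As a quick cross-check of the figures above, a minimal Python sketch that sums the slide's own breakdown and derives the provisioning overhead factor:

```python
# Minimal sketch: power that must be provisioned per 100 W of computing
# electronics, using the breakdown quoted on the slide (all figures in watts).
overheads = {
    "Electronics": 100,
    "Server fans": 13,
    "Voltage regulation": 22,
    "Case power supply": 48,
    "Room power distribution": 4,
    "UPS": 18,
    "Room cooling": 125,
}

total = sum(overheads.values())             # ~330 W in total
factor = total / overheads["Electronics"]   # ~3.3 W provisioned per W of electronics
print(f"Total provisioned: {total} W (overhead factor {factor:.1f}x)")
```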

Slide 4: High performance and multi-core computing
- Core frequencies: ~2-4 GHz, will not change significantly
- Power
  - 1,000,000 cores at 25 W/core = 25 MW
    - Just for the CPUs
  - Core power has to come down by multiple orders of magnitude
    - Reduces chip frequency, complexity and capability
- Memory bandwidth
  - As cores are added to a chip, it becomes increasingly difficult to provide sufficient memory bandwidth
  - Application tuning to manage memory bandwidth becomes critical
- Network and I/O bandwidth, data integrity, reliability
  - A petascale computer will have petabytes of memory
  - Current single file servers achieve 2-4 GB/s
    - 70+ hours to checkpoint 1 petabyte (see the sketch below)
  - I/O management is a major challenge
- Memory cost
  - Cannot expect to maintain current memory-per-core numbers at petascale
    - 2 GB/core for ATLAS / CMS
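A minimal sketch of the arithmetic behind the 25 MW and 70+ hour figures; the core count, per-core power and server throughput are the slide's own numbers, with the optimistic 4 GB/s end of the quoted range used for the checkpoint estimate:

```python
# Aggregate CPU power for a million cores, and the time to checkpoint a
# petabyte of memory through a single file server.
cores = 1_000_000
watts_per_core = 25
print(f"CPU power alone: {cores * watts_per_core / 1e6:.0f} MW")   # 25 MW

memory_bytes = 1e15      # 1 petabyte of memory to checkpoint
server_rate = 4e9        # optimistic end of the 2-4 GB/s range, in bytes/s
hours = memory_bytes / server_rate / 3600
print(f"Checkpoint time at 4 GB/s: {hours:.0f} hours")              # ~70 hours
```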

Slide 5: Grid job submission
- Most new developments were on pilot-agent-based grid systems
  - Implement job scheduling based on the "pull" scheduling paradigm
  - The only method for grid job submission in LHCb
    - DIRAC (> 3 years of experience)
    - Ganga is the user analysis front end
  - Also used in ALICE (and Panda and MAGIC)
    - AliEn since 2001
  - Used for production, user analysis and data management in LHCb and ALICE
  - New developments for others
    - Panda: ATLAS, CHARMM
      - Central server based on Apache
    - GlideIn: ATLAS, CMS, CDF
      - Based on Condor
    - Used for production and analysis
  - Very successful implementations
    - Real-time view of the local environment
    - Pilot agents can have some intelligence built into the system
      - Useful for heterogeneous computing environments
    - Panda has recently been chosen for all ATLAS production
- One talk on distributed batch systems

Slide 6: Pilot agents
- Pilot agents are submitted on demand
  - Reserve the resource for immediate use
    - Allows checking of the environment before job scheduling
    - Only bidirectional network traffic
    - Unidirectional connectivity
  - Terminate gracefully if no work is available
  - Also called GlideIns
- LCG jobs are essentially pilot jobs for the experiment
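To make the pull model concrete, here is a minimal, hypothetical sketch of a pilot agent's main loop in Python. The task-queue URL, endpoint behaviour and JSON fields are illustrative assumptions, not the API of DIRAC, AliEn, Panda or GlideIn:

```python
import json
import shutil
import subprocess
import sys
import urllib.request

# Hypothetical task-queue endpoint; real pilot systems each have their own
# protocol for matching pilots to waiting jobs.
TASK_QUEUE = "https://example.org/taskqueue/match"

def environment_ok() -> bool:
    """Check the worker node before asking for work (disk space, software, ...)."""
    # Placeholder check: require at least 1 GiB of free scratch space.
    return shutil.disk_usage("/tmp").free > 1 * 1024**3

def request_job():
    """Ask the central queue for a job matching this resource; None if nothing fits."""
    with urllib.request.urlopen(TASK_QUEUE) as resp:   # assumed to return JSON
        payload = json.load(resp)
    return payload.get("job")        # e.g. {"command": [...]} or None

def main() -> int:
    # 1. The pilot has already reserved the batch slot; validate it first.
    if not environment_ok():
        return 1                     # exit without pulling in a user job
    # 2. Pull a job; terminate gracefully if no work is available.
    job = request_job()
    if job is None:
        return 0
    # 3. Run the real payload in the slot the pilot reserved.
    return subprocess.call(job["command"])

if __name__ == "__main__":
    sys.exit(main())
```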

Slide 7: DIRAC WMS

Slide 8: Panda WMS

Slide 9: ALICE (AliEn / MonALISA): history plot of running jobs

Slide 10: LHCb (DIRAC): snapshot of maximum running jobs

Slide 11: glexec
- A thin layer that changes Unix domain credentials based on grid identity and attribute information
- Different modes of operation
  - With or without setuid
    - Ability to change the user id of the final job
- Enables the VO to
  - Internally manage job scheduling and prioritisation
  - Late-bind user jobs to pilots
- In production at Fermilab
  - Code ready and tested, awaiting full audit
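As a rough illustration of what the setuid mode achieves, the sketch below switches the Unix identity of a payload before executing it. This is a conceptual sketch only, not glexec's actual interface: glexec is a setuid C executable driven by its own configuration and authorisation plugins, and the mapping function and account name here are invented:

```python
import os
import pwd

def local_account_for(grid_dn: str) -> str:
    """Hypothetical mapping from a grid identity to a pooled local account."""
    return "pool001"   # a real service derives this from the grid DN / VOMS attributes

def run_as_mapped_user(grid_dn: str, argv: list) -> None:
    """Run argv under the local account mapped from the pilot user's grid identity."""
    account = pwd.getpwnam(local_account_for(grid_dn))
    pid = os.fork()
    if pid == 0:
        # Child: drop to the mapped user's gid/uid, then exec the payload.
        # Switching uid requires privilege, which is why glexec itself is setuid.
        os.setgid(account.pw_gid)
        os.setuid(account.pw_uid)
        os.execvp(argv[0], argv)
    os.waitpid(pid, 0)
```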

Slide 12: LSF Universus
(Architecture diagram: a web portal / job scheduler dispatching through an LSF MultiCluster scheduler to clusters and desktops running LSF, PBS, SGE and CCE, each behind a local LSF scheduler.)

Slide 13: LSF Universus
- Commercial extension of LSF
  - Interface to multiple clusters
  - Centralised scheduler, but sites retain local control
  - LSF daemons installed on the head nodes of remote clusters
  - Kerberos for user, host and service authentication
  - scp for file transfer
- Currently deployed in
  - Sandia National Laboratories, to link OpenPBS, PBS Pro and LSF clusters
  - The Singapore national grid, to link PBS Pro, LSF and N1GE clusters
  - The Distributed European Infrastructure for Supercomputing Applications (DEISA)

Slide 14: Grid interoperability
- Many different grids
  - WLCG, NorduGrid, TeraGrid, ...
  - Experiments span the various grids
- Short-term solutions have to be ad hoc
  - Parallel infrastructures maintained by the user, the site or both
- For the medium term, set up adaptors and translators
- In the long term, adopt common standards and interfaces
  - Important for security, information, CE, SE
  - Most grids use the X.509 standard
  - Multiple "common" standards ...
  - The GIN (Grid Interoperability Now) group is working on some of this

                             OSG          EGEE         ARC
  Security                   GSI/VOMS     GSI/VOMS
  Storage control protocol   SRM
  Storage transfer protocol  GridFTP      GridFTP
  Schema                     GLUE v1      GLUE v1.2    ARC schema
  Service discovery          LDAP/GIIS    LDAP/BDII    LDAP/GIIS
  Job submission             GRAM         GRAM         GridFTP

Slide 15: Distributed storage
- GridPP is organised into 4 regional Tier-2s in the UK
- Currently a job follows data to a site
  - Consider disk at one site as close to CPU at another site
    - E.g. disk at Edinburgh vs CPU at Glasgow
  - Pool resources for efficiency and ease of use
  - Jobs need to access storage directly from the worker node

Slide 16
- RTT between Glasgow and Edinburgh ~12 ms
- Custom rfio client
  - Normal: one call per read
  - Readbuf: fills an internal buffer to service the request
  - Readahead: reads until EOF
  - Streaming: separate streams for control and data
- Tests used a single DPM server
- ATLAS expects ~10 MiB/s per job
- Better performance with a dedicated light path
- Ultimately a single DPM instance could span the Glasgow and Edinburgh sites
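A minimal sketch of why the access mode matters over a WAN: with one round trip per read call, latency dominates the transfer, while a read-ahead client amortises it. The RTT, read size, file size and bandwidth below are illustrative assumptions, not measurements from the talk:

```python
# Illustrative model: time to read a remote file when every read() costs one
# network round trip, versus a read-ahead client that streams at link speed.
rtt = 0.012               # assumed round-trip time in seconds (~12 ms)
read_size = 64 * 1024     # assumed application read size: 64 KiB per call
file_size = 1 * 1024**3   # 1 GiB file
bandwidth = 100e6 / 8     # assumed 100 Mbit/s effective throughput, in bytes/s

reads = file_size / read_size
naive = reads * rtt + file_size / bandwidth   # one round trip per read call
readahead = rtt + file_size / bandwidth       # latency paid roughly once

print(f"one call per read : {naive / 60:5.1f} min")
print(f"read-ahead        : {readahead / 60:5.1f} min")
print(f"ATLAS target ~10 MiB/s needs {file_size / (10 * 1024**2):.0f} s for 1 GiB")
```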

Slide 17: Data integrity
- Large number of components performing data management in an experiment
- Two approaches to checking data integrity
  - Automatic agents continuously performing checks
  - Checks in response to special events
- Different catalogues in LHCb: Bookkeeping, LFC, SE
- Issues seen:
  - Zero-size files
  - Missing replica information
  - Wrong SAPath
  - Wrong SE host
  - Wrong protocol
    - sfn, rfio, bbftp, ...
  - Mistakes in file registration
    - Blank spaces in the SURL path
    - Carriage returns
    - Presence of a port number in the SURL path
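A minimal, hypothetical sketch of the kind of cross-check such an agent might run, comparing file-catalogue entries against what the storage element reports. The record layout and catalogue contents are invented for illustration; real checks go through the experiment's catalogue and SRM interfaces:

```python
# Invented example data: LFN -> expected size and registered replica SURLs,
# plus the sizes the storage element actually reports per SURL.
catalogue = {
    "/lhcb/data/file1": {"size": 1_234_567,
                         "replicas": ["srm://se.example.org/lhcb/data/file1"]},
    "/lhcb/data/file2": {"size": 0, "replicas": []},
}
storage = {
    "srm://se.example.org/lhcb/data/file1": 1_234_567,
}

def check(catalogue, storage):
    """Flag zero-size files, missing replicas, malformed SURLs and SE mismatches."""
    problems = []
    for lfn, meta in catalogue.items():
        if meta["size"] == 0:
            problems.append(f"zero-size file: {lfn}")
        if not meta["replicas"]:
            problems.append(f"missing replica information: {lfn}")
        for surl in meta["replicas"]:
            if " " in surl or "\r" in surl:
                problems.append(f"malformed SURL registered for {lfn}: {surl!r}")
            elif storage.get(surl) != meta["size"]:
                problems.append(f"size mismatch or file missing on SE: {surl}")
    return problems

for problem in check(catalogue, storage):
    print(problem)
```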

Slide 18: Summary
- Many experiments have embraced the grid
- Many interesting challenges ahead
  - Hardware
    - Reduce the power consumed by CPUs
    - Applications need to manage with less RAM
  - Software
    - Grid interoperability
    - Security with generic pilots / glexec
    - Distributed grid network
- And many opportunities
  - To test solutions to the above issues
  - To stress-test the grid infrastructure
    - Get ready for data taking
    - Apply the lessons in other fields
      - Biomed, ...
  - Note: 1 fully digitised film = 4 PB and needs 1.25 GB/s to play

