Slide 1: Planning for JLAB Computational Resources
Ian Bird, JLAB, 9/16/2000

Slide 2: Overview
  - Present facilities overview
  - Planned growth – 3 years, 5 years
  - Projects
    - PPDG, Globus, Grids, planning for Hall D
  - Issues
    - Facilities, infrastructure

Slide 3: (no transcript text; figure/diagram only)

Slide 4: Existing facilities
Storage:
  - Tape
    - STK silo – 6000 slots
    - 8 Redwood drives – 50 GB @ 10 MB/s; helical, expensive, very sensitive, unreliable, many failures
    - 10 9840 drives – 20 GB @ 10 MB/s; linear, mid-load, work very reliably, 1/5 the cost of a Redwood; tape cost is the same
  - Disk
    - Large (5 TB) NFS RAID 5 arrays (Symbios) – 18 – 9 c/MB; load can kill them, and upgrades are unavailable or expensive
    - Stage disks – host attached – 2 TB
    - Linux file servers (cache and DST) – RAID 0 – 3 c/MB
      - Dual PIII, 12 x 73 GB disks, Gigabit Ethernet
      - Excellent match of performance / network I/O / capacity
      - 5 x 800 GB, 4 x 400 GB
    - Near-future expansion of the disk farm
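As a rough illustration of the disk cost figures quoted above (taking the ~9 c/MB figure for the RAID 5 arrays against 3 c/MB for the Linux file servers), the following Python sketch works out the arithmetic; the prices and capacities come from this slide, while the helper function and decimal-unit convention are ours:

    # Illustrative cost arithmetic for the disk figures on this slide.
    # Capacities in TB, prices in cents per MB (decimal units, as vendors quote them).
    MB_PER_TB = 1_000_000

    def purchase_cost_dollars(capacity_tb, cents_per_mb):
        """Total cost in dollars for a given capacity at a given cents/MB price."""
        return capacity_tb * MB_PER_TB * cents_per_mb / 100.0

    nfs_raid5 = purchase_cost_dollars(5.0, 9)    # 5 TB of NFS RAID 5 at ~9 c/MB
    linux_raid0 = purchase_cost_dollars(5.0, 3)  # same capacity on Linux servers at ~3 c/MB
    print(f"NFS RAID 5:   ${nfs_raid5:,.0f}")    # ~$450,000
    print(f"Linux RAID 0: ${linux_raid0:,.0f}")  # ~$150,000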

Slide 5: Existing facilities – 2
Computing:
  - Experimental program
    - Farm: ~6000 SPECint95, 250 Linux CPUs
    - Mostly rack-mounted dual-processor boxes, 100 Mbit Ethernet
  - Lattice QCD
    - 20 + 8 Alphas – Myrinet, MPI
    - Small brother (1024 DSPs) to the Columbia-BNL-RIKEN machine
    - Uses essentially no storage or network bandwidth
Networks:
  - Local
    - Gigabit Ethernet backbone everywhere, 100 Mbit switched to desktops
    - Gigabit (trunked) between storage servers and the farm (24-port switches), and to interactive servers
  - WAN
    - OC-3 to ESnet installed (in ~2 weeks), OC-12 capable
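For a feel of what the farm figures mean per box, a quick calculation from the numbers above and the >500-CPU farm size mentioned on the later CPU slide; this is simple arithmetic, not a benchmark:

    # Per-CPU rating implied by the farm figures: ~6000 SPECint95 over ~250 CPUs.
    farm_specint95 = 6000
    farm_cpus = 250
    per_cpu = farm_specint95 / farm_cpus
    print(f"Average: {per_cpu:.0f} SPECint95 per CPU")          # ~24 SPECint95/CPU

    # A >500-CPU farm of the same vintage (see the later CPU slide) would give:
    print(f"500 CPUs -> ~{500 * per_cpu:.0f} SPECint95 total")  # ~12000 SPECint95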

Slide 6: Existing facilities – 3
Space & power:
  - Space limited – the Computer Center is sufficient for the next few years(?)
  - Power – with the UPS upgrade, sufficient for anything we could install in the available space
Software:
  - Storage
    - OSM – replacement in hand
    - Tapeserver – disk pool/cache managers, remote file copies
      - Expand to wide area, parallel file copies
  - Batch
    - LSF + wrappers
    - PBS on the LQCD clusters (development for wide-area clusters)
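The "LSF + wrappers" item refers to site-specific submission wrappers around LSF. As a minimal modern-Python sketch of that idea (this is not JLab's actual wrapper; the queue name, script path, and function name are hypothetical, and only standard bsub options are used):

    import subprocess

    def submit_farm_job(script, queue="production", job_name=None, logfile=None):
        """Thin wrapper around LSF's bsub; returns bsub's acknowledgement line."""
        cmd = ["bsub", "-q", queue]          # -q: target queue
        if job_name:
            cmd += ["-J", job_name]          # -J: job name
        if logfile:
            cmd += ["-o", logfile]           # -o: append job stdout to this file
        cmd.append(script)
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return result.stdout.strip()         # e.g. "Job <1234> is submitted ..."

    # Hypothetical usage:
    # print(submit_farm_job("./run_pass1.sh", job_name="clas-pass1", logfile="pass1.log"))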

Slide 7: Expected expansion – current program
Experimental program:
  - Storage
    - Add 9840 drives (higher capacity and rate as available) -> 30; replace the Redwoods
    - Double disk storage yearly – SAN (back end) if feasible, cost-effective and useful
  - CPU
    - Current level is as required – modest increases and replacement of older systems
  - Network
    - No real changes anticipated – add more Gigabit links, trunking
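To make "double disk storage yearly" concrete, a quick projection; the starting point sums the disk figures from the earlier facilities slide and is our estimate, not a quoted number:

    # Yearly-doubling projection for disk capacity.
    # Starting point (~12.6 TB) sums the earlier slide's figures:
    # 5 TB NFS RAID 5 + 2 TB stage disk + 5x800 GB + 4x400 GB cache/DST servers.
    capacity_tb = 5 + 2 + 5 * 0.8 + 4 * 0.4
    print(f"Now: ~{capacity_tb:.1f} TB")
    for year in range(1, 6):
        capacity_tb *= 2
        print(f"Year {year}: ~{capacity_tb:.0f} TB")
    # Doubling yearly reaches ~100 TB in year 3 and ~400 TB in year 5.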

Slide 8: Expansion – 2
LQCD & FEL:
  - 256-node (300 Gigaflop) cluster proposed (1 year)
  - Aim for >1 Teraflop LQCD and >1 Teraflop FEL
    - >= FY02
    - In the context of a wider DOE advanced scientific computing program
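A quick sanity check of the cluster numbers above (simple division; the node count for the >1 Teraflop goal is only indicative and assumes the same per-node performance):

    # Per-node performance implied by a 256-node, 300 Gigaflop cluster,
    # and a rough node count for >1 Teraflop at that same rating.
    import math

    gflops_total = 300
    nodes = 256
    per_node = gflops_total / nodes                  # ~1.2 Gflops/node
    nodes_for_1tf = math.ceil(1000 / per_node)       # ~854 nodes at the same rating
    print(f"~{per_node:.2f} Gflops per node")
    print(f"~{nodes_for_1tf} such nodes for 1 Teraflop (or fewer, faster nodes)")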

Slide 9: Associated projects
Grids:
  - Particle Physics Data Grid (PPDG)
  - Others – EuroGrid, Alliance
Supporting technologies:
  - Globus
  - LSF, Condor, etc.

Slide 10: Particle Physics Data Grid
Goals:
  - Delivery of an infrastructure for very widely distributed analysis of particle physics data at multi-petabyte scales by hundreds to thousands of physicists
  - Acceleration of the development of network and middleware infrastructure aimed broadly at data-intensive collaborative science
Method:
  - Design, develop, and deploy a network and middleware infrastructure capable of supporting the data analysis and data flow patterns common to the many particle physics experiments represented
  - Adapt application-specific software to operate in this wide-area environment and to exploit this infrastructure

Slide 11: Planning for Hall D
Issues:
  - Data rate
  - Storage capacity – tape/disk
  - Data access and distribution
  - CPU requirements
  - Networking
  - Physical facilities

Slide 12: Data rates
  - 100 MB/s is done today at RHIC
  - Could do it now with parallel streams @ 10 MB/s
  - Expect drives of 40-60 MB/s – 1-2 drives, as now for CLAS
  - On the same timescale, the LHC will be at GB/s
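A back-of-the-envelope check of the drive counts these rates imply (ignoring mount, seek, and compression effects; the 40 and 60 MB/s figures are the expected next-generation rates quoted above):

    import math

    # Drives/streams needed to sustain a 100 MB/s aggregate rate,
    # at today's 10 MB/s (9840-class) versus the expected 40-60 MB/s drives.
    target_mb_s = 100
    for drive_mb_s in (10, 40, 60):
        needed = math.ceil(target_mb_s / drive_mb_s)
        print(f"{drive_mb_s} MB/s drives: {needed} in parallel")
    # 10 MB/s -> 10 streams; 40 MB/s -> 3 drives; 60 MB/s -> 2 drives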

Slide 13: CPU & network needs
CPU:
  - Not an issue
  - May need farms of >500 CPUs
  - No big difference from current and planned capacity
  - Footprint is dropping – 2 processors in 1U; 4- and 8-way systems
Networking:
  - 10 Gigabit Ethernet soon
  - ESnet plans are sufficient for our needs
  - University and overseas links will be OK too (LHC)

Slide 14: Storage
Disk:
  - Capacity and I/O rates will increase, costs will drop
  - Suspect it will not be an issue
Tape:
  - Largest uncertainty – capacity and rates
  - Recent experience is not as expected (slower)
  - STK roadmap (Feb 2000): >300 GB tapes, 60 MB/s

Slide 15: Tape storage
STK silos (6000 slots):
  - Assume 300 GB tapes -> 2 silos/year
  - Thus need 4 for comfort
  - Rest of the lab needs 2
  - Tape access – the number of drives depends on drive speed and tape capacity
Options:
  - STK silos – 4-6 with ~10 drives each
  - "FNAL" model – expensive ADIC silo with lots of "commodity" drives – AIT, DLT, Mammoth, etc.
  - Trade-off: reliability of the drives vs. cost of the silo
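A quick check of what "2 silos/year" implies in volume and average write rate, using the 6000-slot and 300 GB figures above (decimal units, no overheads):

    # Capacity and fill-rate arithmetic behind "2 silos/year".
    slots_per_silo = 6000
    gb_per_tape = 300
    silo_pb = slots_per_silo * gb_per_tape / 1e6     # ~1.8 PB per silo
    yearly_pb = 2 * silo_pb                          # two silos/year -> ~3.6 PB/yr

    seconds_per_year = 365 * 24 * 3600
    avg_mb_s = yearly_pb * 1e9 / seconds_per_year    # PB -> MB, then per second
    print(f"Per silo: ~{silo_pb:.1f} PB; per year: ~{yearly_pb:.1f} PB")
    print(f"Average sustained write rate: ~{avg_mb_s:.0f} MB/s")  # ~114 MB/s

That average is consistent with the ~100 MB/s rate discussed on the data rates slide.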

Slide 16: Space
  - A new Computer Center building is in the plan for 5 years out
    - Recognized as essential for the 12 GeV upgrade
  - The lab could have significant computing facilities:
    - Hall D + CLAS-2 + others: 5 silos, 800-node farms
    - LQCD + FEL programs @ 5-10 Tflops each: ~500 nodes each
  - Essential to get the building planned soon

Slide 17: Conclusions
  - Facilities – commodity components
  - Need to initiate planning now – buildings and physical infrastructure
  - Wide-area access and export – we have not done well here
    - Encourage collaborative development projects
    - Practical uses now – solve future needs
    - PPDG – we will be more effective if we use the technology, e.g. our local tapeserver technology

