RAL Site report John Gordon ITD October 1999


1 RAL Site report John Gordon ITD October 1999 J.C.Gordon@rl.ac.uk

2 Summary
Linux Farm
NT Farm (Monday)
Suns for BaBar (Friday)
Disk and Tape
Security (Wednesday)
Y2K

3 Linux
Linux is in use in most parts of CLRC
A user group has been formed to share experiences
On the central HEP systems, Linux now provides more CPU power than any other system

4 Hardware Configuration
Twenty built-to-measure PCs
–SuperMicro dual-processor motherboard (with SCSI)
–Two Pentium II 450 MHz CPUs
–10GB 5400rpm IDE HDA
–256MB ECC memory
–100Mbit Ethernet (tulip or Intel)
–Cheap graphics card
–Usually run without a monitor (BIOS must allow this)

5 Pounds per MHz

7 Hardware Costs Per Dual System
Dual CPU System: £1450
Shelving: £10
Power: £30
Network: £50
Software: £0
Cost per Pentium 450 CPU = £900 (inc VAT)
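
The per-CPU figure can be checked with a short calculation. This is only a sketch under two assumptions that the slide does not state: that the itemised prices exclude VAT, and that VAT was charged at 17.5%. The names `COSTS`, `VAT_RATE` and `cost_per_cpu` are made up for illustration.

```python
# Itemised per-system prices from the slide, in pounds.
COSTS = {
    "dual_cpu_system": 1450,
    "shelving": 10,
    "power": 30,
    "network": 50,
    "software": 0,
}

# Assumption: the prices above exclude VAT, charged at 17.5%.
VAT_RATE = 0.175


def cost_per_cpu(costs, cpus_per_system=2, vat_rate=VAT_RATE):
    """Return the VAT-inclusive cost of one CPU's share of a system."""
    subtotal = sum(costs.values())          # pre-VAT cost of one dual system
    with_vat = subtotal * (1 + vat_rate)    # add VAT
    return with_vat / cpus_per_system       # split across the CPUs


if __name__ == "__main__":
    print(f"Cost per CPU: about {cost_per_cpu(COSTS):.2f} pounds")
```

Under those assumptions the result is a little over £900 per CPU, which agrees with the rounded figure on the slide.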

8 Cloning
Presently trivial but labour intensive
–System image created by dd onto SCSI tape
–Memory-resident Linux system run from floppy (Tom’s Root and Boot)
–dd from tape to system disk
Need to become smarter!
–Kickstart?
–Drive Image (or similar software)?
–Any other suggestions?
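
The dd steps amount to a raw block-by-block copy from one device or file to another. The farm used dd itself; the following is only a Python sketch of the same idea, and the function name `copy_image`, the block size and any paths passed to it are hypothetical. Real tape devices may require a specific block size, which this sketch does not address.

```python
def copy_image(src_path, dst_path, block_size=64 * 1024):
    """Copy src_path to dst_path in fixed-size blocks; return bytes copied.

    This mimics what dd does when imaging a disk: no filesystem
    awareness, just a sequential copy of raw bytes.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:           # empty read means end of input
                break
            dst.write(block)
            copied += len(block)
    return copied
```

Run against ordinary files this behaves like a plain file copy. Pointing it at a device node would need root privileges and overwrites the target, so it should be tried on scratch files first.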

9 Software
Red Hat 5.2 (kernel 2.0.36)
ARLA (Free AFS Software)
Generic NQS 5.4: free but not recommended; evaluating commercial products
Mainly Fortran 77, therefore use the g77 compiler (egcs 1.1.1). Some C++
autorpm for system updates

10 Summary
Procurement Lessons
System Monitoring
Red Hat 5.2 needed several changes
Problems

11 Procurement (lessons)
Procurement was run as two tenders 4 months apart.
Hardware is (and will continue to be) a moving target.
The specification of all components was detailed, but not detailed enough.
Watch out for warranty terms! Need to pin down the details.
Acceptance tests are vital (ours are still evolving).
Not all hardware was delivered identical, although this was required.

12 System Monitoring
Service needs to be highly reliable.
lm_sensors (hardware monitoring).
System monitoring scripts check filesystem occupancy, load average, batch system status…
Operations staff are notified via our automated operations system (SURE).
System logs are spooled to the system logger for security monitoring.
What about SMART, ECC, SERR/PERR…?
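
A check of the kind listed above might look roughly like the following. It is a minimal sketch, not the scripts actually used at RAL: the thresholds `DISK_LIMIT` and `LOAD_LIMIT`, the function names, and the choice to return alert strings (rather than notify SURE, whose interface is not described here) are all assumptions.

```python
import os
import shutil

DISK_LIMIT = 0.90   # hypothetical: alert when a filesystem is over 90% full
LOAD_LIMIT = 4.0    # hypothetical: alert when the 1-minute load exceeds 4


def filesystem_occupancy(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:            # pseudo-filesystems can report zero size
        return 0.0
    return usage.used / usage.total


def check_node(paths, disk_limit=DISK_LIMIT, load_limit=LOAD_LIMIT):
    """Return a list of human-readable alerts for this node."""
    alerts = []
    for path in paths:
        fraction = filesystem_occupancy(path)
        if fraction > disk_limit:
            alerts.append(f"{path} is {fraction:.0%} full")
    load_1min = os.getloadavg()[0]  # Unix only
    if load_1min > load_limit:
        alerts.append(f"load average is {load_1min:.2f}")
    return alerts


if __name__ == "__main__":
    for alert in check_node(["/"]):
        print(alert)
```

In practice the returned alerts would be handed to whatever notifies the operations staff. Batch system status is left out because it depends on the queueing software in use.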

13 Tuning/Tweaks (for Red Hat 5.2)
Large disks need their geometry set explicitly.
Memory autosizing is unreliable. Set explicitly at boot (mem=256M).
NFS (user level) client is poor (better in kernel 2.2). Set rsize/wsize explicitly (also in 6.0?).
NFS implementation is buggy: mount with option timeo=0 (directory cache timeout).
Turn disk DMA transfer mode on.
Insufficient VFS inodes: increase the limit.

14 Current Problems
Hardware/OS reliability is fair: 1 break every 40 PC weeks. Not as good as HP-UX.
NQS is buggy: needed to eyeball/hack code.
ARLA is buggy: occasional cache hangs.
Process accounting is buggy: accounting files become corrupted.
Most of this does not impact the users; it is managed by monitoring/workarounds.
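
To put the reliability figure in farm-wide terms, the rate per PC can be scaled by the number of machines. This sketch assumes all twenty PCs from the hardware slide are in service and that breaks are spread evenly; neither is stated in the talk, and the function names are made up for illustration.

```python
def farm_breaks_per_week(num_pcs, pc_weeks_per_break):
    """Expected number of breaks per calendar week across the whole farm."""
    return num_pcs / pc_weeks_per_break


def weeks_between_breaks(num_pcs, pc_weeks_per_break):
    """Expected calendar weeks between breaks somewhere on the farm."""
    return pc_weeks_per_break / num_pcs


if __name__ == "__main__":
    pcs = 20            # assumption: the twenty PCs from the hardware slide
    rate = 40           # from the slide: 1 break every 40 PC weeks
    print(farm_breaks_per_week(pcs, rate), "breaks per week")
    print(weeks_between_breaks(pcs, rate), "weeks between breaks")
```

With those assumptions the farm as a whole would see roughly one break every two weeks.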

15 Plans
Need to move to Red Hat 6.0 (or 6.1).
Disk mirroring for interactive service?
Next expansion will be late Autumn, probably based on dual Pentium 600.
Possibly further expansion early next year (probably Pentium).
Further expansion in 2H2000, when multi-processor AMD Athlon systems will be an extremely interesting possibility.

16 Disk
Always growing
1.25TB general user disk servers
4.5TB for BaBar
Plan to test an IDE server

17 Tape
30TB IBM3590 in 3494 robot
STK robot idle; considering upgrade to Eagle drives.

