TRIUMF Site Report for HEPiX, JLAB, October 9-13, 2006
TRIUMF SITE REPORT
Corrie Kost & Steve McDonald
Update since HEPiX Spring 2006


LHC Optical Private Network Map (Sep 13/2006)


Summary of TRIUMF WAN Connections
- Added 2 additional 1Gb wavelengths for ATLAS Canada Tier2 sites
- Expect TRIUMF-CERN 10Gb lightpath Nov 1/2006
- Brings total to six (6) 1Gb wavelengths & one (1) 10Gb wavelength from TRIUMF to the BCnet gigapop (regional area network)
- Multiple wavelengths are harder to debug!
- New CWDM: nm; Old CWDM: nm
- Problem with loss being wavelength sensitive (see OTDR plot below)


- Dell storage for TRSHARE: 4TB RAID5 (14*300GB SCSI)
- Colubris for site wireless

AoE - ATA over Ethernet
CORAID SR SATA EtherDrive Storage
- 15 EtherDrive blades in a 3U shelf, currently 8 with 750GB SATA drives
- Cost ~ $4k (shell) + $4k for 8*750GB drives
- 7 drives as RAID5, 1 spare
- Seen by Linux as block devices
References:

AoE - ATA over Ethernet
Comments:
- ideal for non-critical / low-cost storage
- easy to configure (although web interface missing!)
- handles jumbo frames (ifconfig ethX mtu 9000 up)
- R/W (XFS) ~ 60MB/sec with blockdev --setra 8192 /dev/etherd/eth0.0
  (without the above, kernels default to setra 256 and achieve only ~ 5MB/sec!)
- devices are ethX.Y, where Y is the slot (0-14) and X the chassis, so… max 61,425 disks
- Current limit: 61425*750GB ~ 44PB for about $45 million (before volume discounts & Moore's law)
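As a quick sanity check on the address-space figures quoted above (the slot range 0-14 is from the slide; the chassis range 0-4094 is an assumption inferred here from the quoted 61,425 total):

```python
# Back-of-envelope check of the AoE ethX.Y address space quoted above.
slots_per_shelf = 15        # slots Y = 0-14 (from the slide)
chassis_addresses = 4095    # assumed: chassis X = 0-4094, inferred from the 61,425 total
drive_tb = 0.75             # 750 GB SATA drives

max_disks = slots_per_shelf * chassis_addresses
raw_pb = max_disks * drive_tb / 1000.0   # decimal petabytes

print(max_disks, round(raw_pb, 1))   # 61425 disks, ~46 PB raw (the slide rounds to ~44 PB)
```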

Atlas at TRIUMF
- CFI funded
- Major purchase of cluster: Spring 2007
- Oracle RAC (Real Application Cluster): Sep 2006
- Oracle replication of Tier0 ATLAS data: Oct 2006
- More RAC nodes & storage arrays in 2008 & 2009

Atlas at TRIUMF
Hardware (from slide diagram):
- Two HP ProLiant DL380 servers
- HP MSA (Modular Smart Array) with MSA20 storage array (12*500GB SATA)
- 14-blade IBM chassis: dual-CPU/dual-core 3.0GHz Woodcrest blades with 8GB memory
- 7U 10-blade Dell chassis: dual-CPU/dual-core 3.0GHz Woodcrest blades with 8GB memory
- Two (redundant) HP StorageWorks FC switches
- Dual-drive SDLT-I and dual-drive SDLT-II tape units
- Power for blades
Service nodes labelled in the diagram: LCG Compute Element (scheduler…), dCache nodes, MON (RGMA), dCache pool nodes, LFC, VOBOX, FTS, ADMIN, SRM/dCache

Atlas (ORACLE RAC) at TRIUMF
RAC: Real Application Cluster
- Two HP ProLiant DL380 Gen 5, single-CPU dual-core Woodcrest
- HP MSA (Modular Smart Array) 1500 (Fibre Channel I/O controller)
- MSA20 storage array: 12*500GB SATA
- Two (redundant) HP StorageWorks SAN Switch 4/8 (Brocade SilkWorm 200E) FC switches

Atlas at TRIUMF
- Move to SL4 (Oct 2006)
TRIUMF (unofficial) Tier2 Members (Sep 13/2006):
- SFU, U of Toronto, U of Montreal, U of Alberta, U of Victoria
ATLAS Milestones:
- Calibration Data Challenge: November 2006
- Full Dress Rehearsal: Summer 2007
- LHC First Collisions: November 2007

Atlas at TRIUMF
The agreed fractions and rates of the data to be distributed to Tier-1s are as follows:

Tier-1    Location       Fraction   RAW   ESD   AODm1   Total rate
BNL       Brookhaven
SARA      Amsterdam
CCIN2P3   Lyon
FZK       Karlsruhe
RAL       Didcot
ASGC      Taipei
CNAF      Bologna
NDGF      (distributed)
PIC       Barcelona
TRIUMF    Vancouver
Total                    %                              MB/s

Atlas at TRIUMF
NEAR TERM TESTS
- Need to test full re-import from T0 (after a possible h/w and/or s/w problem)
- Schedule full recovery test for applications
- Perform Streams re-sync recovery procedures
- Perform (corrupt) database recovery
Details: LCG 3D Sep 13/14 workshop

Atlas (hardware) at TRIUMF
Blade solution favored over "pizza box":
● share a common infrastructure (chassis)
● space saving (~50% less)
● power saving (~35% less)
● cabling (power & networking, ~70% less)
ISAC-II facility
- Available floor space: 40' x 22' (880 sq-ft)
- No false floor – use of hot/cold aisles
Power estimates – to end of 2009:
- CPU "blades" ~ 175kW
- Disks ~ 95kW
- Tape, servers, network, etc. ~ 30kW
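The quoted power estimates combine with the floor space as follows (a simple arithmetic sketch; the total and the W/sq-ft density are derived here, not stated on the slide):

```python
# Sum of the quoted end-of-2009 power estimates and the implied floor density.
cpu_kw, disk_kw, other_kw = 175, 95, 30   # from the slide
floor_sqft = 40 * 22                      # 40' x 22' = 880 sq-ft

total_kw = cpu_kw + disk_kw + other_kw
density_w_sqft = total_kw * 1000 / floor_sqft

print(total_kw)                 # 300 kW total
print(round(density_w_sqft))    # ~341 W/sq-ft over the available floor
```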

Possible Cluster Configuration (up to 15 units / DS4500)
- CE: Compute Element
- FTS: File Transfer Service
- LFC: LCG File Catalog
- RGMA: Accounting Facility
- SRM: Storage Resource Manager
- VOBOX: ATLAS Data Management

Repeated reads on the same set of (typically 16) files at ~600MB/sec over ~300 days gives ~15 PB (total since start ~20PB – no reboot for >300 days)
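The quoted read totals are self-consistent, as a quick check shows:

```python
# Check that ~600 MB/s sustained over ~300 days matches the quoted ~15 PB.
rate_bytes_s = 600e6        # ~600 MB/s (decimal)
seconds = 300 * 86_400      # ~300 days

total_pb = rate_bytes_s * seconds / 1e15
print(round(total_pb, 1))   # ~15.6 PB, consistent with the quoted ~15 PB
```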

Sony HD camcorder (HDR-HC3) replaces video presenter
- 4 megapixel stills
- HDV or DV, 1920*1080i

Still: 2304 x 1768 (full image and cropped images)

Misc Material…