Israel ATLAS Tier-2 Status
NL Cloud Meeting, 5 April 2011
Lorne Levinson

Israel HEP community

ATLAS is the only LHC experiment in which we participate
–also PHENIX (Heavy Ion), ILC, ZEUS
–Israel is "1.35% of ATLAS" (MoU pledge, authors, common fund)
–25-30 people doing physics analysis

3 sites:
–Tel Aviv University, Tel Aviv (1956), a university
–The Technion, Israel Institute of Technology, Haifa (1924), a university
–Weizmann Institute of Science, Rehovot (1934), a research institute (Biology, Chemistry, Physics, Math & CS) with a graduate school (no undergraduates)

The longest travel is Weizmann ↔ Technion: 2 hours office-to-office.

Organization

We are a distributed Tier-2/Tier-3:
–each site combines Tier-2 and Tier-3 resources in the same cluster
–all resources shared flexibly between T2 and T3 (Lustre/StoRM)
–single management and budget, single purchasing
–three sites kept as identical as possible

Steering Committee for overall policy
Management & Operations team for the three sites
Stable funding approved until 2012

Storage

Continues to be the biggest reliability issue. Our hardware is now stable:
–replaced the DDN 6620s with a DDN 9900: fully redundant, 300 disk slots, 8 x 8 Gb/s FC ports → 5 GB/s
–two Lustre OSS servers
–WI servers have 10 Gb/s to the cluster; TAU and Technion will install 10G in April

Gave up on Thumpers+Lustre and Thumpers+iSCSI+Lustre:
–we NFS-mount Thumpers (Solaris+ZFS) for extra "archive" storage, home directories and /opt/exp_soft

Lustre + StoRM: the problem is that the StoRM team does not test new StoRM releases on Lustre
–the StoRM-Lustre community must solve this

StoRM/Lustre

StoRM allows LCG SRM storage and our local global file name space to share the same physical storage:
–no rigid boundary
–jobs in the cluster can do Linux file I/O to read SRM files

StoRM can run over Lustre (open source) or GPFS (IBM)

Lustre:
–Object Storage Targets serve (stripes of) file data
–the Meta-Data Server holds directories; redundant failover of MDSs will soon be supported
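
Because StoRM publishes the SRM namespace on the same Lustre file system that the worker nodes mount, a job can read an SRM-resident file with plain POSIX calls instead of an SRM or GridFTP client. A minimal sketch of that access pattern in Python (the mount point and file name are illustrative assumptions, not paths from these slides):

    import os

    # Hypothetical path: an ATLAS file in the site's Lustre-backed SRM area,
    # visible on the worker node as an ordinary POSIX file system.
    path = '/lustre/atlas/atlasgroupdisk/example_dataset/AOD.pool.root'

    size = os.path.getsize(path)      # plain stat(), no SRM client involved
    with open(path, 'rb') as f:       # plain Linux file I/O
        header = f.read(1024)         # read the first kilobyte

    print('read %d of %d bytes' % (len(header), size))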

Storage – installed SRM + local capacity

(table of net TB per site: TAU, Technion, Weizmann, Total; rows per purchase plus a Total and a Heavy Ion 3Q entry – values not preserved in this transcript)

Group disks

We are hosting four ATLASGROUPDISK areas:
–Muon performance (Technion)
–Top (Weizmann)
–Heavy Ion (Weizmann)
–Standard Model (TAU) (empty)

CPU

Last purchase was dual Intel E5520 quad-core; the May-delivery purchase is dual Intel X5650 hex-core
–again 4 motherboards per 2U box with redundant power supply

(table of cores per site: Tel Aviv, Technion, Weizmann, Total; now and after the May delivery – values not preserved in this transcript)

We benefit a lot from other groups placing cores in our clusters:
* Weizmann: ATLAS+PHENIX/Heavy-Ion, HEP Theory, Condensed Matter
* Technion: HEP Theory and Bio-informatics
* TAU: HEP Theory

Services nodes

Virtualize most services
Two 8-core servers, 48 GB, failover
Easier management:
–VM images
–roll-back
–image sharing
–easier testing: temporary machines
Hardware delivery in May
Deciding among: VMware, Xen, Citrix, KVM
SE not included

Service                                            Where
gLite CE                                           per site
gLite site-BDII                                    per site
gLite MON                                          per site
gLite APEL                                         per site
ELOG electronic log book                           WI
Zenoss fabric monitoring                           per site
LDAP, DNS, DHCP, syslog                            per site
Frontier DB cache                                  per site
VOMS (for Israel)                                  TAU
gLite WMS, LB (for Israel)                         WI
gLite myproxy (for Israel)                         WI
gLite Top-BDII (for Israel)                        WI
gLite NAGIOS for Israel grid service monitoring    WI
Mantis issue tracker                               Technion
Managers' Wiki pages                               Technion
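
If KVM is the eventual choice (an assumption; the slide lists it only as one of the candidates), the service VMs on the two virtualization servers could be inventoried through the libvirt Python bindings. A minimal sketch, assuming only a local qemu:///system hypervisor:

    import libvirt

    # Read-only connection to the local KVM/QEMU hypervisor
    conn = libvirt.openReadOnly('qemu:///system')

    # Running domains are addressed by numeric ID
    for dom_id in conn.listDomainsID():
        dom = conn.lookupByID(dom_id)
        state, max_mem, mem, vcpus, cpu_time = dom.info()
        print('running: %-20s vCPUs=%d mem=%d MiB' % (dom.name(), vcpus, mem // 1024))

    # Defined but shut-off domains are addressed by name
    for name in conn.listDefinedDomains():
        print('stopped: %s' % name)

    conn.close()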

Networking

Our networking is not good:
–the GEANT connection is 2 x 1.5G (subscribed on 2 x 2.5G infrastructure)
–"political" limits: TAU 500M, Technion 350M, WI 400M, because a 1G line is shared with institute traffic and the shared router is not really able to do 1G duplex

We suspect that the gross mismatch with SARA/NIKHEF's 10G causes failed connections due to dropped packets
–lowering the number of files & streams to avoid dropped packets leaves us with even worse net bandwidth

Expensive, because it is an undersea fiber and one (Italian) company owns the fibers
–an Israeli competitor is installing another fiber now
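
A rough illustration of the last point (back-of-the-envelope reasoning, not figures from these slides): on a long, lossy path the sustained rate of a single TCP stream is roughly bounded by the Mathis estimate

    BW_per_stream ≈ (MSS / RTT) · (C / √p)

with MSS the segment size, RTT the round-trip time, p the packet-loss rate and C a constant of order one. Each stream is individually slow, and the aggregate rate scales with the number of streams, so cutting the number of files and parallel streams loses bandwidth faster than the modest reduction in packet loss wins it back.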

Networking

(figure only; not preserved in this transcript)

GEANT

(figure only; not preserved in this transcript)

Networking plans

May 2011(?): increase the international connection from 3 Gb/s to 4 Gb/s
–5G might be possible later this year, but is not budgeted

Replace the old routers at the entrances to the institutes with 10G-capable equipment
–this should increase our throughput and reliability and allow us to actually use a major share of the 1G bandwidth to the sites

Negotiating a 10G academic backbone
Could have 10G to GEANT in spring 2012

SAM/NAGIOS

Our NGI did not take on the SAM/NAGIOS monitoring responsibility.
After the new NAGIOS tests replaced the SAM tests, we received no alerts on failed tests. This was a severe problem.
Finally, in December it was agreed with EGI and our NGI that we would deploy a NAGIOS test service for Israel until our NGI is able to take it over.
–the only functioning grid sites in Israel are our 3 ATLAS sites
Our NAGIOS service was up and running in January.

Upcoming work

Deploy the Zenoss fabric and service monitor on all three clusters
–currently in test at Weizmann

Deploy the Puppet configuration system on all three clusters
–we gave up on Quattor: after finally succeeding in getting it to run, it was clear it was unsustainable
–currently used for worker nodes at Weizmann
–needs to include the gLite nodes

Virtualization of services (excluding the SE)
Address the StoRM "untested new version" problem

End