HEPiX FSWG Progress Report (Phase 2)
Andrei Maslennikov
November 2007 – St. Louis

Summary
- Reminder: raison d’être
- Active members
- Workflow, June – October 2007
- Phase 2 revelations
- Plans for Phase 3
- Discussion

Reminder: raison d’être
- Commissioned by IHEPCCC at the end of 2006; officially supported by the HEP IT managers
- Goal: review the available file system solutions and storage access methods, and disseminate the know-how among HEP organizations and beyond
- Timescale: February 2007 – April 2008
- Milestones: 2 progress reports (Spring 2007, Fall 2007), 1 final report (Spring 2008)

Active members
Currently we have 22 people on the list, but only these 17 participated in conference calls and/or contributed actively during Phase 2:
- CASPUR: A. Maslennikov (Chair), M. Calori (Web Master)
- CEA: J-C. Lafoucriere
- CERN: B. Panzer-Steindel
- DESY: M. Gasthuber, P. van der Reest
- FZK: J. van Wezel
- IN2P3: L. Tortay
- INFN: G. Donvito, V. Sapunenko
- LAL: M. Jouvin
- NERSC/LBL: C. Whitney
- RAL: N. White
- RZG: H. Reuter
- SLAC: R. Melen, A. May
- U. Edinburgh: G.A. Cowan
Held 9 phone conferences during this phase.

Workflow, June – October 2007
- Started to look at the previously reduced set of data access solutions
- Concentrating only on two classes of data areas:
  - Shared Home Directories, currently: AFS, NFS
  - Large Scalable Shared Areas suitable for batch farms, currently: GPFS, Lustre, dCache, CASTOR, HPSS, Xrootd
- Will also mention disk/tape migration means and the underlying hardware
- Collected information is being archived in a newly established technology tracking web site
- Trying to highlight the issues common to all architectures and to capture the general trends

HEPiX Storage Technology Web Site
- Consultable at
- Meant as a storage reference site for HEP
- Not meant to become yet another storage Wikipedia
- Requires time; being filled on a best-effort basis
- Volunteers wanted!

Volunteers wanted!

Observed trends – general
- Most sites foresee an increase in Transparent File Access (TFA) storage in the near future. Lustre and GPFS dominate the field.
- HSM functionality is seen as an important addition to TFA storage. Combinations like Lustre/HPSS, GPFS/HPSS and GPFS/TSM are being considered.
- TFA-based solutions are proving to be competitive. See, for instance, the talks by G. Donvito and V. Sapunenko: 1) GPFS vs dCache and Xrootd; 2) GPFS vs CASTOR v2.

Observed trends – Tier-1 sites
- Balanced handling of data streams proves to be the most difficult part. This includes disk/tape migration scheduling and load-dependent data distribution over the storage elements. Plenty of factors have to be accounted for.
- Optimized real-time monitoring of system components is a key input for automated data migration and data access decisions, which have to be made very quickly. There is still a lot of room for improvement in this area.
- On-site competence is of crucial importance for Tier-1 sites that choose technologies like CASTOR or dCache.

Observed trends – Tier-2 sites
- These sites are mostly disk-based and hence have greater freedom in choosing their data access solutions.
- The selection of the storage technology is influenced by the need for integration with other sites; in particular, an SRM interface is required. This essentially reduces the choice to four practicable solutions, which dictate how the hardware is to be used:
  1) dCache (built-in SRM interface, aggregates multiple disk server nodes)
  2) Disk Pool Manager (same strategy as dCache)
  3) Multiple Xrootd servers with a stand-alone SRM interface such as BeStMan
  4) A solid distributed TFA file system plus a stand-alone SRM interface such as StoRM
- Only a few comparative studies exist so far, covering a subset of these solutions (see the aforementioned talks by Donvito and Sapunenko). They nevertheless reveal interesting facts and indicate a pronounced need for more investigations in this direction.
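All four options present the same SRM interface to the grid, so the client side stays identical whichever backend a T2 deploys. As a purely illustrative sketch of what that means in practice (using the gfal2 Python bindings, a client library that post-dates this talk; the endpoint, VO path and file name below are invented and not taken from the report):

```python
import gfal2

# Illustrative SRM endpoint and namespace path -- replace with a real site's.
# The same srm:// URL scheme is served by all four backend options
# (dCache, DPM, Xrootd+BeStMan, TFA file system + StoRM), so this client
# code does not change with the T2's storage choice.
DIR_SURL = "srm://se.example.org:8446/srm/managerv2?SFN=/myvo/user"
FILE_SURL = DIR_SURL + "/test.dat"

ctx = gfal2.creat_context()

# Directory listing and metadata lookup through the SRM interface.
for name in ctx.listdir(DIR_SURL):
    print(name)

info = ctx.stat(FILE_SURL)
print("size of test.dat:", info.st_size, "bytes")
```

Switching a site from, say, DPM to dCache would leave such client code untouched; only the endpoint changes.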

Plans for Phase 3 (T1-oriented)
- Will perform an assessment of the mixed disk/tape data access solutions adopted at Tier-1 sites (performance + cost). Will try to present things as they are and let the readers draw their own conclusions. The results will form a chapter of our final report.
- The overall picture is positive, but an accurate independent analysis of the situation may be of great help to the T1 IT managers. The same hardware base allows for multiple solutions, so it is never too late to review or improve.

Plans for Phase 3 (T2-oriented)
- Will perform a comparative analysis of the two most widely deployed file systems (Lustre and GPFS) on the same hardware base. Tests will have to include a significant number of disk servers and file system clients. Possible test sites: IN2P3, CERN, FZK, DESY or CNAF.
- Will try to perform a comparative analysis of dCache, DPM, Xrootd and one of the TFA file system implementations on the same hardware base.
- The test results will be used to provide practical recommendations for T2 sites, which is the main goal of this working group. Will try to rank the acceptable T2 solutions according to their performance and TCO, and to indicate the appropriate architecture(s) for different types of T2 sites.
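The report does not detail the group's test suite. As a purely illustrative sketch, a comparative client-side measurement could start from a streaming-throughput probe of this kind, run unchanged against a test directory on a Lustre mount and on a GPFS mount on identical client hardware (file size, block size and file name below are arbitrary choices, not FSWG parameters):

```python
#!/usr/bin/env python3
"""Illustrative single-client streaming I/O probe (not the FSWG test suite)."""
import os
import sys
import time

BLOCK = 1024 * 1024            # 1 MiB per write/read call (arbitrary choice)
FILE_SIZE = 2 * 1024 ** 3      # 2 GiB per stream (arbitrary choice)


def stream_write(path):
    """Write FILE_SIZE bytes sequentially and return the rate in MB/s."""
    buf = os.urandom(BLOCK)
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(FILE_SIZE // BLOCK):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())   # make sure the data really left the client
    return FILE_SIZE / (time.time() - start) / 1e6


def stream_read(path):
    """Read the file back sequentially and return the rate in MB/s.

    Note: on a single node this may largely measure the client page cache;
    a real test would drop caches or read files written by another node.
    """
    start = time.time()
    with open(path, "rb") as f:
        while f.read(BLOCK):
            pass
    return FILE_SIZE / (time.time() - start) / 1e6


if __name__ == "__main__":
    target_dir = sys.argv[1]   # e.g. a test directory on the Lustre or GPFS mount
    path = os.path.join(target_dir, "fswg_probe.%d" % os.getpid())
    try:
        w = stream_write(path)
        r = stream_read(path)
        print("write %.1f MB/s   read %.1f MB/s" % (w, r))
    finally:
        if os.path.exists(path):
            os.remove(path)
```

A real comparison would of course run many such streams concurrently from many clients and look at the aggregate throughput, not a single-client figure.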

Plans for Phase 3 (general)
- Will continue working on the storage technology tracking pages on a best-effort basis (volunteers welcome!)
- Will complete/update the questionnaire; the summary numbers are to appear in the final report.
- The final report is due for the Spring 2008 meeting at CERN. We count on the active collaboration of all the sites involved!

Discussion