
EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE. EGEE and gLite are registered trademarks.
Deployment and Management of Grid Services for a data-intensive complex infrastructure
Álvaro Fernández (IFIC – Valencia)
5th User Forum, Uppsala, April 2010

Contents
– Motivation for the talk
– Proposed solution
– Computing and storage hardware
– Configuration and management
– Developments
– Conclusions and future

Motivation
Several user communities require access to computational and storage resources in an efficient manner:
– National Grid Initiative applications
– Grid-CSIC
– ATLAS Tier-2 and Tier-3
– Local users
Different kinds of applications and data access patterns:
– Parallel and sequential applications
– Public and private data, with a need for sharing policies
– File sizes vary; sequential and random access

Proposed solution
Use the grid technologies we have been working with, together with hardware and software configurations that allow us to grow in the future:
– Storage and network hardware based on NAS servers
– Lustre as a parallel backend filesystem
– StoRM as the Storage Resource Manager for our disk-based storage
– Grid services based on gLite
– Development of plugins, tools and high-level applications for user convenience

Computing Hardware
Grid-CSIC resources:
– Sequential: 106 nodes, 2x Quad-Core Xeon, 16 GB RAM, 2x SAS HD, 134 GB RAID 0
– Parallel: 48 nodes, same as above plus Infiniband (Mellanox Technologies MT25418 DDR 4x)
ATLAS VO resources:
– HP DL160 nodes, 2x Quad-Core Xeon, 16 GB RAM, 2x SAS HD, 134 GB RAID 0
Total cores: 1704

Storage Hardware
Tape library for long-term storage (with the legacy CASTOR software)
Disk servers to store online data (as part of the Tier-2 requirements):
– Fast access and always available
Sun X4500/X4540 servers:
– 48 disks (500 GB / 1 TB), with 2 disks for the OS
– Redundant power supplies
– 4x 1 Gb Ethernet ports
– Scientific Linux IFIC v4.4 with the Lustre kernel
– Ext3-based volumes of up to 18/36 TiB

Storage server configuration
IFIC tested for the best configuration:
– Disks in RAID 5 (5x8 + 1x6), giving a usage ratio of 80%
– Best performance with one RAID per disk controller
– Bonnie++ aggregated test results: write 444,585 KB/s, read 1,772,488 KB/s
Setup: 21 servers providing a raw capacity of around 646 TB
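For orientation, the kind of throughput measurement quoted above could be reproduced with a bonnie++ run along the lines below (a minimal sketch; the test directory, file size and label are assumptions, not taken from the slides, and the aggregated figures presumably sum several concurrent per-volume runs):

    # sequential write/read benchmark on one RAID-5 volume;
    # -s sets the test file size in MB (should exceed the server RAM),
    # -n 0 skips the small-file tests, -u runs as an unprivileged user
    bonnie++ -d /mnt/raid5-test -s 65536 -n 0 -u nobody -m x4500-raid5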

Network
– Data network based on Gigabit Ethernet, with a 10 Gb uplink to the backbone network
– Cisco 4500: core centre infrastructure
– Cisco 6500: scientific computing infrastructure
– Data servers with 1 Gb connections; channel-bonding tests were made and we may aggregate 2 channels per server in the future (a configuration sketch follows below)
– WNs and GridFTP servers with 1 Gb connections, reaching 1 Gbit/s per data server
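A channel-bonding setup of the kind mentioned above would, on an SL5-era data server, look roughly like this (a sketch only; device names, addressing and the bonding mode are assumptions, and 802.3ad aggregation also needs matching LACP configuration on the Cisco side):

    # /etc/sysconfig/network-scripts/ifcfg-bond0  (the aggregated interface)
    DEVICE=bond0
    BOOTPROTO=none
    ONBOOT=yes
    IPADDR=192.168.1.10
    NETMASK=255.255.255.0
    BONDING_OPTS="mode=802.3ad miimon=100"

    # /etc/sysconfig/network-scripts/ifcfg-eth0  (repeat for eth1, the second slave)
    DEVICE=eth0
    BOOTPROTO=none
    ONBOOT=yes
    MASTER=bond0
    SLAVE=yes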

Lustre installation
We chose Lustre because it is a petabyte-scale POSIX filesystem with ACL and quota support, open source and supported by Sun/Oracle.
Our installation uses a single namespace (/lustre/ific.uv.es):
– MGS (Management Server, 1 per cluster) + MDS (Metadata Server, 1 per filesystem):
  2x Quad-Core Intel Xeon CPU at 2.83 GHz, 32 GB RAM
  Planned to add HA
  Currently 4 filesystems (MDTs)
– OSS (Object Storage Servers, several per cluster, guaranteeing scalability): 1 on every Sun X4500
  6 OSTs per server (RAID-5, 5x8 + 1x6)
– Clients (SL5 kernel with Lustre patches): mainly User Interfaces (UI) and Worker Nodes (WN), mounted in RW or RO mode
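Mounting the filesystem on a client follows the standard Lustre pattern; a sketch under assumptions (the MGS hostname and the internal filesystem name "ific" are hypothetical, only the /lustre/ific.uv.es mount point appears in the slides):

    # read-write mount, e.g. on a GridFTP/SRM server
    mount -t lustre mgs01.ific.uv.es@tcp0:/ific /lustre/ific.uv.es
    # read-only mount, e.g. on a Worker Node
    mount -t lustre -o ro mgs01.ific.uv.es@tcp0:/ific /lustre/ific.uv.es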

Filesystems and Pools
With Lustre release 1.8.1 we added pool capabilities to the installation. Pools allow us to partition the hardware inside a given filesystem:
– Better data management
– Assign particular OSTs to an application or group of users
– Can separate heterogeneous disks in the future
4 filesystems with various pools (pool commands are sketched below):
– /lustre/ific.uv.es: read-only on WNs and UIs, read-write on GridFTP + SRM servers
– /lustre/ific.uv.es/sw (software): read-write on WNs and UIs
– /lustre/ific.uv.es/grid/atlas/t3 (space for Tier-3 users): read-write on WNs and UIs
– /rhome, type lustre (shared home for users and MPI applications): read-write on WNs and UIs
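In Lustre 1.8 a pool is defined on the MGS with lctl and then bound to a directory with lfs; a minimal sketch, with the filesystem name, pool name and OST indices as hypothetical placeholders:

    # create a pool and add a set of OSTs to it (run on the MGS)
    lctl pool_new ific.t3pool
    lctl pool_add ific.t3pool ific-OST[0000-0005]
    # direct new files under the Tier-3 space to that pool
    lfs setstripe --pool t3pool /lustre/ific.uv.es/grid/atlas/t3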

Lustre management
Quotas:
– Applied per filesystem
– For the users in the Tier-3 space (an example command is sketched after this slide)
– For the shared home directory
– Not applied in the general VO pools/spaces, because all data there is common to the VO
ACLs:
– Applied in the different VO pools/spaces so that policies can be implemented (e.g. only some managers can write, all users can read)
– In the user and shared home directories
– In the SW directories
– These have to be respected by the higher grid middleware layers (see the StoRM part of the presentation)
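A per-user quota on the Tier-3 space would be set roughly as follows (a sketch; the username and limits are hypothetical, and the block limits are given in kilobytes as lfs setquota expects):

    # soft/hard block limits of about 500/550 GB and no inode limit for one user
    lfs setquota -u jdoe -b 500000000 -B 550000000 -i 0 -I 0 /lustre/ific.uv.es
    # verify the result
    lfs quota -u jdoe /lustre/ific.uv.es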

StoRM as an SRM server
– Interfaces the storage network with the grid, for disk-based Storage Elements
– Implements the SRM v2.2 specification
– Accesses the Lustre filesystem through its POSIX interface
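From the user's side, data reaches this SE through the SRM interface and the GridFTP pool; a gLite-style copy might look like the sketch below (the endpoint host, port and target path are assumptions, not taken from the slides):

    # copy a local file to the StoRM endpoint, bypassing the BDII (-b)
    # and forcing the SRM v2 interface (-D srmv2)
    lcg-cp -v -b -D srmv2 file:/tmp/test.root \
      "srm://srm.ific.uv.es:8444/srm/managerv2?SFN=/lustre/ific.uv.es/grid/atlas/t3/test.root"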

StoRM local implementation & policies
– StoRM accesses the data like a local file system, so it can create and control all the data available on disk through an SRM interface (BE + FE machine)
– It coordinates data transfers; the real data streams are moved by GridFTP servers on separate physical machines (a pool of GridFTP servers)
– It enforces the authorization policies defined by the site and the VO (users are mapped to a common VO account)
– We developed an authorization plugin that respects the local file system, with the corresponding user mappings and ACLs (LocalAuthorizationSource, available in StoRM versions above 1.4)
Example ACLs on an ATLAS space directory:
  # file: atlasproddisk
  # owner: storm
  # group: storm
  user::rwx
  group::r-x
  group:atlas:r-x
  group:atlp:rwx
  mask::rwx
  other::---
  default:user::rwx
  default:group::r-x
  default:group:atlas:r-x
  default:mask::rwx
  default:other::---
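The listing above is what standard POSIX ACL tools would produce; an administrator could set it roughly as follows (a sketch; the full directory path is hypothetical):

    # access ACL: the whole VO (atlas) may read, the production group (atlp) may write
    setfacl -m u::rwx,g::r-x,g:atlas:r-x,g:atlp:rwx,m::rwx,o::--- /lustre/ific.uv.es/grid/atlas/atlasproddisk
    # default ACL, inherited by newly created files and sub-directories
    setfacl -d -m u::rwx,g::r-x,g:atlas:r-x,m::rwx,o::--- /lustre/ific.uv.es/grid/atlas/atlasproddisk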

Other Services: SEView
Besides the POSIX CLI, IFIC users can additionally access data through a web interface:
– Ubiquitous, X.509-secured, easy read-only access
– Developed with GridSite
– Web interface to check datasets (ATLAS groups of files) at the distributed Tier-2
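Since SEView is a GridSite application protected by X.509, command-line read-only access would look something like this (a sketch; the URL is hypothetical and the certificate paths are the usual gLite UI defaults):

    # browse a dataset listing with the user's grid certificate
    curl --cert ~/.globus/usercert.pem --key ~/.globus/userkey.pem \
         --capath /etc/grid-security/certificates \
         https://seview.ific.uv.es/atlas/datasets/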

Conclusions & Future
– We presented a data management approach for multiple VOs in a computing centre, based on grid technologies and mature software.
– It behaved well in stress tests (STEP09, UAT); more challenges are to come, and we are prepared to handle problems.
– Lustre gives good performance and the scalability to grow to the petabyte scale next year.
– StoRM presents our disk-based storage to the grid; it performed very well and suits our needs. We would like it to remain a simple system.
– In the future, deploy an HSM implementation for Lustre:
  Follow the current CEA/Oracle work on this topic.

Final words
More information: eStoRM, ebHome
Thanks for your attention!