Michael Doherty RAL UK e-Science AHM 2-4 September 2003 SRB in Action.

Slides:



Advertisements
Similar presentations
National University Community Research Institute (NUCRI) NU Community Research Institute (NUCRI) HASTAC (higher education)/HASS grid National School Board.
Advertisements

National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids for Collection Federation Reagan W. Moore University.
Jens G Jensen Atlas Petabyte store Supporting Multiple Interfaces to Mass Storage Providing Tape and Mass Storage to Diverse Scientific Communities.
The Storage Resource Broker and.
The Storage Resource Broker and.
Peter Berrisford RAL – Data Management Group SRB Services.
Welcome to Middleware Joseph Amrithraj
Data Grid: Storage Resource Broker Mike Smorul. SRB Overview Developed at San Diego Supercomputing Center. Provides the abstraction mechanisms needed.
NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE SAN DIEGO SUPERCOMPUTER CENTER Particle Physics Data Grid PPDG Data Handling System Reagan.
San Diego Supercomputer CenterNational Partnership for Advanced Computational Infrastructure1 Grid Based Solutions for Distributed Data Management Reagan.
VL-e PoC Introduction Maurice Bouwhuis VL-e work shop, April 7 th, 2006.
Chronopolis: Preserving Our Digital Heritage David Minor UC San Diego San Diego Supercomputer Center.
OxGrid, A Campus Grid for the University of Oxford Dr. David Wallom.
Applying Data Grids to Support Distributed Data Management Storage Resource Broker Reagan W. Moore Ian Fisk Bing Zhu University of California, San Diego.
Jean-Yves Nief, CC-IN2P3 Wilko Kroeger, SCCS/SLAC Adil Hasan, CCLRC/RAL HEPiX, SLAC October 11th – 13th, 2005 BaBar data distribution using the Storage.
UMIACS PAWN, LPE, and GRASP data grids Mike Smorul.
Data Grid: GRASP Mike Smorul. Grid Retrieval and Search Platform Based on concepts developed in the Earth Science Data Interface (ESDI) developed at the.
Magda – Manager for grid-based data Wensheng Deng Physics Applications Software group Brookhaven National Laboratory.
Microsoft Load Balancing and Clustering. Outline Introduction Load balancing Clustering.
QCDgrid Technology James Perry, George Beckett, Lorna Smith EPCC, The University Of Edinburgh.
Chapter 8 Implementing Disaster Recovery and High Availability Hands-On Virtual Computing.
Jan Storage Resource Broker Managing Distributed Data in a Grid A discussion of a paper published by a group of researchers at the San Diego Supercomputer.
Rule-Based Data Management Systems Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar {moore, schroede, mwan, {moore, schroede, mwan,
ESP workshop, Sept 2003 the Earth System Grid data portal presented by Luca Cinquini (NCAR/SCD/VETS) Acknowledgments: ESG.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Data Grid Services/SRB/SRM & Practical Hai-Ning Wu Academia Sinica Grid Computing.
3rd June 2004 CDF Grid SAM:Metadata and Middleware Components Mòrag Burgon-Lyon University of Glasgow.
QCDGrid Progress James Perry, Andrew Jackson, Stephen Booth, Lorna Smith EPCC, The University Of Edinburgh.
National Center for Supercomputing Applications NCSA OPIE Presentation November 2000.
12th November 2003LHCb Software Week1 UK Computing Glenn Patrick Rutherford Appleton Laboratory.
Grid tool integration within the eMinerals project Mark Calleja.
DATABASE MANAGEMENT SYSTEMS IN DATA INTENSIVE ENVIRONMENNTS Leon Guzenda Chief Technology Officer.
Production Data Grids SRB - iRODS Storage Resource Broker Reagan W. Moore
1 CNAP 22nd March 2004 Summary of Atlas Petabyte Data Store User Group Meeting March 4 th 2004.
Mainframe (Host) - Communications - User Interface - Business Logic - DBMS - Operating System - Storage (DB Files) Terminal (Display/Keyboard) Terminal.
Introduction to dCache Zhenping (Jane) Liu ATLAS Computing Facility, Physics Department Brookhaven National Lab 09/12 – 09/13, 2005 USATLAS Tier-1 & Tier-2.
Status of the LHCb MC production system Andrei Tsaregorodtsev, CPPM, Marseille DataGRID France workshop, Marseille, 24 September 2002.
GridPP Deployment & Operations GridPP has built a Computing Grid of more than 5,000 CPUs, with equipment based at many of the particle physics centres.
WP8 Meeting Glenn Patrick1 LHCb Grid Activities in UK Grid WP8 Meeting, 16th November 2000 Glenn Patrick (RAL)
Kurt Mueller San Diego Supercomputer Center NPACI HotPage Updates.
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Persistent Management of Distributed Data Reagan W. Moore.
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Persistent Archive for the NSDL Reagan W. Moore Charlie Cowart.
Building the e-Minerals Minigrid Rik Tyer, Lisa Blanshard, Kerstin Kleese (Data Management Group) Rob Allan, Andrew Richards (Grid Technology Group)
Grid User Interface for ATLAS & LHCb A more recent UK mini production used input data stored on RAL’s tape server, the requirements in JDL and the IC Resource.
The Global Land Cover Facility is sponsored by NASA and the University of Maryland.The GLCF is a founding member of the Federation of Earth Science Information.
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
UK Grid Meeting Glenn Patrick1 LHCb Grid Activities in UK Grid Prototype and Globus Technical Meeting QMW, 22nd November 2000 Glenn Patrick (RAL)
1 e-Science AHM st Aug – 3 rd Sept 2004 Nottingham Distributed Storage management using SRB on UK National Grid Service Manandhar A, Haines K,
Introduction to The Storage Resource.
E-Curator: A Web-based Curatorial Tool Ian Brown, Mona Hess Sally MacDonald, Francesca Millar Yean-Hoon Ong, Stuart Robson Graeme Were UCL Museums & Collections.
RHIC/US ATLAS Tier 1 Computing Facility Site Report Christopher Hollowell Physics Department Brookhaven National Laboratory HEPiX Upton,
Super Computing 2000 DOE SCIENCE ON THE GRID Storage Resource Management For the Earth Science Grid Scientific Data Management Research Group NERSC, LBNL.
SDSC Storage Resource Broker & Meta-data Catalog SRB Archives HPSS, ADSM, UniTree, DMF Databases DB2, Oracle, Sybase File Systems Unix, NT, Mac OSX Application.
AHM04: Sep 2004 Nottingham CCLRC e-Science Centre eMinerals: Environment from the Molecular Level Managing simulation data Lisa Blanshard e- Science Data.
Rights Management for Shared Collections Storage Resource Broker Reagan W. Moore
1 A Scalable Distributed Data Management System for ATLAS David Cameron CERN CHEP 2006 Mumbai, India.
The Storage Resource Broker and.
1 GridFTP and SRB Guy Warner Training, Outreach and Education Team, Edinburgh e-Science.
BNL dCache Status and Plan CHEP07: September 2-7, 2007 Zhenping (Jane) Liu for the BNL RACF Storage Group.
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Collection-Based Persistent Archives Arcot Rajasekar, Richard Marciano, Reagan Moore San Diego Supercomputer Center Presented by: Preetham A Gowda.
The UK National Grid Service Andrew Richards – CCLRC, RAL.
Preservation Data Services Persistent Archive Research Group Reagan W. Moore October 1, 2003.
Data Infrastructure in the TeraGrid Chris Jordan Campus Champions Presentation May 6, 2009.
1 eScience Grid Environments th May 2004 NESC - Edinburgh Deployment of Storage Resource Broker at CCLRC for E-science Projects Ananta Manandhar.
Building Preservation Environments from Federated Data Grids Reagan W. Moore San Diego Supercomputer Center Storage.
Compute and Storage For the Farm at Jlab
Data services on the NGS
Grid Portal Services IeSE (the Integrated e-Science Environment)
Arcot Rajasekar Michael Wan Reagan Moore (sekar, mwan,
Presentation transcript:

Michael Doherty RAL UK e-Science AHM 2-4 September 2003 SRB in Action

Michael Doherty RAL Introduction What is SRB CCLRC and SRB Using SRB in CMS Using SRB in the e-Minerals Mini-Grid Future Projects Questions

Michael Doherty RAL Managing Data Historically data has been STORED rather than MANAGED Problems arising from this include –Scaling –Distribution –Access Control, Authentication, Security –Data Migration –Data Curation

Michael Doherty RAL What is SRB? Storage Resource Broker (SRB) is a software product developed by the San Diego Supercomputing Centre (SDSC). Allows users to access files and database objects across a distributed environment. Actual physical location and way the data is stored is abstracted from the user Allows the user to add user defined metadata describing the scientific content of the information

Michael Doherty RAL How SRB Works MCAT Database MCAT Server SRB A Server SRB B Server SRB Client a b cd e f g 4 major components: –The Metadata Catalogue(MCAT) –The MCAT SRB Server –The SRB Server –The SRB Client

Michael Doherty RAL The MCAT Database MCAT database is a metadata repository that provides a mechanism for storing information used by the SRB system. Includes both –Internal system data required for running the system –Application (user) metadata regarding data sets being brokered by SRB.

Michael Doherty RAL The MCAT Server At least one SRB Server must be installed on the node that can access the MCAT database. This is known as the MCAT SRB Server. MCAT SRB Server works directly against the MCAT database to provide SRB Services All other SRB Servers interact through the MCAT

Michael Doherty RAL The SRB Server The SRB Server is a middleware application that accepts requests from clients and obtains/queries/manages the necessary data sets. It queries the MCAT SRB Server to gather information on datasets and supplies this back to the SRB client.

Michael Doherty RAL SRB Clients Provides a user interface to send requests to the SRB server. 4 main implementations of this: –Command line –MS Windows (InQ) –Web based (MySRB). –Java Web Services (MATRIX) in beta –Precursor to GRID Services

Michael Doherty RAL Concepts Location: A physical node running an SRB Server Physical Resource: A storage area managed by an SRB Server Logical Resource: One or more Physical Resources – can be distributed Collection – Data abstraction of resources

Michael Doherty RAL SRB in Detail SRB Archives HPSS, ADSM, UniTree, DMF Databases DB2, Oracle, PostgreSQL File Systems Unix, NT, Mac OSX Application C, C++, Linux I/O Unix Shell Dublin Core Resource, User Defined Application Meta-data Remote Proxies DataCutter Third-party copy Java, NT Browsers Web Prolog Python MCAT HRM

Michael Doherty RAL Administration Users/Locations/Resources must be managed Two methods for doing this –Java MCAT Admin Tool –Command line tools

Michael Doherty RAL CCLRC and SRB The Data Management Group in CCLRC started working with SRB in November 2002 after a fact finding mission to the USA. There was an immediate requirement for a storage based product that allowed the addition of searchable metadata Generated lots of internal interest, which led to a number of projects with SRB

Michael Doherty RAL SRB Example: CMS Largest project using CCLRC SRB services at present is the CERN CMS experiment. SRB Chosen for ‘Pre-Challenge Production’, producing data for Data Challenge 2003/2004 (DC03/DC04) Need to prove data can be transferred, replicated and stored at LHC rates Need to coordinate data in the DC

Michael Doherty RAL CMS: Why SRB Not originally planned to SRB for DC –Planned RLS based system Nothing available CMS-wide on the timescale (end of year) for DC2003 Needed stable and supported product for 6 months continuously

Michael Doherty RAL Architecture MCAT DBADS MCAT ServerCSF Server CCLRC CERN FNAL BRISTOL ~13 sites

Michael Doherty RAL MCAT: CCLRC Database Service MCAT requires professionally run database Two IBM x440 clusters, one based at Daresbury Laboratory and the other at Rutherford Appleton Laboratory. The clusters connect to their own 1TB RAID 5 storage arrays via a independent fibre channel Storage Area Networks (SAN). Run Oracle Real Application Clusters software on Redhat Advanced Server for high availability/scalability RDBMS CMS MCAT hosted by 2 nodes Can load balance

Michael Doherty RAL Atlas Datastore (ADS) The Datastore currently has an on-line, nominal capacity of 1 Petabyte Based on an STK 9930 (PowderHorn) tape robot. The ADS team recently migrated from IBM 3590 (10GB) tape drives to STK 9940B (200GB) drives. CCLRC have written a custom SRB driver for this for CMS

Michael Doherty RAL ADS Driver for SRB Implemented Storage System Driver Implement (most) of the 16 standard calls that implement the driver layer such as copy, move, delete and create. Some functions have no equivalent in ADS

Michael Doherty RAL CMS Results (So far) How long does it take to register and replicate 500GB between to widely separated locations? –Tested 200GB which took approximately 6 hours, almost completely network limited How long does it take to register and replicate 50k 10kB files? –SRB has a bulk file registration mode which they have clocked at 400 files per second. Registered and replicated 1000 files in a few seconds What is the maximum sustainable transfer rate out of a single server and what is the maximum rate a server can accept data from three servers? –About 80-90% of network speed for 5 streams x number of servers How many files can be registered in a day? –No inherent limits How many parallel streams can a server accept? –Unknown, very small load on CPU with 10 streams

Michael Doherty RAL SRB Example: e-Minerals [Environment from the Molecular Level] UK e-science project for modelling the atomistic processes involved in environmental issues

Michael Doherty RAL e-Minerals Requirements Data Management Requirements –Scientists want to store input and output files from simulations in different locations –manage their own files/data via the web –give access to other project members –give temporary access to others

Michael Doherty RAL Architecture Daresbury App Server Cambridge SRB Resource Reading SRB Resource Bath SRB Resource Eminerals MiniGrid Daresbury Database server MCAT SRB Server Oracle Client MySRB Web Browser Application server runs SRB software Database server holds locations of files

Michael Doherty RAL Future Projects E-Materials Project: A mini grid similar to E- minerals above. National Crystallographic Service (Southampton University): For both data management and storage in the ADS. British Atmospheric Data Centre (BADC) archive. For both data management and storage in the ADS. NERC Data Grid. Details to be confirmed.

Michael Doherty RAL Future Directions? 3 major areas that CCLRC are commencing work with in collaboration with SDSC. –Web/Grid Services –High Performance Testing –Database Advice –Training

Michael Doherty RAL Summary Have established links with SRB community and SDSC Implemented SRB on projects Set up test systems for new projects Can help community with –SRB Test Systems –SRB Productions Systems –SRB Local Support Will contribute to future versions

Michael Doherty RAL Questions ?