Large Scale Test of a storage solution based on an Industry Standard

Presentation transcript:

Large Scale Test of a storage solution based on an Industry Standard
Michael Ernst, Brookhaven National Laboratory
ADC Retreat, Naples, Italy, February 2, 2011

Motivation
Though dCache at BNL supports distributed analysis well (up to 5,000 concurrent analysis jobs), we are looking into ways to improve the usability of our environment, e.g. to include large-scale interactive analysis.
We want to explore to what extent we can build on industrial products for the facility part without having to rely on community extensions and interfaces.
NFS 4.1 (pNFS) is appealing because of its performance, simplicity, and level of integration with the OS.
BlueArc is successfully used by PHENIX and STAR.

Areas of Interest

Why is NFS 4.1 (pNFS) attractive?

BlueArc System Performance

Traditional Network File System
No throughput aggregation
Metadata and data together
BUT
Proven architecture
Enterprise features
Open, standard protocols
Open storage philosophy

BlueArc is about high-performance network storage through standards; we deliver our file services via NFS and CIFS, for example. So, just so we're on the same page, this is a view of a typical, traditional NFS system. The benefits are listed above, and at BlueArc we deliver the fastest systems for mixed workloads in this architecture. But what you don't get here is throughput aggregation to go beyond the fastest nodes we have, and metadata and data typically reside together on the disks, essentially getting in the way of each other as you scale up in bandwidth performance.
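To make the throughput-aggregation point concrete, here is a toy model (a rough sketch; the per-client and server figures are illustrative assumptions, not BlueArc measurements) of why adding clients to a traditional single-head NFS server stops paying off once the head is saturated:

```python
# Toy model: aggregate throughput of a traditional single-head NFS server.
# All figures are illustrative assumptions, not measured BlueArc numbers.

def aggregate_throughput_mb_s(n_clients: int,
                              per_client_demand_mb_s: float = 100.0,
                              server_head_limit_mb_s: float = 1200.0) -> float:
    """Every byte flows through the single server head, so the aggregate
    is capped by the head no matter how many clients are added."""
    return min(n_clients * per_client_demand_mb_s, server_head_limit_mb_s)

for n in (1, 4, 16, 64):
    print(f"{n:3d} clients -> {aggregate_throughput_mb_s(n):7.1f} MB/s aggregate")
# The curve flattens at the head's limit; beyond that point, extra clients
# only add contention, and metadata operations compete for the same disks.
```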

pNFS Architecture
With that said, let's take a look at the pNFS architecture, and I'll also point out some of the areas where BlueArc's implementation goes beyond the minimum defined in the standard. In this picture you have … (walk through the general elements). Essentially pNFS is a protocol standard: it defines the communication between the clients and the metadata server, and between the clients and the data servers (movers). The communication between the metadata server and the data movers is an implementation detail. Some of the items we add to the solution build on our current technology portfolio. With our hardware-accelerated IOPS performance we have a screaming metadata server that will not melt under heavy workloads. We also have the ability to cluster nodes wherever needed; this is a cluster of clusters. The performance profile for the data servers is different, so we will optimize those for maximum bandwidth performance. The architecture is essentially limitless: you scale performance (bandwidth) by adding more data server nodes. Any questions?
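As a rough illustration of the control/data split described above, the sketch below models the pNFS read path: the client asks the metadata server for a layout, then reads the byte ranges directly and in parallel from the data servers. All class and method names are hypothetical, invented for this sketch; they are not NFS 4.1 client internals or BlueArc APIs.

```python
# Minimal sketch of the pNFS control/data split; names are hypothetical.
from concurrent.futures import ThreadPoolExecutor

class MetadataServer:
    """Hands out layouts: which data server holds which byte range."""
    def __init__(self, layouts):
        self.layouts = layouts          # path -> [(data_server, offset, length)]

    def layout_get(self, path):
        return self.layouts[path]

class DataServer:
    """Serves file data directly to clients, bypassing the metadata server."""
    def __init__(self, blocks):
        self.blocks = blocks            # (path, offset) -> bytes

    def read(self, path, offset, length):
        return self.blocks[(path, offset)][:length]

def pnfs_read(mds, path):
    layout = mds.layout_get(path)       # control path: metadata server only
    with ThreadPoolExecutor() as pool:  # data path: direct, parallel reads
        parts = pool.map(lambda seg: seg[0].read(path, seg[1], seg[2]), layout)
        return b"".join(parts)

ds1 = DataServer({("/data/f", 0): b"AAAA"})
ds2 = DataServer({("/data/f", 4): b"BBBB"})
mds = MetadataServer({"/data/f": [(ds1, 0, 4), (ds2, 4, 4)]})
print(pnfs_read(mds, "/data/f"))        # b'AAAABBBB'
```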

BlueArc pNFS-Based NAS Platform
Highly scalable single metadata server
Clustered for HA
Architecture supports multiple clustered metadata servers
Support for heterogeneous data servers
Built on an enterprise-class platform
Reliability
Full-featured NAS: quotas, snapshots, etc.

pNFS for HPC Performance Scaling
Data files automatically spread across multiple data file systems for load balancing and performance
Individual data files optionally striped across multiple data file systems for performance
Extreme flexibility:
A single server can run the entire cluster (acting as both metadata server and data mover)
Can scale performance by adding data movers (and relocating file systems)
Can scale capacity by adding data file systems

(Diagram: clients obtain the data location from the metadata file system, then perform direct I/O against separate data file systems holding File A and File B.)
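A minimal sketch of the striping idea behind this slide: byte offsets of a single file map round-robin onto stripe units on several data file systems, so a large sequential read fans out across all of the data movers. The stripe size and server list are assumed values chosen for illustration, not BlueArc defaults.

```python
# Sketch: mapping one file's byte offsets onto striped data file systems.
# Stripe size and the list of data servers are illustrative assumptions.

STRIPE_SIZE = 1 << 20                           # 1 MiB stripe unit (assumed)
DATA_SERVERS = ["dfs0", "dfs1", "dfs2", "dfs3"]

def locate(offset: int):
    """Return (data file system, offset within its stripe file) for a byte offset."""
    stripe = offset // STRIPE_SIZE
    server = DATA_SERVERS[stripe % len(DATA_SERVERS)]
    local = (stripe // len(DATA_SERVERS)) * STRIPE_SIZE + offset % STRIPE_SIZE
    return server, local

# A 4 MiB sequential read touches every data file system exactly once,
# which is why sequential bandwidth scales with the number of data movers.
for off in range(0, 4 * STRIPE_SIZE, STRIPE_SIZE):
    print(f"file offset {off:>8d} -> {locate(off)}")
```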

Product Availability and Test Bed at BNL
BlueArc's pNFS: mid-2011
Up to 40 data movers with up to 8 PB capacity each, growing to 128 data movers
Mercury: high-performance metadata server
BlueArc Mercury or BlueArc Linux-based data mover appliances

Performance goals for the test bed at BNL:
Will become a DDM endpoint with SRM, GridFTP, and Xrootd
Up to 2,000 concurrent analysis jobs
Up to 4 GB/s throughput between the storage system and clients
Aiming at having ~500 TB of disk space
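A back-of-the-envelope check of these goals (the throughput, job count, and capacity figures are from the slide; the number of data movers used for the split is an assumption): 4 GB/s shared by 2,000 concurrent jobs comes to roughly 2 MB/s per job.

```python
# Back-of-the-envelope check of the test bed goals; the throughput, job
# count, and capacity come from the slide, the per-mover split is assumed.

target_throughput_gb_s = 4.0        # aggregate throughput to clients
concurrent_jobs = 2000              # concurrent analysis jobs
total_capacity_tb = 500             # planned disk space
data_movers = 5                     # assumed initial number of data movers

per_job_mb_s = target_throughput_gb_s * 1024 / concurrent_jobs
per_mover_gb_s = target_throughput_gb_s / data_movers
per_mover_tb = total_capacity_tb / data_movers

print(f"~{per_job_mb_s:.1f} MB/s per analysis job")   # ~2.0 MB/s
print(f"~{per_mover_gb_s:.1f} GB/s per data mover")   # ~0.8 GB/s
print(f"~{per_mover_tb:.0f} TB per data mover")       # ~100 TB
```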

Test Bed at BNL
(Diagram: the storage test bed exposed through SRM, GridFTP, and Xrootd services, with GridFTP connections to the wide-area network.)