Performance Testing of DDN WOS Boxes
Shaun de Witt, Roger Downing
Future of Big Data Workshop, June 27th 2013

Topics
– A bit of background on WOS
– Testing with iRODS
– Testing with a compute cluster

WOS Boxes
WOS in a nutshell (and why STFC are interested):
– DataDirect Networks Web Object Scaler
– Policy-based replication between boxes at remote sites
– Data access through standard protocols
– Fast data access times
– Easy integration with GPFS and iRODS
Loan DDN WOS boxes provided by OCF
– STFC are very grateful for this opportunity to test the hardware

Use With iRODS
STFC runs iRODS as part of the EUDAT project
Reasons for possible use of WOS within EUDAT:
– High-speed online storage
– Replicated data between the STFC sites at RAL and DL
iRODS setup:
– Used iRODS 3.2
– RHEL 6.2 VM with a single 2.4 GHz CPU and 1 GB RAM (definitely sub-optimal!)
– Data loaded to a cache and then sync'd to WOS

iRODS Testing (1)
Tested object storage rate using scripts developed as part of EUDAT
Sequentially add small files in groups of 5k
– Looking for evidence of performance degradation with capacity or number of inodes used
Test ran for ~1 week
– ~2.26 million objects inserted
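The EUDAT scripts themselves are not reproduced in the slides; the sketch below only illustrates the shape of such a timing loop, assuming the standard iRODS `iput` icommand is available and authenticated (`iinit`) and using a hypothetical target collection name.

```python
#!/usr/bin/env python3
# Minimal sketch of the small-file ingest test, NOT the actual EUDAT scripts.
# Assumes the iRODS icommand 'iput' is on PATH and the client is already
# authenticated; the target collection below is hypothetical.
import os
import subprocess
import tempfile
import time

GROUP_SIZE = 5000                                # files per timed group
COLLECTION = "/tempZone/home/rods/wos_test"      # hypothetical collection

def write_group(group_index):
    """Upload one group of small files; return the elapsed time in seconds."""
    payload = os.urandom(1024)                   # ~1 KB dummy payload per object
    start = time.time()
    for i in range(GROUP_SIZE):
        with tempfile.NamedTemporaryFile(delete=False) as f:
            f.write(payload)
            local_path = f.name
        target = "%s/g%06d_f%06d" % (COLLECTION, group_index, i)
        subprocess.check_call(["iput", local_path, target])
        os.unlink(local_path)
    return time.time() - start

if __name__ == "__main__":
    group = 0
    while True:                                  # the real test ran for ~1 week
        elapsed = write_group(group)
        print("group %d: %d files in %.0f s (%.1f files/s)"
              % (group, GROUP_SIZE, elapsed, GROUP_SIZE / elapsed))
        group += 1
```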

[Chart: time to write 5k files (secs) vs. number of objects]

Prelim Analysis
Quite flat upload times
– If anything, rates increase as the number of entries increases!
Long-period humps associated with Oracle automated optimisation
Short-period humps associated with Oracle backups
Observed rate on writing files (single client) ~5 files/sec
– Almost certainly due to bottlenecks associated with the VM

Other iRODS Tests
Tried to perform massively parallel tests of WOS through iRODS but frustrated again by VM limitations
– Stressed the VM more than WOS!
Tested WOS directly using the WOS native C/C++ API
– Parallel writes of 1 KB files with >30 threads completed
– Object creation rate observed in WOS ~10 objects/sec
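The WOS C/C++ API calls themselves are not shown in the slides; the sketch below only illustrates the many-threads, small-object write pattern, with `put_object` as a hypothetical stand-in for the real WOS put call.

```python
#!/usr/bin/env python3
# Sketch of the parallel small-object write pattern. The real test used the
# WOS native C/C++ API; put_object() is a hypothetical placeholder, not WOS API.
import os
import threading
import time

NUM_THREADS = 32            # ">30 threads" as in the test
OBJECT_SIZE = 1024          # 1 KB payloads
OBJECTS_PER_THREAD = 100

def put_object(payload):
    """Hypothetical stand-in for a WOS put call; replace with the real API."""
    time.sleep(0.01)        # simulate a store operation

def worker(counter, lock):
    payload = os.urandom(OBJECT_SIZE)
    for _ in range(OBJECTS_PER_THREAD):
        put_object(payload)
        with lock:
            counter[0] += 1

if __name__ == "__main__":
    counter, lock = [0], threading.Lock()
    threads = [threading.Thread(target=worker, args=(counter, lock))
               for _ in range(NUM_THREADS)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.time() - start
    print("%d objects in %.1f s -> %.1f objects/s"
          % (counter[0], elapsed, counter[0] / elapsed))
```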

Scale-out Tests
Run on a production compute cluster attached to WOS via a 1 Gb link
Caveats of results:
– Typically multiple jobs run on a node (1 job/core)
– Scheduling done via the batch queue system
Jobs did simple read/write/delete of 1 GB files directly against WOS using wget (also tested using curl, but no significant difference)
Number of jobs up to 100
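A rough sketch of what one such cluster job might look like is below: write, read back and delete a 1 GB object over HTTP, timing each phase. The slides used wget/curl against the WOS HTTP interface; the endpoint URL and verbs here are assumptions for illustration, not the documented WOS REST API.

```python
#!/usr/bin/env python3
# Sketch of one scale-out job: PUT, GET and DELETE a 1 GB object over HTTP.
# The URL and verbs are hypothetical; the real tests drove the WOS HTTP
# interface with wget/curl.
import os
import time
import urllib.request

ENDPOINT = "http://wos-box.example/objects/job-%d" % os.getpid()  # hypothetical
SIZE = 1024 ** 3                       # 1 GB payload (held in memory here)

def put():
    data = os.urandom(SIZE)
    req = urllib.request.Request(ENDPOINT, data=data, method="PUT")
    urllib.request.urlopen(req).read()

def get():
    with urllib.request.urlopen(ENDPOINT) as resp:
        while resp.read(1 << 20):      # stream the object in 1 MB chunks
            pass

def delete():
    req = urllib.request.Request(ENDPOINT, method="DELETE")
    urllib.request.urlopen(req).read()

if __name__ == "__main__":
    for name, op in (("write", put), ("read", get), ("delete", delete)):
        start = time.time()
        op()
        elapsed = time.time() - start
        if name == "delete":
            print("delete: %.2f s" % elapsed)
        else:
            print("%s: %.1f MB/s" % (name, SIZE / elapsed / 1e6))
```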

Preliminary Analysis
Max results just show the line speed into the box
– Not a big surprise!
Mean and min rates show:
– Pretty good read/write balance
– Performance does drop off, but at a lower rate than expected
  - ~146 KB/sec per client for reads
  - ~300 KB/sec per client for writes
BUT remember that in these tests not all clients may be concurrently reading/writing

Modified Test
Reconfigured the tests so that all transfers happen concurrently
– Initial test showed a severe performance limitation
  - Connection exhaustion at 10 concurrent connections
– But rapid response from DDN to provide a firmware update
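The slides do not say how the concurrent start was enforced; one minimal way to do it is sketched below, using a barrier so every client begins its transfer at the same moment. The URL is hypothetical and `wget` stands in for the actual transfer command.

```python
#!/usr/bin/env python3
# Sketch of forcing all transfers to start concurrently (mechanism assumed,
# not taken from the slides). A barrier releases every worker at once.
import subprocess
import threading
import time

NUM_CLIENTS = 100
barrier = threading.Barrier(NUM_CLIENTS)

def transfer(url):
    # Fetch the object with wget, discarding the output.
    subprocess.check_call(["wget", "-q", "-O", "/dev/null", url])

def client(client_id):
    url = "http://wos-box.example/objects/test-%d" % client_id   # hypothetical
    barrier.wait()             # all clients block here, then start together
    start = time.time()
    transfer(url)
    print("client %d finished in %.1f s" % (client_id, time.time() - start))

if __name__ == "__main__":
    threads = [threading.Thread(target=client, args=(i,))
               for i in range(NUM_CLIENTS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```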

Preliminary Analysis
Peak rates for read and write fall off fairly predictably with the number of concurrent connections
What is very good is that the aggregate rate being delivered is almost constant with the number of clients
– Up to the limit shown
Would have been interesting to show performance under combined read/write conditions
– But is this really what WOS was designed for?

Summary
The WOS box lives up to its claims!
– Highly scalable data ingest and delivery
– Replication policies are easy to set up and work well
  - May be possible to set up an academic WOS network if we can find a suitable project
– Predictable performance under 'real life' operation
– Use of common-language APIs and support for web-based reading/writing through a RESTful interface provides good integration potential
– iRODS driver works out of the box
  - GPFS support not tested

Thanks
Thanks to:
– Georgina Ellis (OCF) for helping arrange the loan
– Glenn Wright (DDN) for support during testing
– Edlin Browne (DDN) for coming up with this idea and helping arrange the loan
  - And paying for dinner tonight!
– And to all of you!
Questions?
– I guess some of which can be answered by DDN!