KIT – The cooperation of Forschungszentrum Karlsruhe GmbH and Universität Karlsruhe (TH)
Hadoop on HEPiX storage test bed at FZK
Artem Trunov

Presentation transcript:

KIT – The cooperation of Forschungszentrum Karlsruhe GmbH and Universität Karlsruhe (TH)
Hadoop on HEPiX storage test bed at FZK
Artem Trunov, Karlsruhe Institute of Technology, Karlsruhe, Germany
Steinbuch Center for Computing, department IDA

Motivation
- Hadoop is a distributed file system with a map/reduce framework, designed to run on commodity clusters and make use of their local hard drives
- Has potential for high parallel performance, scalable with the number of nodes
- Extremely fault tolerant against loss of cluster nodes
- Already in use in production at a number of OSG sites
- Packaged in OSG and supported; reference installation at FNAL
- OSG Hadoop is packaged into rpms for SL4 and SL5 by Caltech
- BeStMan and gridftp backends
- There is a native ROOT I/O plugin for Hadoop; it exists as a patch, by Brian Bockelman
- The HEPiX storage working group has a test bed and reference data for performance comparison

Test bed at FZK and previous tests

Server:
- 8 cores, 2.00 GHz; RAM: 16 GB
- FC: 2 x Qlogic 2462 dual 4 Gb
- Network: quad GE card in ALB bonding (mode=6, miimon=100); measured 450 MB/sec memory-to-memory in both directions
- OS: SuSE SLES10 SP2, 64 bit; Kernel: _lustre smp
- Disk system: DDN 9550, direct attached with 2 x 4 Gb FC; 4 LUNs, each composed of two tiers; 16 data disks per LUN; LUN block size: 4 MB; cache: disabled

Clients:
- 10 x 8 cores, 2.66 GHz, 16 GB RAM
- OS: RHEL4, 64 bit; Kernel: ELsmp, non-modified + Lustre modules

Test jobs:
- CMSSW 1_6_6
- 2, 4, 6, 8 jobs per node on 10 nodes
- 40 minute runs

[Diagram: test bed layout. Admin nodes (dCache, AFS) and 10 worker nodes connect at 1 Gb to a rack switch with a 10 Gb uplink to the GridKa fabric; the IBM 8-core Xeon server connects to the switch over 4x1 Gb and to the DDN tiers (8 data disks each) over 2x4 Gb FC; 16 internal disks with RAID5 or RAID6.]

Best results with DDN disks (basic setup of tests with DDN, Andrei Maslennikov):
|                 | 20 threads | 40 threads | 60 threads | 80 threads |
| AFS/XFS native  |  73 MB/sec | 116 MB/sec | 116 MB/sec | 113 MB/sec |
| GPFS            | 171 MB/sec | 307 MB/sec | 398 MB/sec | 439 MB/sec |
| AFS/LU via vice | 134 MB/sec | 251 MB/sec | 341 MB/sec | 394 MB/sec |
| Lustre native   | 146 MB/sec | 256 MB/sec | 358 MB/sec | 399 MB/sec |
(Total event counts were also recorded for each configuration.)
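The quad-GE bond on the server deserves a concrete illustration. The lines below are a minimal sketch, not taken from the test bed, of how a balance-alb bond with the quoted mode=6 / miimon=100 settings can be brought up by hand on a Linux host; the interface names and the address are placeholders, and on the SLES10 server this would normally be done through the distribution's network configuration files instead.

```
# Sketch only: quad-GE bond in balance-alb mode, as quoted on the slide
# (mode=6, miimon=100). Interface names and the IP address are placeholders.
modprobe bonding mode=6 miimon=100    # adaptive load balancing, MII check every 100 ms
ip link set bond0 up
ip addr add 192.0.2.10/24 dev bond0   # placeholder address
for nic in eth0 eth1 eth2 eth3; do
    ip link set "$nic" down           # slaves should carry no configuration of their own
    ifenslave bond0 "$nic"            # add each GE port to the bond
done
```

With mode=6 (balance-alb) both transmit and receive traffic are spread over the slaves without special switch support, which is consistent with the roughly 450 MB/sec memory-to-memory figure measured in both directions.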

Testbed for Hadoop
- External servers and storage are not used; the worker nodes' two internal 250 GB SATA hard drives are used instead
- On each drive a ~200 GB partition is allocated and formatted with ext3
- Total of ~4 TB of free space

Hadoop setup:
- Version 0.19, SL4 x86_64 rpms from the Caltech repo
- 10 datanodes + 1 namenode
- Test jobs were run on 9 data nodes and the name node
- FUSE interface to HDFS, mounted on each node
- Slight complication: because Hadoop is highly sensitive to hard drive performance, one data node had to be rejected and one of the admin nodes used as a data node instead; this had little impact on the test result

Hadoop settings:
- block size 64 MB
- replication factor 1
- Java heap size 512 MB

FUSE settings: 'ro,rdbuffer=65536,allow_other'

Network settings (pretty standard):
- net.core.netdev_max_backlog =
- net.core.rmem_max =
- net.core.wmem_max =
- net.ipv4.tcp_rmem =
- net.ipv4.tcp_wmem =

Block device settings:
- echo > /sys/block/sd${dr}/queue/max_sectors_kb
- echo > /sys/block/sd${dr}/queue/read_ahead_kb
- echo 32 > /sys/block/sd${dr}/queue/iosched/quantum
- echo 128 > /sys/block/sd${dr}/queue/nr_requests

Varied during the tests:
- block device settings: read_ahead_kb from 1 to 32 MB, nr_requests from 32 to 512
- FUSE read-ahead buffer (rdbuffer): from 16k to 256k

The optimum was found at the following (see the sketch after this slide for how these settings can be applied on a data node):
- read_ahead_kb: 16 MB
- nr_requests: 128
- FUSE rdbuffer: 128k

Measured:
- total event count for the 40 minute test jobs
- read rate from disk

[Diagram: same test bed layout as on the previous slide; the Hadoop tests use only the worker nodes' internal disks.]
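As a concrete reference for the tuning above, here is a minimal sketch, with placeholder host names, mount point and device list, of how the stated optimum block device settings and the read-only FUSE mount could be applied on a data node. The FUSE helper is assumed to be the hadoop-fuse-dfs wrapper shipped with the Caltech/OSG rpms; depending on the packaging the binary may instead be called fuse_dfs.

```
#!/bin/bash
# Sketch only: apply the block device settings reported as optimal on the slide
# and mount HDFS read-only through FUSE. Namenode host/port, mount point and
# the list of data drives are hypothetical placeholders.

NAMENODE=namenode.example.org    # placeholder namenode host
PORT=9000                        # placeholder namenode port
MNT=/mnt/hadoop                  # placeholder mount point

for dr in b c; do                # placeholder: the two internal SATA drives
    echo 16384 > /sys/block/sd${dr}/queue/read_ahead_kb     # 16 MB read-ahead (stated optimum)
    echo 128   > /sys/block/sd${dr}/queue/nr_requests       # stated optimum
    echo 32    > /sys/block/sd${dr}/queue/iosched/quantum   # as on the slide
done

mkdir -p "$MNT"
# 128 kB FUSE read-ahead buffer (stated optimum), read-only, world-readable mount
hadoop-fuse-dfs dfs://${NAMENODE}:${PORT} "$MNT" -o ro,rdbuffer=131072,allow_other
```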

Best results
|                     | 20 threads | 40 threads | 60 threads | 80 threads |
| Hadoop              | 116 MB/sec | 218 MB/sec | 270 MB/sec | 369 MB/sec |
| Lustre on DDN disks | 146 MB/sec | 256 MB/sec | 358 MB/sec | 399 MB/sec |
(Total event counts were also recorded for each configuration.)

Discussion
- Hadoop in this test bed comes close to Lustre and outperforms it in the maximum-load test
- 8 jobs on an 8-core machine is a standard batch setup
- Some other considerations are also taken into account when selecting storage: cost of administration, ease of deployment, capacity scaling, support for large name spaces
- It is hard to see it as a main HEP Tier-1 storage solution; at the least it needs a lot of additional testing and careful deployment
- As Tier-2/Tier-3 storage it should be very interesting to WLCG sites
- The cost and maintenance factors are very favorable for small sites

Future plans
- HEPiX test bed at FZK: move to a dedicated rack, RH5
- Hadoop 0.20 or 0.21, all 64 bit
- Newer CMS and ATLAS software
- Check performance with replication factor > 1
- Check with various chunk sizes (the sketch below shows how both parameters can be varied)
- Test on a high-end storage system?
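For the planned replication and chunk-size scans, the commands below sketch how both parameters can be varied without reinstalling the cluster; the path and the numeric values are illustrative choices, not figures from the presentation.

```
# Illustrative only: /hepix/data and the values below are placeholders.

# Re-replicate an existing directory tree with replication factor 2
hadoop fs -setrep -R 2 /hepix/data

# Write new files with a 128 MB block ("chunk") size instead of the 64 MB used so far
hadoop fs -D dfs.block.size=134217728 -put local_file /hepix/data/
```

Cluster-wide defaults for dfs.replication and dfs.block.size can likewise be changed in the Hadoop configuration (hadoop-site.xml in 0.19, hdfs-site.xml from 0.20 on), which may be the simpler route for whole-test-bed runs.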

Acknowledgments
- Andrei Maslennikov: test suite setup, valuable guidance and comments
- Brian Bockelman, Terrence Martin (OSG): Hadoop wiki, tuning tips
- FZK team: Jos van Wezel, Manfred Alef, Bruno Hoeft, Marco Stanossek, Bernhard Verstege