BENCHMARK ON DELL 2950+MD1000
ATLAS Tier2/Tier3 Workshop
Wenjing Wu, AGLT2 / University of Michigan
2008/05/27

DELL MD1000

CURRENT SETUP

2950 HARDWARE EQUIPMENT
Chassis model: PowerEdge 2950
CPUs: quad-core Intel Xeon (CPU Model 15, Stepping 11)
Memory: 16GB DDR2 SDRAM, memory speed 667 MHz
NICs: Broadcom NetXtreme II BCM5708 Gigabit Ethernet; Myricom 10G-PCIE-8A-C
RAID controllers:
  PERC 5/E Adapter (Slot 1, PCI-e x8)
  PERC 5/E Adapter (Slot 2, PCI-e x4)
  PERC 6/E Adapter (Slot 1, PCI-e x8) (extra $700)
  PERC 6/E Adapter (Slot 2, PCI-e x4) (extra $700)
Storage enclosures: 4 x MD1000 (each with 15 SATA-II 750GB disks)

2950 SOFTWARE EQUIPMENT
OS: Scientific Linux CERN SLC release 4.5 (Beryllium)
Kernel version: UL3smp (current: UL5smp)
Version report:
  BIOS version: (current 2.2.6)
  BMC version: 1.33 (current 2.0.5)
  DRAC 5 version: 1.14 (current 1.33)

BENCHMARK TOOL
Benchmark tool: iozone (iozone el4.rf.x86_64)
RAID configuration tool: omconfig (srvadmin-omacore i386)
Soft RAID: mdadm (mdadm x86_64)
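As an illustration of how such an iozone throughput run can be launched (the exact command lines and file paths used for these tests are not shown on the slides; the paths and thread count below are assumptions):

  # Sequential write/read (-i 0 -i 1) and random read/write (-i 2) in throughput
  # mode: 4 threads x 8 GB per file = 32 GB total, twice the 16 GB of RAM, so the
  # page cache does not dominate; -r sets the 512 KB record size and -e includes
  # fsync time in the result.  -F names one test file per thread (paths under
  # /atlas are hypothetical).
  iozone -i 0 -i 1 -i 2 -t 4 -s 8g -r 512k -e \
         -F /atlas/io1 /atlas/io2 /atlas/io3 /atlas/io4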

METRICS OF BENCHMARK
Controller level (both PERC5/PERC6):
  RAID setup (R0, R5, R50, R6, R60)
  Read and write policy (ra, ara, nra, wb, wt, fwb)
  Threshold of both controllers
  Stripe size (8KB, 16KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1024KB); PERC5 supports a maximum 128KB stripe size, PERC6 supports a maximum 1024KB stripe size
Kernel tuning (UL3smp):
  Read-ahead size
  Request queue length
  IO scheduler
File system tuning (xfs):
  inode size
  su/sw size
  internal/external log device

GENERAL PRINCIPLES FOR THE BENCHMARK
Many factors affect the benchmark results; to measure one factor, we fix the other factors at the best values found so far or the values we anticipate to be best.
We need to benchmark different IO patterns (sequential read/write, random read/write, mixed workloads).
In all, we need a benchmark that identifies the best options for our Dell 2950.

CONTROLLER LEVEL
RAID setup (R5, R50, R6, R60)
Read and write policy (ra, ara, nra, wb, wt, fwb)
Threshold of controller (PERC5/PERC6)
Stripe size (8KB, 16KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1024KB); PERC5 supports a maximum 128KB stripe size, PERC6 supports a maximum 1024KB stripe size

PERC5 VS PERC6
System setup: Controller = PERC6/PERC5; PCI slots = PCI-e x4 and x8; RAID = R60/R6/R50; stripe size = 128KB; read = ra; write = wb
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 128; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), record size = 512KB, multiple threads
Measure: PERC5 vs PERC6

READ

WRITE

RAID SETUP
System setup: Controller = PERC5/PERC6; PCI slots = PCI-e x4 and x8; stripe size = 128KB
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 128; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), record size = 512KB, multiple threads
Measure: different RAID setups (R5, R50, R6, R60)

WRITE

SOFT RAID ON PERC5
Soft RAID 0 over 2 x R5: the soft RAID stripe size should be the same as the hardware RAID 5 stripe size (128KB).
Soft RAID 0 over 2 x R50: the soft RAID stripe size should be the same as the hardware RAID 5 stripe size (128KB).
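A minimal sketch of how such a soft RAID 0 could be assembled with mdadm, assuming the two hardware RAID 5 (or RAID 50) virtual disks show up as /dev/sdb and /dev/sdc (the device names are assumptions):

  # Stripe the two hardware virtual disks into one md device; --chunk is in KB
  # and is matched to the 128 KB hardware stripe size, as recommended above.
  mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=128 /dev/sdb /dev/sdc
  mkfs.xfs /dev/md0        # then build the filesystem on the striped device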

WRITE

READ

READ AND WRITE POLICY
System setup: Controller = PERC5; PCI slots = PCI-e x4 and x8; RAID = R50; stripe size = 128KB
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 128; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), different record sizes
Measure: different policies (ra, nra, ara, wb, wt, fwb)

WRITE

READ

PERC5 THRESHOLD
System setup: Controller = PERC5; PCI slot = PCI-e x8; RAID = R0; stripe size = 128KB; read = ra; write = wb
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 128; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), record size = 512KB
Measure: a single controller with different numbers of disks (4-30 disks)

PERC5 THRESHOLD

PERC6 THRESHOLD
System setup: Controller = PERC6; PCI slot = PCI-e x8; RAID = R60; stripe size = 512KB; read = ra; write = wb
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 512; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), record size = 512KB
Measure: a single controller with different numbers of disks (8, 12, 24, 30, 45)

PERC6 THRESHOLD

STRIPE SIZE
System setup: Controller = PERC6; PCI slots = PCI-e x4 and x8; RAID = R60; stripe size = (64, 128, 256, 512, 1024)KB; read = ra; write = wb
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 512; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), record size = 512KB, multiple threads
Measure: different stripe sizes (64, 128, 256, 512, 1024)KB

R60 – STRIPE SIZE

R60 – STRIPE SIZE

KERNEL TUNING
Read-ahead size
Request queue length
IO scheduler

READ AHEAD SIZE
System setup: Controller = PERC5; PCI slots = PCI-e x4 and x8; RAID = R50; stripe size = 128KB; read = ra; write = wb
OS kernel = UL3smp; nr_queue = 128; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), record size = 512KB
Measure: different readAhead sizes
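The read-ahead value quoted in these setups (10240 blocks = 5MB) is a per-block-device setting; one way to apply and verify it (the device name /dev/sdb is an assumption):

  # 10240 x 512-byte sectors = 5 MB of readahead on the RAID block device
  blockdev --setra 10240 /dev/sdb
  blockdev --getra /dev/sdb                      # verify the new value
  # the same knob via sysfs, expressed in KB
  echo 5120 > /sys/block/sdb/queue/read_ahead_kb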

READ

REQUEST QUEUE LENGTH
System setup: Controller = PERC6; PCI slots = PCI-e x4 and x8; RAID = R60; stripe size = 128KB; read = ra; write = wb
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), record size = 512KB, multiple threads
Measure: different request queue lengths
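The request queue length (nr_queue above) corresponds to the block layer's nr_requests parameter; a sketch of how it can be changed, again assuming the RAID volume appears as /dev/sdb:

  # lengthen the request queue from the default 128 to 512 entries
  echo 512 > /sys/block/sdb/queue/nr_requests
  cat /sys/block/sdb/queue/nr_requests           # verify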

READ

WRITE

IO SCHEDULER
System setup: Controller = PERC6; PCI slots = PCI-e x4 and x8; RAID = R50; stripe size = 128KB; read = ra; write = wb
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 512; queue_depth = 128
File system options: su=0, sw=0, isize=256, bsize=4096, log=internal (bsize=4096)
iozone options: filesize = 32GB (RAM size = 16GB), record size = 512KB, multiple threads
Measure: different IO schedulers
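The IO scheduler can be switched per device at run time, or set globally at boot; a sketch, assuming the RAID volume is /dev/sdb:

  # select the deadline elevator for this device; reading the file back shows
  # the active scheduler in brackets
  echo deadline > /sys/block/sdb/queue/scheduler
  cat /sys/block/sdb/queue/scheduler
  # alternatively, set it for all devices with the kernel boot parameter
  #   elevator=deadline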

READ

WRITE

RANDOM READ

FILESYSTEM TUNING
inode size
su/sw size
internal/external log device

FILE SYSTEM
System setup: Controller = PERC5; RAID = R50; PCI slots = PCI-e x4 and x8; stripe size = 128KB
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 128; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, isize=256, bsize=4096
dd options: filesize = 10GB (RAM size = 320MB), record size = 1MB
Measure: internal vs external log device for xfs
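A sketch of creating XFS with the default internal log versus an external log on a separate device (device names, mount point, and log size are assumptions):

  # internal log (default)
  mkfs.xfs -f /dev/sdb1
  # external log on a separate device, e.g. a small partition on another spindle
  mkfs.xfs -f -l logdev=/dev/sdc1,size=128m /dev/sdb1
  mount -o logdev=/dev/sdc1 /dev/sdb1 /data   # the log device must also be given at mount time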

WRITE

READ

XFS INODE SIZE
System setup: Controller = PERC5; RAID = R50; PCI slots = PCI-e x4 and x8; stripe size = 128KB
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 128; queue_depth = 128; IO scheduler = deadline
File system options: su=0, sw=0, bsize=4096, internal log (isize=256, bsize=4096)
dd options: filesize = 10GB (RAM size = 320MB), record size = 1MB
Measure: xfs inode size
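The inode size is fixed at mkfs time; a sketch of formatting with a non-default inode size (device name, mount point, and the value 512 are only illustrative):

  # default XFS inode size is 256 bytes; -i size= selects a larger power of two
  mkfs.xfs -f -i size=512 /dev/sdb1
  mount /dev/sdb1 /data
  xfs_info /data | grep isize        # confirm the inode size actually used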

XFS INODE SIZE

XFS SU/SW SIZE
System setup: Controller = PERC5; RAID = R50; PCI slots = PCI-e x4 and x8; stripe size = 128KB
OS kernel = UL3smp; readAhead size = 10240 blocks = 5MB; nr_queue = 128; queue_depth = 128; IO scheduler = deadline
File system options: isize=256, bsize=4096, internal log (isize=256, bsize=4096)
iozone options: filesize = 10GB (RAM size = 320MB), record size = 1MB
Measure: xfs su/sw size
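su/sw align XFS allocation with the RAID geometry: su is the controller stripe (chunk) size and sw is the number of data-bearing disks in one stripe. A sketch with placeholder values (the device name and the 14-data-disk assumption for a single 15-disk RAID 5 set are not from the slides):

  # su = hardware stripe size, sw = data disks per stripe (15 disks minus 1 parity)
  mkfs.xfs -f -d su=128k,sw=14 /dev/sdb1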

SU/SW SIZE

OUR SETUP NOW
System setup: Controller = PERC6; RAID = R60; PCI slots = PCI-e x4 and x8; stripe size = 512KB
OS kernel = UL5smp
Kernel options: readAhead size = 10240 blocks = 5MB; nr_queue = 512; queue_depth = 128; IO scheduler = deadline
File system options: isize=256, bsize=4096, internal log (isize=256, bsize=4096)

OUR PERFORMANCE NOW
Single read = 670MB/s; aggregate read = 1500MB/s (threads >= 2). Even with 40 concurrent readers it still achieves 1200MB/s.
Single write = 320MB/s; aggregate write = 680MB/s (threads >= 2).
This is not the best single-stream IO: R60 with a 128KB stripe size can achieve 760MB/s single read, and its single write performs almost the same. For a production system, we focus more on the aggregate performance.
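For a rough single-stream check like the numbers above, a plain dd run is often sufficient (the file path is hypothetical); the read figure is only meaningful if the file is well above the RAM size or the page cache is dropped first:

  # single-stream write of 32 GB in 1 MB records; conv=fsync flushes before dd exits
  dd if=/dev/zero of=/atlas/ddtest bs=1M count=32768 conv=fsync
  # drop the page cache (available on 2.6.16+ kernels), then read the file back
  echo 3 > /proc/sys/vm/drop_caches
  dd if=/atlas/ddtest of=/dev/null bs=1M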

ONGOING PROJECT
CITI people at UM are working on disk-to-disk transfer over 10 GbE.
Deliverables:
  Monthly reports on performance tests, server configurations, kernel tuning, and kernel bottlenecks
  Final report on performance tests, server configurations, kernel tuning, and kernel bottlenecks
UltraLight kernel. Deliverables:
  Tuned and tested UltraLight kernel with the full feature set: current 10GbE NIC drivers, current storage drivers, tuning for WAN data movement, Web100 patches, and other patches for performance, security, and stability
  Release document and web page updates for the UltraLight kernel
  Recommended sustainable options for the UltraLight kernel in the near and intermediate term

ONGOING PROJECT (CONT.)
QoS experiments. Deliverable:
  Document throughput performance with and without QoS in the face of competing traffic

MORE INFORMATION
AGLT2 IO benchmark page: stOnRaidSystems
References: ml