
Summary of 1 TB Milestone
RD Schaffer
Atlas database meeting, 17 March 1999

Outline:
- Goals and assumptions of the exercise
- The hardware constraints
- The basic model
- What have we understood so far
- Where do we go from here
- Summary

Thanks to those who contributed to the 1 TB test

I would like to thank those people who contributed to the successful completion of the 1 TB milestone:
- Martin Schaller and Rui Silva Carapinha from Atlas
- Gordon Lee, Alessandro Miotto, and Harry Renshall from the IT/PDP group
- Dirk Duellmann and Marcin Nowak from the IT/ASD group

Basic goals of the 1 TB test

The primary goals:
- Write 1 TB of simulated raw data (jet production digits) to Objy databases stored in HPSS
  - Demonstrate the feasibility of the different elements with a first approximation of a model for Atlas raw data
  - Understand the performance of the different elements: basic hardware configuration, raw data object model

Learning from this: develop a system capable of easily loading Objy databases with Zebra data at a few MB/sec.

Globally, what has been achieved

We have written 1 TB of jet production data into Objy databases stored in HPSS.

Overall performance for the 1 TB:
- 5 ibm/aix Objy clients writing to 2 sun/solaris Objy servers
- Typical aggregate write speed: ~1.5 MB/sec with HPSS staging-out, ~3 MB/sec without HPSS staging-out
- Operational efficiency over the Christmas break: ~50%, i.e. ~19 days to write 1 TB

The observed performance has not yet been fully understood.
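A rough consistency check of the ~19 days figure (assuming 1 TB ≈ 10^12 bytes and treating the ~1.5 MB/sec and ~50% values as approximate):

$$ \frac{10^{12}\ \mathrm{B}}{1.5\times10^{6}\ \mathrm{B/s}\ \times\ 0.5} \approx 1.3\times10^{6}\ \mathrm{s} \approx 15\ \text{days}, $$

which is of the same order as the ~19 days observed.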

The hardware configuration

The basic hardware configuration was a client-server model. [Diagram: Zebra file stager and Zebra-to-Objy formatter clients (IBM/AIX); 2 Objy servers running AMS/HPSS (Sun/Solaris); HPSS server with IBM/AIX tape server; DEC, 100 GB disk.]
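To make the data path concrete, here is a minimal, self-contained sketch of the client-side loop this configuration implies: stage Zebra events, convert them, and hand them to the persistent store. All of the types below (ZebraFileStager, PersistentWriter, ...) are hypothetical placeholders, not the actual Atlas software or the Objectivity C++ binding.

```cpp
// Illustrative sketch of the Zebra -> Objy loading pipeline (placeholder types).
#include <cstddef>
#include <iostream>
#include <vector>

// Stand-in for one staged Zebra event: just a blob of digit data.
struct ZebraEvent {
    std::vector<char> bank;                       // raw Zebra bank contents
};

// Stand-in for the Zebra file stager: yields events until the file is done.
class ZebraFileStager {
public:
    explicit ZebraFileStager(std::size_t nEvents) : remaining_(nEvents) {}
    bool next(ZebraEvent& ev) {
        if (remaining_ == 0) return false;
        ev.bank.assign(3 * 1024 * 1024, 0);       // ~3 MB/event, as in the test
        --remaining_;
        return true;
    }
private:
    std::size_t remaining_;
};

// Stand-in for the formatter/Objy client side: converts and "commits" events.
class PersistentWriter {
public:
    void write(const ZebraEvent& ev) { bytes_ += ev.bank.size(); }
    std::size_t bytesWritten() const { return bytes_; }
private:
    std::size_t bytes_ = 0;
};

int main() {
    ZebraFileStager stager(100);                  // pretend the staged file holds 100 events
    PersistentWriter writer;
    ZebraEvent ev;
    while (stager.next(ev)) writer.write(ev);     // the client's main loop
    std::cout << "wrote " << writer.bytesWritten() / (1024.0 * 1024.0) << " MB\n";
}
```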

The hardware constraints

The hardware constraints:
- Limited to clients running Objy V4 on AIX machines
  - Atlas software (Fortran + C++) releases were available only on hp/ibm/dec; hp needed Objy V5.1 and dec needed V5
- No tests could be done with a client on the Sun server to bypass AMS
- Forced dependence on the network connection between the Sun and AIX machines

The basic model

Recall the basic raw data transient model (class diagram on the original slide):
- DetectorElement: Identifier identify(), iterator digits_begin(), iterator digits_end()
- Digit: Identifier identify(), Point3D position(), float response()
- DetectorPosition: Point3D center(), Transform3D transform(), Point3D local_position(channel)

Notes:
- A Digit contains only channel numbers (+ drift for the TRT)
- Object granularity: e.g. SCT/Pixel wafer, TRT layer, MDT chamber, LAr region
- Only part of the model is saved in Objy
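As an illustration, the transient interfaces from the diagram can be written out roughly as below. Identifier, Point3D and Transform3D are treated as opaque placeholder types and the method bodies are dummies, so this is a sketch of the interface shape rather than the actual Atlas classes.

```cpp
// Transient raw-data model, as sketched on the slide (illustrative only).
#include <vector>

struct Identifier  { unsigned int value = 0; };          // placeholder
struct Point3D     { double x = 0, y = 0, z = 0; };      // placeholder
struct Transform3D { double m[3][4] = {}; };             // placeholder

// A single readout channel response; only channel numbers are stored
// (plus drift for the TRT), position/response come from elsewhere.
class Digit {
public:
    Identifier identify() const { return id_; }
    Point3D    position() const { return Point3D{}; }    // dummy; would use geometry
    float      response() const { return response_; }
private:
    Identifier id_;
    float      response_ = 0.f;
};

// Geometry information for one element (wafer, layer, chamber, region).
class DetectorPosition {
public:
    Point3D     center() const { return Point3D{}; }
    Transform3D transform() const { return Transform3D{}; }
    Point3D     local_position(int channel) const { (void)channel; return Point3D{}; }
};

// One granule of the detector (e.g. SCT/Pixel wafer, TRT layer, MDT chamber,
// LAr region) giving iterator access to its digits.
class DetectorElement {
public:
    using iterator = std::vector<Digit>::const_iterator;
    Identifier identify()     const { return id_; }
    iterator   digits_begin() const { return digits_.begin(); }
    iterator   digits_end()   const { return digits_.end(); }
private:
    Identifier         id_;
    std::vector<Digit> digits_;
};
```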

The basic model, cont.

The basic persistent model (class diagram on the original slide): PEvent, PEvtObjVector, PEvtObj, PDetectorElement, PDigit
- Separate containers for each detector/digit type
- Different classes for Si/TRT/Calo
- Persistent by containment (VArrays)

No attempt has yet been made to optimize the data model.
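A sketch of the containment structure this implies is given below. VArray is a stand-in for Objectivity's variable-size array mechanism (plain std::vector here), and the class bodies are illustrative rather than the actual Atlas persistent classes.

```cpp
// Persistent raw-data model sketch: persistence by containment via
// variable-size arrays (VArray is a placeholder for the Objy mechanism).
#include <vector>

template <typename T>
using VArray = std::vector<T>;          // stand-in for an Objy VArray

struct PDigit {                         // one persistent digit (channel data)
    unsigned int channel = 0;
    float        drift   = 0.f;         // used by the TRT only
};

struct PDetectorElement {               // one detector granule
    unsigned int   elementId = 0;
    VArray<PDigit> digits;              // contained digits
};

struct PEvtObj {                        // one container per detector/digit type
    VArray<PDetectorElement> elements;  // in practice, distinct classes for Si/TRT/Calo
};

struct PEvtObjVector {                  // all containers of an event
    VArray<PEvtObj> objects;
};

struct PEvent {                         // top-level persistent event
    unsigned long eventNumber = 0;
    PEvtObjVector objects;
};
```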

The basic model, cont.

In order to limit the amount of data and CPU required by the conversion application, digits were duplicated x10.

Typical event size: ~3 MB (jet production, with digit duplication)

Sizes of VArrays (fraction of total data):
- ~100 B: 6%
- ~1000 B: 66%
- ~ B: 24%

Space overhead: ~15% (1 - byte count/db size); no ootidy run, no data compression tried. The Objy page size was 8 kB.
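Spelled out, the overhead definition and a rough event count implied by these numbers (treating the ~3 MB figure as approximate and assuming 1 TB ≈ 10^12 bytes):

$$ \text{overhead} = 1 - \frac{\text{object byte count}}{\text{database file size}} \approx 0.15, \qquad N_{\text{events}} \approx \frac{10^{12}\ \mathrm{B}}{3\times10^{6}\ \mathrm{B/event}\times 1.15} \approx 3\times10^{5}. $$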

What have we understood so far

The different elements which can limit the I/O throughput:
- Objy server disk I/O
- Objy HPSS interface
- Objy client to Objy server communication: AMS, network
- Objy application (Zebra read, conversion, Objy write)

We have been investigating these different elements.

What have we understood, cont.

Objy server disk I/O and HPSS interface:
- The 1 TB test was done with a single SCSI disk per server, with a typical I/O speed of 10 MB/sec
- We have seen that concurrent read/write reduces throughput by a factor of 2, i.e. HPSS staging causes a factor-of-2 reduction for SCSI disks

Since the 1 TB test, the main Objy server (atlobj02) has been equipped with a RAID disk array:
- Aggregate I/O rates for multiple-stream read/write are ~25 to 30 MB/sec (simple I/O, i.e. NOT Objy)
- I/O rates are roughly independent of the number of streams

What have we understood, cont.

Objy performance on the Objy server, i.e. local disk read/write. Marcin Nowak has made read/write measurements with a simple application (2 GB db, 2 kB objects):
- write speed: MB/sec (8 kB and 32 kB page sizes)
- read speed: ~25 MB/sec (8 kB and 32 kB page sizes)

This means:
- ~x2 loss for local write for this model
- it confirms other measurements where the Objy read speed is ~80% of a simple disk read
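For context, the kind of measurement loop involved, timing sequential writes and reads of ~2 kB objects against a ~2 GB file, can be sketched as below. This is plain file I/O, not the Objy benchmark itself, so it measures only the underlying disk path.

```cpp
// Minimal disk read/write timing sketch (plain C++ stdio, not Objectivity):
// writes then reads ~1e6 objects of 2 kB and reports MB/s for each phase.
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <iostream>
#include <vector>

int main() {
    const std::size_t objSize = 2 * 1024;                 // 2 kB objects
    const std::size_t nObjs   = 1'000'000;                // ~2 GB total
    std::vector<char> buf(objSize, 0x5a);

    auto t0 = std::chrono::steady_clock::now();
    if (std::FILE* f = std::fopen("bench.dat", "wb")) {   // write phase
        for (std::size_t i = 0; i < nObjs; ++i)
            std::fwrite(buf.data(), 1, buf.size(), f);
        std::fclose(f);
    }
    auto t1 = std::chrono::steady_clock::now();
    if (std::FILE* f = std::fopen("bench.dat", "rb")) {   // read phase
        while (std::fread(buf.data(), 1, buf.size(), f) == buf.size()) {}
        std::fclose(f);
    }
    auto t2 = std::chrono::steady_clock::now();

    auto mbps = [&](auto a, auto b) {                     // MB/s for one phase
        double s = std::chrono::duration<double>(b - a).count();
        return (objSize * nObjs) / (1024.0 * 1024.0) / s;
    };
    std::cout << "write: " << mbps(t0, t1) << " MB/s, "
              << "read: "  << mbps(t1, t2) << " MB/s\n";
}
```

Note that the naive write phase mostly exercises the filesystem cache unless the file is flushed; the point is only to show the shape of the measurement, not to reproduce the numbers above.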

What have we understood, cont.

Objy performance from a remote client (atlobj01/sun), i.e. adding AMS and the network (TCP speed ~11 MB/sec). Corresponding measurements by Marcin Nowak:
- write speed: MB/sec (8 kB and 32 kB page sizes)
- read speed: MB/sec (8 kB and 32 kB page sizes)

The interaction of the network + AMS clearly:
- reverses the relative I/O speeds of read and write, and
- introduces an additional throughput loss

The detailed reasons for this remain to be understood.

What have we understood, cont.

Network configurations: although the two Sun servers have a network connection of ~11 MB/sec, other computing platforms, e.g. rsplus/atlaswgs/atlaslinux, have worse (and unknown) network connectivity: measurements typically give MB/sec.

Over the next few weeks there is a general upgrade of the network connectivity to Gigabit Ethernet, which should bring connections between the atlaswgs and atlobj01/02 to ~10 MB/sec.

What have we understood, cont.

Performance of the Objy application: up to now this has not been thoroughly investigated:
- it has been clear that the bottlenecks have been elsewhere (at least for events with duplicated digits)
- it will help to have the full Atlas software built on Sun to understand the local read/write performance

However, there is an indication that work needs to be done: for example, I am now able to read through a database locally on atlobj02 and find <7 MB/sec, where Marcin Nowak's simple model gives 25 MB/sec read performance.

Where do we go from here

Clearly the performance issues must be fully understood. We would like to reach the point where we can repeat the 1 TB milestone with ~5 MB/sec average write speed, which should take ~3 days.

In parallel, we are creating an environment where Objy can be used in a more general way.
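As a rough check of the "~3 days" estimate (assuming 1 TB ≈ 10^12 bytes and a sustained ~5 MB/sec):

$$ \frac{10^{12}\ \mathrm{B}}{5\times10^{6}\ \mathrm{B/s}} = 2\times10^{5}\ \mathrm{s} \approx 2.3\ \text{days}, $$

so ~3 days leaves some margin for operational inefficiency.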

Where do we go from here, cont.

Creating a working environment to use Objy:
- We have been porting the Objy software to hp/dec/linux/sun using Objy v5.1
- We currently have three Objy machines:
  - atlobj01: a developers' server, for personal boot files and developer db's; has ~100 GB of disk space
  - atlobj02: production server with RAID disk backed up by HPSS (this will soon be replaced by another Sun, and atlobj02 will then become a WGS)
  - lockatl: production lock server for production boot/journal files

Where do we go from here, cont.

Creating a working environment to use Objy, cont. For people working on OO reconstruction developments, I would like to see the following scenario:
- stabilize the db schema (e.g. for the different production releases)
- move "standard subsets" of the Geant3 data to atlobj02
- reconstruction developers then work on the atlaswgs, accessing these subsets via atlobj02

As the db schema evolves, this cycle will have to be repeated, creating new federations for the different production releases.

Summary

As a first exercise, we have been able to write 1 TB of Atlas raw data into Objy db's and HPSS over the Christmas holidays. The average write performance was ~1.5 MB/sec, with a duty cycle of ~50%.

Although not fully understood, the elements limiting performance have been identified as: the network, disk read/write capabilities, the use of AMS, and (possibly) the event model.

We hope to repeat this test with ~5 MB/sec capability, and to set up a production environment for reconstruction developers to work in.