Mini-Workshop on multi-core joint project Peter van Gemmeren (ANL) I/O challenges for HEP applications on multi-core processors An ATLAS Perspective.

Slides:



Advertisements
Similar presentations
MEMORY popo.
Advertisements

Buffers & Spoolers J L Martin Think about it… All I/O is relatively slow. For most of us, input by typing is painfully slow. From the CPUs point.
Khaled A. Al-Utaibi  Computers are Every Where  What is Computer Engineering?  Design Levels  Computer Engineering Fields  What.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 External Sorting Chapter 11.
January 25 & 27, Csci 2111: Data and File Structures Week3, Lecture 1 & 2 Secondary Storage and System Software: CD-ROM & Issues in Data Management.
1 External Sorting Chapter Why Sort?  A classic problem in computer science!  Data requested in sorted order  e.g., find students in increasing.
External Sorting CS634 Lecture 10, Mar 5, 2014 Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Chapter 7 Interupts DMA Channels Context Switching.
1 External Sorting Chapter Why Sort?  A classic problem in computer science!  Data requested in sorted order  e.g., find students in increasing.
External Sorting 198:541. Why Sort?  A classic problem in computer science!  Data requested in sorted order e.g., find students in increasing gpa order.
1 External Sorting for Query Processing Yanlei Diao UMass Amherst Feb 27, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
Backup & Recovery 1.
Filesytems and file access Wahid Bhimji University of Edinburgh, Sam Skipsey, Chris Walker …. Apr-101Wahid Bhimji – Files access.
XAOD persistence considerations: size, versioning, schema evolution Joint US ATLAS Software and Computing / Physics Support Meeting Peter van Gemmeren.
Wahid Bhimji University of Edinburgh J. Cranshaw, P. van Gemmeren, D. Malon, R. D. Schaffer, and I. Vukotic On behalf of the ATLAS collaboration CHEP 2012.
Computing hardware CPU.
Introduction and Overview Questions answered in this lecture: What is an operating system? How have operating systems evolved? Why study operating systems?
Roger Jones, Lancaster University1 Experiment Requirements from Evolving Architectures RWL Jones, Lancaster University Ambleside 26 August 2010.
ATLAS Peter van Gemmeren (ANL) for many. xAOD  Replaces AOD and DPD –Single, EDM – no T/P conversion when writing, but versioning Transient EDM is typedef.
Requirements for a Next Generation Framework: ATLAS Experience S. Kama, J. Baines, T. Bold, P. Calafiura, W. Lampl, C. Leggett, D. Malon, G. Stewart, B.
2009 Sep 10SYSC Dept. Systems and Computer Engineering, Carleton University F09. SYSC2001-Ch7.ppt 1 Chapter 7 Input/Output 7.1 External Devices 7.2.
Frontiers in Massive Data Analysis Chapter 3.  Difficult to include data from multiple sources  Each organization develops a unique way of representing.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 External Sorting Chapter 13.
Analysis of the ROOT Persistence I/O Memory Footprint in LHCb Ivan Valenčík Supervisor Markus Frank 19 th September 2012.
Introduction Advantages/ disadvantages Code examples Speed Summary Running on the AOD Analysis Platforms 1/11/2007 Andrew Mehta.
Database Management Systems, R. Ramakrishnan and J. Gehrke 1 External Sorting Chapter 13.
Operating Systems David Goldschmidt, Ph.D. Computer Science The College of Saint Rose CIS 432.
ROOT and Federated Data Stores What Features We Would Like Fons Rademakers CERN CC-IN2P3, Nov, 2011, Lyon, France.
Toward a Next-generation I/O Framework David Malon, Jack Cranshaw, Peter van Gemmeren, Marcin Nowak, Alexandre Vaniachine US ATLAS Technical Planning Meeting.
1 External Sorting. 2 Why Sort?  A classic problem in computer science!  Data requested in sorted order  e.g., find students in increasing gpa order.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
I/O Infrastructure Support and Development David Malon ATLAS Software Technical Interchange Meeting 9 November 2015.
File Systems cs550 Operating Systems David Monismith.
I/O Strategies for Multicore Processing in ATLAS P van Gemmeren 1, S Binet 2, P Calafiura 3, W Lavrijsen 3, D Malon 1 and V Tsulaia 3 on behalf of the.
CPSC Why do we need Sorting? 2.Complexities of few sorting algorithms ? 3.2-Way Sort 1.2-way external merge sort 2.Cost associated with external.
ATLAS Distributed Computing perspectives for Run-2 Simone Campana CERN-IT/SDC on behalf of ADC.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 External Sorting Chapter 11.
Multimedia Retrieval Architecture Electrical Communication Engineering, Indian Institute of Science, Bangalore – , India Multimedia Retrieval Architecture.
External Sorting. Why Sort? A classic problem in computer science! Data requested in sorted order –e.g., find students in increasing gpa order Sorting.
Part IVI/O Systems Chapter 13: I/O Systems. I/O Hardware a typical PCI bus structure 2.
Next-Generation Navigational Infrastructure and the ATLAS Event Store Abstract: The ATLAS event store employs a persistence framework with extensive navigational.
I/O aspects for parallel event processing frameworks Workshop on Concurrency in the many-Cores Era Peter van Gemmeren (Argonne/ATLAS)
I/O and Metadata Jack Cranshaw Argonne National Laboratory November 9, ATLAS Core Software TIM.
Introduction to Performance Tuning Chia-heng Tu PAS Lab Summer Workshop 2009 June 30,
Multi Process I/O Peter Van Gemmeren (Argonne National Laboratory (US))
Peter van Gemmeren (ANL) Persistent Layout Studies Updates.
Athena I/O Component Refactorization - Overview Random notes about todays agenda. Peter Van Gemmeren (Argonne National Laboratory (US))
Root I/O and the Gaudi Framework
Atlas IO improvements and Future prospects
Multicore Computing in ATLAS
External Sorting Chapter 13
Chapter 9 – Real Memory Organization and Management
Assembly Language for Intel-Based Computers, 5th Edition
Computer Architecture
Grid Canada Testbed using HEP applications
Device Management Damian Gordon.
External Sorting Chapter 13
Types of Computers Mainframe/Server
Outline Module 1 and 2 dealt with processes, scheduling and synchronization Next two modules will deal with memory and storage Processes require data to.
CPU Key Revision Points.
CS222: Principles of Data Management Lecture #10 External Sorting
External Sorting.
CS222P: Principles of Data Management Lecture #10 External Sorting
External Sorting Chapter 13
New I/O navigation scheme
Presentation transcript:

Mini-Workshop on multi-core joint project Peter van Gemmeren (ANL) I/O challenges for HEP applications on multi-core processors An ATLAS Perspective

AthenaMP: The Process point of view AthenaMP is the ALTAS multi-core framework Diagram taken from Sebastien Binet (thanks). 2 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors initfork Collect & merge bootstrap parallel event processing

Merge In AthenaMP each worker node produces it own output file, which need to be merged after all worker are done. – Done in serial, can take significant amount of wall clock time. For ATLAS, which uses transient / persistent separation and data stored in POOL / ROOT there are 3 different merge level: – Athena merge (b -> B -> P -> T -> T -> P -> B -> b) Currently used, slow – Direct persistent to persistent copy in Athena (b -> B -> P -> P -> B -> b) Waiting for adoption, 20 – 30 % less slow – Fast POOL / ROOT merge (b -> b) Outside Athena, does not yet work, 10 – 20 times faster 3 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors decompressPers. to Trans Compressed baskets (b) Persistent State (P) Transient State (T) Baskets (B) stream

Full Athena merge (b -> B -> P -> T -> T -> P -> B -> b) Runs inside Athena Event Loop: – Uncompress ROOT baskets, rebuild persistent objects. – Convert persistent to transient objects – Do nothing with transient objects – Convert transient to persistent objects – Stream persistent objects into ROOT baskets and compress Produces Output File which is very similar to one produced by a single core processing all input events. – Handles Metadata propagation using full Athena framework. However, there may be metadata which is irreproducible from many incomplete jobs (e.g., lumiblock information) – Re-optimizes persistent Data Objects E.g.: The file has one GUID which is used in all the DataHeader (and read only once for all events) – Re-optimizes ROOT baskets Small baskets are combined and re-compressed. 4 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors

Direct persistent to persistent copy in Athena (b -> B -> P -> P -> B -> b) Same as full Athena, but without P -> T and T -> P – Uncompress ROOT baskets, rebuild persistent objects. – Do nothing with persistent objects – Stream persistent objects into ROOT baskets and compress Can do P -> T and T -> P for selected objects Creates new DataHeader and re-optimizes its persistent representation (as in full athena) Produces Output File which is very similar to one produced by a single core processing all input events. – Handles Metadata propagation using full Athena framework. – Re-optimizes selected persistent Data Objects – Re-optimizes ROOT baskets 5 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors

Fast POOL / ROOT merge (b -> b) Run as separate POOL application, not using the Athena framework Does not currently work for ATLAS events, using external tokens – Recent (CVS head -> POOL nightly) modifications by Markus Frank and myself promise to enable POOL fast merge for ATLAS events Even when POOL fast merge works, there are inherent limitations: – Can’t Summarize metadata Metadata records have to be summarized on every read. – Navigational references are not updated Utilize POOL redirection, which may cost time. – Re-optimize storage layout (-> non-optimal compression, slower read speed for some objects) In principle this could be overcome, by having ROOT uncompress, combine and recompress the buffer (b -> B -> B -> b). However this will slow down the merge significantly. 6 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors

AthenaMP: The I/O point of view 7 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors initfork Collect & merge bootstrap parallel event processing Input File Output File temporary output files

AthenaMP: The I/O point of view, to be fair 8 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors Input File Output File temporary output files

So what’s wrong with that? Read data:A process (initialization, event execute,…) reads part of the input file (e.g., to retrieve one event). All worker use the same input file, which in general is to large to be cached in memory. Multiple access may mean poor read performance, especially if events are not consecutive. Uncompress / Stream ROOT baskets:Each reader (i.e. worker) will retrieve its event data, which means reading multiple ROOT baskets, uncompressing them and streaming them into persistent objects. – ROOT baskets contain object member of several events, so multiple worker may use the same baskets and each of them will uncompress them independently: Wastes CPU time (multiple uncompress of the same data) Wastes memory (multiple copies of the same Basket, probably not shared) Write data:Each process writes its own output file. Compress / Stream to ROOT baskets:Writer compress data separately. Suboptimal compression factor (which will cost storage and CPU time at subsequent reads) – This can be ‘healed’ later by using Athena merge. Wastes memory (each worker needs its own set of output buffer) 9 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors

AthenaMP -> ~GaudiParallel: The Scatter/Gather point of view 10 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors Input File Output File Output Daemon (Writer) Receives pers. objects, streams, compresses Input Daemon (Reader) Uncompress, stream, provide pers. objects MetaData processor

AthenaMP -> ~GaudiParallel: GaudiParallel follows MPI Scatter / Gather scheme – Reduces I/O, total memory, CPU & Wall Clock time and even disk space – I/O uses new Reader / Writer classes Serialize whole event from Transient Data Store and send it to worker process Athena – Uses StoreGate (flat data store) instead of Gaudi Transient Data Store (hierarchical). StoreGate (and ATLAS persistency) support C++ classes with all their capabilities / complexity – Implements Transient / Persistent separation in persistency framework. Could use t-p conversion as locus for scatter / gather to allow on demand retrieval of objects – Rather than sending complete events, the persistent state of objects can be send. – Persistent state representations are much simpler than their transient counterparts. – Have dedicated processing of in-file metadata attached to Reader and/or Writer To Be Addressed: – Event ordering Multi-core processed file may not preserve the event ordering of input files. – Removing any duplicate events 11 05/13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors

Summary ATLAS uses AthenaMP to utilize multi-core processors – Works today – Very process focused I/O more of an afterthought – Currently limited to merging output files I/O becoming bottleneck: – Even for reconstruction up to 20 – 30 % wall clock time in merge – Increases memory foot print by several 100 MB per core Program of work needed to parallelize I/O for multi-core processor – Several design alternatives should be investigated – Emerging I/O capabilities in GaudiParallel should be used for Athena (if it makes sense). Of course differences between Gaudi and Athena mean there is work to do on our end Design of multi-core I/O needs to be highly configurable to allow performance tuning. – Expect multi-core I/O performance tuning to be a significant effort Similar to single core I/O tuning or (multi-core) memory/CPU optimization /13/2010Peter van Gemmeren (ANL): I/O challenges for HEP applications on multi- core processors