Toward a Next-generation I/O Framework David Malon, Jack Cranshaw, Peter van Gemmeren, Marcin Nowak, Alexandre Vaniachine US ATLAS Technical Planning Meeting.


1 Toward a Next-generation I/O Framework David Malon, Jack Cranshaw, Peter van Gemmeren, Marcin Nowak, Alexandre Vaniachine US ATLAS Technical Planning Meeting 28 June 2015

2 Background
• Input and output pose fundamental challenges to the efficient exploitation of emerging computing platforms and architectures.
• Input and output are points of serialization, and I/O bandwidth has not scaled with processing power or core count.
• Applications whose computational kernels may be scalable to very large numbers of CPUs will lose their scalability if they are or become I/O-bound.
  – Even when this does not happen, a framework that achieves high throughput by creating a substantial post-processing burden (e.g., in a later, possibly complicated merge step) has achieved that throughput only nominally.
• ATLAS is encountering issues that require I/O framework development on every emerging front
  – In AthenaMP
  – In Event Server work
  – On HPCs (feeding zillions of processors, minimizing their I/O during execution, handling their simultaneous output, …)
  – In production use of distributed I/O
  – Even in standard production

3 Some history
• Steps toward I/O framework evolution began some time ago, with a plan to
  – Better define and factorize an I/O architecture below the Gaudi conversion service
      • Among other things, separating state representation and streaming from storage control
  – Clean up and simplify APR (which is almost all of POOL merged into the ATLAS code base)
  – Refactor and improve separation of concerns in event selectors and related services
  – Improve consistency and reuse across input types (bytestream and ROOT/POOL in particular) and treat input and output a bit more symmetrically
      • Example: Treating input persistence technology more like output persistence technology (which is a property for an outstream whereas it requires an entirely different service for input)
  – Redesign to simplify implementation of shared readers and writers
  – Improve rapid in-file event selection and related navigation
  – … and more (will not repeat all that material here) …
• Some of this work has been done, but much was put on hold because of higher-priority demands
  – Analysis task forces, xAOD, derivation framework, and I/O infrastructure to support them
  – ROOT 6 migration
• The future frameworks effort is becoming the (welcome) context in which much of the I/O framework evolution effort resumes

4 Caveats
• That does not mean the higher-priority needs are somehow entirely behind us
• Ensuring the success of Run 2 data taking is our highest priority
  – The stream of blocker JIRA tickets for Tier 0, for fast reprocessing, for essential Monte Carlo, and for the derivation framework all trump future frameworks
• … and with ROOT 6, for example, work remains to
  – address dictionary duplication (not entirely core)
  – improve dictionary organization/factorization (mixing transient and persistent) and issues arising from needless includes (also not entirely core)
  – reduce memory footprint and improve robustness more generally
  – and other items will no doubt arise when a ROOT 6 release is used in production
• With xAOD, there is even more work to do
  – Cleaning up xAOD hacks and inappropriate data model dependencies in outstream infrastructure
  – Coping with blowups in data size or VMEM or other constrained resources due to the xAOD data model (cf. ESD production size explosion and hitting hard, fatal ROOT buffer limits as one example)
  – Tuning the branching, buffering, flush settings, … for the new data model will take time and study and effort

5 I/O requirements for a future framework
• The I/O layer should support variable numbers of readers and writers, both to provide a means to match I/O to processing capacity and to allow adaptation to a range of deployment environments.
• The framework should be configurable to support I/O-intensive as well as CPU-intensive processing, without the I/O layer itself being the bottleneck.
• While the architecture of I/O components may be complex, it should be factorized from the architecture of the scheduler:
  – for example, readers and writers (event selectors and output streams) should be schedulable in essentially the same way as any algorithm. (With caveats, of course)
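
The idea that readers and writers are scheduled like any other algorithm can be illustrated with a minimal Python sketch. It is not Athena or Gaudi code: the `Schedulable` interface, the `Reader`, `Calibrate` and `Writer` classes, and the toy `run` loop are all hypothetical names invented for this illustration. The only point is that the scheduler sees one uniform interface, so the number of readers and writers in a sequence can vary without the scheduler changing.

```python
from typing import Dict, Iterator, List, Protocol


class Schedulable(Protocol):
    """Hypothetical interface: anything the scheduler can run on one event."""

    def execute(self, store: Dict[str, object]) -> bool: ...


class Reader:
    """Plays the role of an event selector: fills the store from an input."""

    def __init__(self, source: Iterator[dict]):
        self._source = source

    def execute(self, store: Dict[str, object]) -> bool:
        event = next(self._source, None)
        if event is None:
            return False  # input exhausted: ask the loop to stop
        store.update(event)
        return True


class Calibrate:
    """An ordinary algorithm: reads one key and records a derived one."""

    def __init__(self, scale: float):
        self._scale = scale

    def execute(self, store: Dict[str, object]) -> bool:
        store["energy_cal"] = store["energy"] * self._scale
        return True


class Writer:
    """Plays the role of an output stream: keeps only the configured keys."""

    def __init__(self, keys: List[str]):
        self._keys = keys
        self.written: List[dict] = []

    def execute(self, store: Dict[str, object]) -> bool:
        self.written.append({key: store[key] for key in self._keys})
        return True


def run(sequence: List[Schedulable]) -> int:
    """Toy scheduler: runs every step per event, with no I/O special cases."""
    n_events = 0
    while True:
        store: Dict[str, object] = {}
        # all() short-circuits, so later steps are skipped once input ends.
        if not all(step.execute(store) for step in sequence):
            return n_events
        n_events += 1


events = iter([{"energy": 10.0}, {"energy": 20.0}])
full_writer = Writer(["energy", "energy_cal"])
slim_writer = Writer(["energy_cal"])
n_processed = run([Reader(events), Calibrate(1.5), full_writer, slim_writer])
```

Here two writers are attached to the same sequence simply by adding a second entry to the list; a real framework would also have to deal with the caveats the slide alludes to (ordering, concurrency, end-of-input handling), which this sketch ignores.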

6 I/O requirements for a future framework (2)
• The framework should be agnostic to the nature of its data sources and sinks, i.e., to whether data come from local or remote storage media or over the wire or from specialized readout devices, and to whether they are written to storage or to local or remote processes. I/O layer components will deal with the necessary specializations.
• The I/O layer (together with the whiteboard) should isolate the algorithms, tools, services accessing data from the persistence technology used behind the scenes. Current data formats, including bytestream, xAOD, and the POOL formats, need to be supported. Explicit dependencies on a particular persistence technology should be avoided as much as reasonably possible.
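
A small Python sketch of source and sink agnosticism, under stated assumptions: `DataSource`, `DataSink`, `JsonLinesSource`, `QueueSource` and `ListSink` are hypothetical classes made up for this illustration, JSON lines stand in for a real persistence format, and a `deque` stands in for data arriving over the wire. The selection code only ever sees the abstract interfaces.

```python
import io
import json
from abc import ABC, abstractmethod
from collections import deque
from typing import Iterator, List


class DataSource(ABC):
    """Hypothetical abstract source: where events come from is hidden."""

    @abstractmethod
    def events(self) -> Iterator[dict]: ...


class DataSink(ABC):
    """Hypothetical abstract sink: where events go is hidden."""

    @abstractmethod
    def put(self, event: dict) -> None: ...


class JsonLinesSource(DataSource):
    """Stands in for file-based input; reads one JSON object per line."""

    def __init__(self, stream: io.TextIOBase):
        self._stream = stream

    def events(self) -> Iterator[dict]:
        for line in self._stream:
            if line.strip():
                yield json.loads(line)


class QueueSource(DataSource):
    """Stands in for events handed over by another process or a network."""

    def __init__(self, queue: deque):
        self._queue = queue

    def events(self) -> Iterator[dict]:
        while self._queue:
            yield self._queue.popleft()


class ListSink(DataSink):
    """Stands in for any output destination; collects events in memory."""

    def __init__(self) -> None:
        self.items: List[dict] = []

    def put(self, event: dict) -> None:
        self.items.append(event)


def select_high_energy(source: DataSource, sink: DataSink, threshold: float) -> None:
    """Algorithm code: depends only on the abstract source and sink."""
    for event in source.events():
        if event["energy"] > threshold:
            sink.put(event)


file_like = io.StringIO('{"id": 1, "energy": 5.0}\n{"id": 2, "energy": 50.0}\n')
wire = deque([{"id": 3, "energy": 70.0}, {"id": 4, "energy": 1.0}])

file_sink, wire_sink = ListSink(), ListSink()
select_high_energy(JsonLinesSource(file_like), file_sink, threshold=10.0)
select_high_energy(QueueSource(wire), wire_sink, threshold=10.0)
```

The same `select_high_energy` function runs unchanged against both inputs; the specializations live entirely in the source and sink classes, which is the division of labor the slide assigns to I/O layer components.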

7 I/O requirements for a future framework (3)
• Input and output infrastructure must be capable of respecting semantic constraints on data organization, such as not interleaving events from different runs or run segments (luminosity blocks).
  – Related to metadata plans (later session)
• The framework needs to provide sufficient bookkeeping to ensure that all events in semantically meaningful units have been processed, and may be required to provide more detailed bookkeeping in jobs that filter events. The I/O layer should facilitate such accounting, and should provide a means to associate metadata with event samples.
  – Also related to metadata plans (later session)
• The I/O layer should exploit similarities in HLT and offline data access where possible. There are parallels between data requests to readout systems and I/O requests to storage, as well as parallels between selective region of interest (RoI) retrieval and selective (partial event) retrieval from disk.
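
One way to picture the first two requirements together is the following Python sketch. It is only an illustration under assumptions: `BlockOrderedWriter` is a hypothetical class, events are plain dictionaries with assumed `run`, `lumi` and `id` keys, and the expected event count per block is assumed to come from some external metadata source. The writer buffers accepted events per (run, luminosity block) so that output is never interleaved, and it counts events seen and accepted per block for bookkeeping in a filtering job.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

BlockId = Tuple[int, int]  # (run number, luminosity block)


class BlockOrderedWriter:
    """Hypothetical writer: keeps each block contiguous and counts events."""

    def __init__(self) -> None:
        self._pending: DefaultDict[BlockId, List[dict]] = defaultdict(list)
        self.output: List[dict] = []
        self.seen: DefaultDict[BlockId, int] = defaultdict(int)
        self.accepted: DefaultDict[BlockId, int] = defaultdict(int)

    def process(self, event: Dict[str, int], accept: bool) -> None:
        """Record one event; buffer it for output only if the filter accepts."""
        block = (event["run"], event["lumi"])
        self.seen[block] += 1
        if accept:
            self.accepted[block] += 1
            self._pending[block].append(event)

    def close_block(self, block: BlockId) -> None:
        """Flush a finished block to the output as one contiguous group."""
        self.output.extend(self._pending.pop(block, []))

    def is_complete(self, block: BlockId, expected: int) -> bool:
        """Compare events seen with an externally supplied expected count."""
        return self.seen[block] == expected


# Events arrive interleaved across two luminosity blocks of run 1.
arrivals = [
    {"run": 1, "lumi": 1, "id": 1},
    {"run": 1, "lumi": 2, "id": 2},
    {"run": 1, "lumi": 1, "id": 3},
    {"run": 1, "lumi": 2, "id": 4},
]

writer = BlockOrderedWriter()
for event in arrivals:
    writer.process(event, accept=(event["id"] != 4))  # toy filter drops id 4

writer.close_block((1, 1))
writer.close_block((1, 2))
output_ids = [event["id"] for event in writer.output]
```

After both blocks are closed, the output holds all of block (1, 1) followed by all of block (1, 2), and the `seen` and `accepted` counters record that one event in block (1, 2) was filtered out rather than lost.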

8 Some followups this week
• Discussion and brainstorming about Athena I/O components and thread safety later this week
  – And next week, the future frameworks workshop
• Short-term workarounds for APR may be possible, but even in early prototyping we probably want to do something better
  – and give APR cleanup higher priority
  – We also have services simpler than APR (not direct to ROOT, but a bit closer to that model), which are other possible starting points
• Also working to ensure that we are cognizant of what our sister experiments have done
  – Chris Jones' work for CMS on ROOT thread safety; Markus Frank's POOL replacement work for LHCb, …
• Getting I/O to work in a multithreaded framework is essential, but this does not represent the full extent of I/O framework evolution

9 Not just threading
• Expect to be able to evolve the I/O framework in support of both the future (control) framework effort and distributed processing (e.g., event service)
  – Complementary rather than competing, architecturally
  – Have written the future framework I/O requirements to reflect this
• Would like to take the opportunity to make the I/O framework a bit more general
  – Athena and its future evolution is definitely the primary target and the one that must be unconditionally and robustly supported, but the architectural notions are certainly more general
      • And indeed more general than event processing
      • Will not build in needless Athena dependencies
• Will contribute developments where they make the most sense
  – Should not deliver an Athena workaround to a ROOT obstacle to thread safety, for example, when an improvement to ROOT instead would help the broader community
  – Both ATLAS and CMS have already fed developments to the ROOT I/O code, and we expect this to continue

