New I/O navigation scheme

New I/O navigation scheme Peter van Gemmeren (ANL)

Outline
- Current/old: RootTree
  - I/O navigation infrastructure
  - Recap
  - Disadvantage
- New: RootTreeIndex
  - Status, regression
  - Advanced features
6/27/2019 Peter van Gemmeren (ANL): New I/O navigation scheme

Current/old: I/O navigation infrastructure, diagram
[Diagram. Labels: Event Loop (enter and iterate over); TAG; Conditions; DataHeader; Token (Oid1, Oid2); ##Links (Link 1, Link 2, ...), selected by Oid1; Collection 1 ... Collection N, Collection M with entries 1, 2, 3, ..., selected by Oid2; Provide key, type, …, persistent address; StoreGate.]

Current/old: I/O navigation infrastructure, RootTreeContainer
Object location uses different elements:
- File ID (in ROOT: TFile): a GUID, together with the file catalogs, is used to identify the file.
- Container (in ROOT: TTree, TBranch): OID1, a long int, gives the entry number into the (POOL) ##Links container, which stores the container name, type id, …
- Object (in ROOT: entry number): OID2, a long int, gives the entry number of the object.
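The three-part lookup described on this slide can be sketched roughly as follows. This is a minimal illustration in plain Python, not the POOL or ROOT implementation: the names `Token`, `file_catalog`, `files`, and `resolve`, as well as the sample GUID and container contents, are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a persistent reference."""
    file_guid: str  # identifies the file through the file catalog
    oid1: int       # entry number into the ##Links container
    oid2: int       # entry number of the object in its container


# Assumed toy data: a catalog mapping GUIDs to files, and one file's content.
file_catalog = {"GUID-A": "fileA.root"}
files = {
    "fileA.root": {
        "##Links": [("Collection1", "TypeX"), ("Collection2", "TypeY")],
        "Collection1": ["x0", "x1", "x2"],
        "Collection2": ["y0", "y1", "y2"],
    }
}


def resolve(token):
    """Follow GUID -> file, OID1 -> ##Links row, OID2 -> object entry."""
    content = files[file_catalog[token.file_guid]]
    container_name, type_id = content["##Links"][token.oid1]
    return type_id, content[container_name][token.oid2]


located = resolve(Token("GUID-A", oid1=1, oid2=2))
```

Both OIDs are plain positions here, which is the property the next slide identifies as the weakness of this scheme.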

Current/old: I/O navigation infrastructure
- Has worked solidly for all use cases in Run 1 and Run 2.
- But it does not support object relocation, e.g. ROOT fast or in-memory merging, because ROOT TTree entry numbers change.
  - POOL did provide a somewhat cumbersome extension in the form of a ##Sections table.
- As ATLAS' workflows are extended to be more distributed, this has become a bottleneck.
  - Identified for improvement as part of the I/O review.
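Why entry-number references break under merging can be seen in a toy model. The sketch below assumes, for illustration only, that merging simply concatenates the entries of two containers; the helper `merge_entries` and the sample data are hypothetical and do not represent ROOT's merging code.

```python
def merge_entries(*containers):
    """Toy merge: append the entries of each container in order."""
    merged = []
    for container in containers:
        merged.extend(container)
    return merged


file_a = ["a0", "a1", "a2"]
file_b = ["b0", "b1"]

# A reference written while producing file_b: entry number 1 means "b1".
stored_oid2 = 1
before_merge = file_b[stored_oid2]

merged = merge_entries(file_a, file_b)

# The same stored entry number now points at a different object.
after_merge = merged[stored_oid2]

# The object is still present, but at a shifted position.
shifted = merged[len(file_a) + stored_oid2]
```

In this model every reference into `file_b` would need an offset correction after the merge, which is roughly the bookkeeping a sections table has to supply.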

New: I/O navigation infrastructure, diagram
[Diagram. Same elements as the current/old diagram (Event Loop, TAG, Conditions, DataHeader, Token (Oid1, Oid2), ##Links with Link 1, Link 2, ..., Collection 1 ... Collection N, Collection M, StoreGate), with an added Index column (Y 1, Y 2, Y 3, …).]

New: I/O navigation infrastructure, RootTreeIndexContainer (203)
Object location uses extended elements for container and object:
- Container: OID1, a long int, holds a 32-bit uid and a 32-bit entry number into the (POOL) ##Links container, which stores the container name, type id, …
  - This value is stored in the ##Links TTree as an additional branch and indexed by ROOT.
- Object: OID2, a long int, holds a 32-bit uid and a 32-bit entry number for the object.
  - This value is stored in the TTree as an additional branch and indexed by ROOT.
- When reading a RootTreeIndexContainer, the OID is looked up in the TIndex column.
- Enabled in master, without apparent problems in common use cases.
  - Does not quite work yet for merged ##Links; work in progress.
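The idea of storing an immutable 64-bit value and looking it up through an index can be sketched as below. This is a plain-Python illustration under stated assumptions: the slide does not say which half of the long int holds the uid, so placing it in the upper 32 bits is an assumption, and `pack_oid`, `unpack_oid`, `IndexedContainer`, and `merge_indexed` are hypothetical names. A dictionary stands in for the ROOT index; no ROOT API is used.

```python
MASK32 = 0xFFFFFFFF


def pack_oid(uid, entry):
    """Combine a 32-bit uid and a 32-bit entry number (assumed layout: uid high)."""
    if not (0 <= uid <= MASK32 and 0 <= entry <= MASK32):
        raise ValueError("uid and entry must each fit in 32 bits")
    return (uid << 32) | entry


def unpack_oid(oid):
    """Split a packed OID back into (uid, entry)."""
    return oid >> 32, oid & MASK32


class IndexedContainer:
    """Toy container with an extra 'index branch' holding the packed OID per row."""

    def __init__(self):
        self.objects = []
        self.index_branch = []  # packed OID stored alongside each object
        self._index = {}        # packed OID -> current row (stand-in for the index)

    def append(self, oid, obj):
        self._index[oid] = len(self.objects)
        self.objects.append(obj)
        self.index_branch.append(oid)

    def read(self, oid):
        # Look up by stored value instead of using the entry number directly.
        return self.objects[self._index[oid]]


def merge_indexed(*containers):
    """Toy merge: rows move, but each keeps its stored OID."""
    merged = IndexedContainer()
    for container in containers:
        for oid, obj in zip(container.index_branch, container.objects):
            merged.append(oid, obj)
    return merged


a, b = IndexedContainer(), IndexedContainer()
for i, obj in enumerate(["a0", "a1", "a2"]):
    a.append(pack_oid(1, i), obj)  # uid 1 assumed for the first writer
for i, obj in enumerate(["b0", "b1"]):
    b.append(pack_oid(2, i), obj)  # uid 2 assumed for the second writer

merged = merge_indexed(a, b)
found = merged.read(pack_oid(2, 1))
```

Because the reference is the stored value rather than the row position, the lookup still succeeds after rows move during the merge, provided the uids of the merged inputs differ.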

Outlook
- Implemented a new persistent POOL (minor) technology, RootTreeIndexContainer, which adds an index branch and allows navigation references to use immutable values rather than entry numbers.
  - In master; appears to work transparently alongside the existing/old RootTreeContainer.
  - Can be read by release 21.
  - Should (when finished) allow fast and in-memory merging features.
- Some related work on minimizing the content and complexity (possibly the generality) of the current DataHeader in a new persistent version.
- These changes alter the way our navigational data is written, so we try to get them in early to spot potential problems before real data is written.