STAR Persistent Pointers in the STAR Micro-DST V. Perevoztchikov Brookhaven National Laboratory,USA.

Slides:



Advertisements
Similar presentations
More on File Management
Advertisements

1 File Management (a). 2 File-System Interface  File Concept  Access Methods  Directory Structure  File System Mounting  File Sharing  Protection.
Access Trisha Cummings. Access 1.Microsoft Access is a relational database management system from Microsoft, 2.Skilled software developers and data architects.
Jonathan Walpole Computer Science Portland State University
LEARNING OBJECTIVES Index files.
Chapter 12 File Management
Chapter 12 File Management
Database Implementation Issues CPSC 315 – Programming Studio Spring 2008 Project 1, Lecture 5 Slides adapted from those used by Jennifer Welch.
CS 333 Introduction to Operating Systems Class 18 - File System Performance Jonathan Walpole Computer Science Portland State University.
1 Operating Systems Chapter 7-File-System File Concept Access Methods Directory Structure Protection File-System Structure Allocation Methods Free-Space.
1 Lecture 20: Indexes Friday, February 25, Outline Representing data elements (12) Index structures (13.1, 13.2) B-trees (13.3)
B + -Trees (Part 1) COMP171. Slide 2 Main and secondary memories  Secondary storage device is much, much slower than the main RAM  Pages and blocks.
Chapter 1 An Overview of Database Management. 1-2 Topics in this Chapter What is a Database System? What is a Database? Why Database? Data Independence.
CS 333 Introduction to Operating Systems Class 19 - File System Performance Jonathan Walpole Computer Science Portland State University.
Chapter 7 Indexing Objectives: To get familiar with: Indexing
CS4432: Database Systems II
File Organization Techniques
Lesson 7-Creating and Changing Directories. Overview Using directories to create order. Managing files in directories. Using pathnames to manage files.
STAR C OMPUTING Maker and I/O Model in STAR Victor Perevoztchikov.
1 Physical Data Organization and Indexing Lecture 14.
File Systems Long-term Information Storage Store large amounts of information Information must survive the termination of the process using it Multiple.
File Processing - Indexing MVNC1 Indexing Jim Skon.
Chapter 10: File-System Interface Silberschatz, Galvin and Gagne ©2005 Operating System Concepts – 7 th Edition, Jan 1, 2005 Chapter 10: File-System.
File Systems CSCI What is a file? A file is information that is stored on disks or other external media.
Chapter 11 File Systems and Directories. 2 File Systems File: A named collection of related data. File system: The logical view that an operating system.
DBMS Implementation Chapter 6.4 V3.0 Napier University Dr Gordon Russell.
CS4432: Database Systems II Record Representation 1.
File Management Chapter 12. File Management File management system is considered part of the operating system Input to applications is by means of a file.
STAR Sti, main features V. Perevoztchikov Brookhaven National Laboratory,USA.
File Organization Lecture 1
STAR Kalman Track Fit V. Perevoztchikov Brookhaven National Laboratory,USA.
STAR Event data storage and management in STAR V. Perevoztchikov Brookhaven National Laboratory,USA.
CE Operating Systems Lecture 17 File systems – interface and implementation.
David Adams ATLAS DIAL: Distributed Interactive Analysis of Large datasets David Adams BNL August 5, 2002 BNL OMEGA talk.
CS333 Intro to Operating Systems Jonathan Walpole.
STAR Schema Evolution Implementation in ROOT I/O V. Perevoztchikov Brookhaven National Laboratory,USA.
Lecture 10 Page 1 CS 111 Summer 2013 File Systems Control Structures A file is a named collection of information Primary roles of file system: – To store.
Slide: 1 UNIX FILE SYSTEM By:Qing Yang ID: Operating System Research Topic December, 2000.
File Systems cs550 Operating Systems David Monismith.
General Purpose ROOT Utilities Victor Perevoztchikov, BNL.
Visual Basic for Application - Microsoft Access 2003 Finishing the application.
THE FILE SYSTEM Files long-term storage RAM short-term storage Programs, data, and text are all stored in files, which is stored on.
CIS 250 Advanced Computer Applications Database Management Systems.
Introduction to Databases Angela Clark University of South Alabama.
CS399 New Beginnings Jonathan Walpole. Disk Technology & Secondary Storage Management.
ECE 456 Computer Architecture Lecture #9 – Input/Output Instructor: Dr. Honggang Wang Fall 2013.
STAR SVT Self Alignment V. Perevoztchikov Brookhaven National Laboratory,USA.
STAR Simulation. Status and plans V. Perevoztchikov Brookhaven National Laboratory,USA.
Lists List Implementations. 2 Linked List Review Recall from CMSC 201 –“A linked list is a linear collection of self- referential structures, called nodes,
W4118 Operating Systems Instructor: Junfeng Yang.
File-System Management
Jonathan Walpole Computer Science Portland State University
INLS 623– Database Systems II– File Structures, Indexing, and Hashing
CS522 Advanced database Systems
CS 540 Database Management Systems
File System Structure How do I organize a disk into a file system?
COMP 430 Intro. to Database Systems
Lecture 10: Buffer Manager and File Organization
Fast File Clone in ZFS Design Proposal Pavel Zakharov 11/14/2018.
Chapter 11: File System Implementation
Optimizing Malloc and Free
Database Implementation Issues
Automated support of STL containers
Jonathan Walpole Computer Science Portland State University
Files Management – The interfacing
DATABASE IMPLEMENTATION ISSUES
CS333 Intro to Operating Systems
Database Implementation Issues
Database Implementation Issues
Presentation transcript:

STAR Persistent Pointers in the STAR Micro-DST V. Perevoztchikov Brookhaven National Laboratory,USA

STAR Victor Perevoztchikov, BNL ROOT2001 What is the PROBLEM? The problem is in supporting: creating,updating of complex DST, Mini-DST,Micro-DSTs.  Full experimental DST contains a lot of non physical information. It is not good for analysis. There is no disk space to keep it;  Mini-DST contains only physical information. But it is still too much for particular physics team. Disk space is problem too;  Set of micro-DSTs. Each one is good for one team. But a lot of duplicated information. Hard to maintain;  The evident solution is to read simultaneously a subset of needed micro-DSTs. In this case we can avoid duplication of information. The last solution is good, but there is a difficulty: - relationship between objects from the different Micro-DST (files).

STAR Victor Perevoztchikov, BNL ROOT2001 Relationship. Typical Example Track has relationship with Hits, from which it was. made. In C++ it is natural to use pointers for it. When Hits are in one file and tracks in another, it is not too trivial. Indices as a solution?  We choose C++ as OO tool. Why we should refuse of it’s power and return to Fortran era approach? Then it is more logical to return to Fortran. It was not bad language.  If our Hits are from different collections, one index is not enough. We need second index for collection and must create a collection of collections for indexing them. But these hits collections could be also in different files.  If we use indexing, we forbid rearranging of collections, shrinking them, etc… So, we decided to implement persistent pointers.

STAR Victor Perevoztchikov, BNL ROOT2001 Object Identifier. Object Identifier is the possible solution. The object table[id] == pointer, allows conversion from id to pointer. What kind of identifier? Well known Universally Unique Identifier (UUIDGEN utility) is good for it, but too big (32 bytes). It is easy to avoid by using UUID only for the top level objects of the record. All subordinated objects could have 4 bytes EUID (Event Unique Identifier). So each object is world wide uniquely labeled by UUID.EUID. In ROOT TObject has fUniqueID member, which could be used for that, without additional penalty.

STAR Victor Perevoztchikov, BNL ROOT2001 Little bit of History. Relationship of objects from different files is not a new problem:  Professional Data Bases, like Oracle, usually support it;  Objectivity, as far as I know, not;  ZEBRA, where links functionally are very close to pointers, no support also;  ADAMO, probably(?) supports it.  DSPACK (NA49) supports it, indexing pointers during I/O;  FARFALLA, old C++ I/O system, no support.

STAR Victor Perevoztchikov, BNL ROOT2001 Main Features of STAR Persistent Pointer Implementation  Top object of structure inherited from special class (StXRefMain).It has UUID and contains list sub-structures (Branches).  Top object of sub-structure inherited from special class (StXRef). It has the same UUID as the main object. Sub-structures could be written in different files or different records of one file;  Each object could be: l Ordinary object. Can not be seen from the other file or record; l Pointable object. Must be inherited from special class (StObject). It has 4 bytes ID. Could be seen from the other file;  The complexity of structure is based on two types of containers: l Structural container, which is owner of objects and delete them if deleted; l Reference container. Keeps only reference pointers to objects. Never deletes them.

STAR Victor Perevoztchikov, BNL ROOT2001 Main Feature continue.  Each object belongs to only one structural container;  Each object could belong to several reference containers;  Containers can contain other containers; Such structure could be very complicated and it is complicated in STAR case. Writing :  Structural container writes the objects and EUID;  Reference container writes only EUID of objects; Reading :  Structural container reads the objects and fills temporary table[EUID]=pointer;  Reference container reads EUID and gets pointer=table[EUID]; Updating :  During reading user can add sub-structure and write it. It could be back pointers to old sub-structures in it.

STAR Victor Perevoztchikov, BNL ROOT2001 Conclusions  The Persistent Pointers utility was developed in STAR on the base of ROOT I/O;  It is working and will be tested in upcoming production and reproduction in STAR;  Physicist can read only needed several Micro-DSTs simultaneously, like they were written together.  Each physicist can add his own Micro-DST with additional information, with no rewriting the old data.  We can keep only commonly used parts on disks, keeping all the rest on tapes;  There is no penalty in speed. It is even slightly (5%) faster than standard ROOT I/O.  It is not big. About 1000 lines of code.