Andrea Valassi (CERN IT-DB)CHEP 2004 Poster Session (Thursday, 30 September 2004) 1 HARP DATA AND SOFTWARE MIGRATION FROM TO ORACLE Authors: A.Valassi,

Slides:



Advertisements
Similar presentations
Module 13: Performance Tuning. Overview Performance tuning methodologies Instance level Database level Application level Overview of tools and techniques.
Advertisements

Chapter 10: Designing Databases
CERN - IT Department CH-1211 Genève 23 Switzerland t LCG Persistency Framework CORAL, POOL, COOL – Status and Outlook A. Valassi, R. Basset,
1 Databases in ALICE L.Betev LCG Database Deployment and Persistency Workshop Geneva, October 17, 2005.
Manish Bhide, Manoj K Agarwal IBM India Research Lab India {abmanish, Amir Bar-Or, Sriram Padmanabhan IBM Software Group, USA
Objectivity Data Migration Marcin Nowak, CERN Database Group, CHEP 2003 March , La Jolla, California.
A Comparison of Database Software CS 616 April 8, 2004 Team 7 Mandar Patankar Jonathan Cohen B. Timothy Walsh.
Jeremy Boyd Director – Mindscape MSDN Regional Director
F Fermilab Database Experience in Run II Fermilab Run II Database Requirements Online databases are maintained at each experiment and are critical for.
DATABASE MANAGEMENT SYSTEM ARCHITECTURE
ETL By Dr. Gabriel.
CERN - IT Department CH-1211 Genève 23 Switzerland t Partitioning in COOL Andrea Valassi (CERN IT-DM) R. Basset (CERN IT-DM) Distributed.
Digital Object: A Virtual Online Storage Solution 598C Course Project Huajing Li.
Andrea Valassi (CERN IT-SDC) DPHEP Full Costs of Curation Workshop CERN, 13 th January 2014 The Objectivity migration (and some more recent experience.
The Design Discipline.
2/10/2000 CHEP2000 Padova Italy The BaBar Online Databases George Zioulas SLAC For the BaBar Computing Group.
Database testing Prepared by Saurabh sinha. Database testing mainly focus on: Data integrity test Data integrity test Stored procedures test Stored procedures.
Chapter Oracle Server An Oracle Server consists of an Oracle database (stored data, control and log files.) The Server will support SQL to define.
1 Robert Wijnbelt Health Check your Database A Performance Tuning Methodology.
CERN - IT Department CH-1211 Genève 23 Switzerland t Tier0 database extensions and multi-core/64 bit studies Maria Girone, CERN IT-PSS LCG.
Irina Sourikova Brookhaven National Laboratory for the PHENIX collaboration Migrating PHENIX databases from object to relational model.
Conditions DB in LHCb LCG Conditions DB Workshop 8-9 December 2003 P. Mato / CERN.
Lecture 12 Designing Databases 12.1 COSC4406: Software Engineering.
1 Introduction to ADO.NET Microsoft ADO.NET 2.0 Step by Step Rebecca M Riordan Microsoft Press, 2006.
ALMA Integrated Computing Team Coordination & Planning Meeting #1 Santiago, April 2013 Relational APDM & Relational ASDM models effort done in online.
Software Solutions for Variable ATLAS Detector Description J. Boudreau, V. Tsulaia University of Pittsburgh R. Hawkings, A. Valassi CERN A. Schaffer LAL,
Searching Business Data with MOSS 2007 Enterprise Search Presenter: Corey Roth Enterprise Consultant Stonebridge Blog:
CHEP 2006, Mumbai13-Feb-2006 LCG Conditions Database Project COOL Development and Deployment: Status and Plans Andrea Valassi On behalf of the COOL.
Intro – Part 2 Introduction to Database Management: Ch 1 & 2.
Francesco Rizzo (ISTAT - Italy) SDMX ISTAT FRAMEWORK GENEVE May 2007 OECD SDMX Expert Group.
Development of Hybrid SQL/NoSQL PanDA Metadata Storage PanDA/ CERN IT-SDC meeting Dec 02, 2014 Marina Golosova and Maria Grigorieva BigData Technologies.
ALICE, ATLAS, CMS & LHCb joint workshop on
The Persistency Patterns of Time Evolving Conditions for ATLAS and LCG António Amorim CFNUL- FCUL - Universidade de Lisboa A. António, Dinis.
INNOV-10 Progress® Event Engine™ Technical Overview Prashant Thumma Principal Software Engineer.
DATABASE MANAGEMENT SYSTEM ARCHITECTURE
ADABAS Versus DB2 An Evaluative Study by Butler Bloor Group.
CERN - IT Department CH-1211 Genève 23 Switzerland t COOL Conditions Database for the LHC Experiments Development and Deployment Status Andrea.
Bernhard Düchting Principal Sales Consultant BU Database - Migration Oracle Germany.
Chapter 8 Physical Database Design. Outline Overview of Physical Database Design Inputs of Physical Database Design File Structures Query Optimization.
Development of the CMS Databases and Interfaces for CMS Experiment: Current Status and Future Plans D.A Oleinik, A.Sh. Petrosyan, R.N.Semenov, I.A. Filozova,
Claudio Grandi INFN-Bologna CHEP 2000Abstract B 029 Object Oriented simulation of the Level 1 Trigger system of a CMS muon chamber Claudio Grandi INFN-Bologna.
Oracle for Physics Services and Support Levels Maria Girone, IT-ADC 24 January 2005.
Database Performance Eric Grancher - Nilo Segura Oracle Support Team IT/DES.
Overview of C/C++ DB APIs Dirk Düllmann, IT-ADC Database Workshop for LHC developers 27 January, 2005.
Scalable data access with Impala Zbigniew Baranowski Maciej Grzybek Daniel Lanza Garcia Kacper Surdy.
The DCS Databases Peter Chochula. 31/05/2005Peter Chochula 2 Outline PVSS basics (boring topic but useful if one wants to understand the DCS data flow)
G.Govi CERN/IT-DB 1GridPP7 June30 - July 2, 2003 Data Storage with the POOL persistency framework Motivation Strategy Storage model Storage operation Summary.
Technology Drill Down: Windows Azure Platform Eric Nelson | ISV Application Architect | Microsoft UK |
Conditions Database Status and Plans for 2005 Andrea Valassi (CERN IT-ADC) LCG Applications Area Review 31 March 2005.
Hello world !!! ASCII representation of hello.c.
3 Copyright © 2006, Oracle. All rights reserved. Designing and Developing for Performance.
Evaluation of the C++ binding to the Oracle Database System Dirk Geppert and Krzysztof Nienartowicz, IT/DB CERN IT Fellow Seminar November 20, 2002.
Oracle for Physics Services and Support Levels Maria Girone, IT-ADC 6 April 2005.
CTA: CERN Tape Archive Rationale, Architecture and Status
Advanced Higher Computing Science
(on behalf of the POOL team)
CMS High Level Trigger Configuration Management
Controlling a large CPU farm using industrial tools
The COMPASS event store in 2002
POOL persistency framework for LHC
Dirk Düllmann CERN Openlab storage workshop 17th March 2003
Conditions Data access using FroNTier Squid cache Server
LCG Monte-Carlo Events Data Base: current status and plans
CTA: CERN Tape Archive Overview and architecture
Database Management  .
Data Lifecycle Review and Outlook
LHCb Detector Description Framework Radovan Chytracek CERN Switzerland
Overview of big data tools
Andrea Valassi Pere Mato
LHCb Detector Description Framework Radovan Chytracek CERN Switzerland
Presentation transcript:

Andrea Valassi (CERN IT-DB)CHEP 2004 Poster Session (Thursday, 30 September 2004) 1 HARP DATA AND SOFTWARE MIGRATION FROM TO ORACLE Authors: A.Valassi, D.Geppert, M. Lübeck, K. Nienartowicz, M. Nowak (CERN IT-DB, Geneva, Switzerland), E. Tcherniaev (CERN PH, Geneva, Switzerland), D. Kolev (University of Sofia, Bulgaria) MOTIVATION End of Objectivity support at CERN in 2003 THREE PHASES 1 - Raw event data migration 2 - Event collection metadata migration and development of Oracle navigation software 3 - Condition data and software migration SCOPE 1 - Migrate HARP data from Objectivity to the new persistency solutions based on Oracle 2 - Develop read-only Oracle software to allow event processing in the HARP Gaudi framework ……..……………….. ……xxxxxxxxxxxxxxx xxxxxxxxxx………….. ………………….…… ….……xxxxxxxxxxxx xxxxxxxxxxxxxxx..… ………………………. ……….. /castor/xxx/PartialRun raw ……..……………….. ………..…xxxxxxxxxx xxxxxxxxxxxxxxxxx… ………..……………… ……xxxxxxxxxxxxxxx xxxxxx..……...……… …………….….….… ………… /castor/xxx/PartialRun raw PHASES 1 AND 2 (EVENT DATA): HYBRID DATA STORE Raw data in flat files (DATE format) Metadata in Oracle Similarities to the POOL data model Phase 2: Migration of event collection metadata Phase 1: Migration of raw event data and metadata Each run contains two ‘partial runs’ (two event builders were used to process the events from each run) A partial run is the simplest self contained set of spills and events (one partial run per input/output file) ORACLE FEATURES USED Integrity constraints (foreign keys) Views, materialized views, query rewrite Logical and physical partitioning for all tables and materialized views (separate tablespaces for different table partitions) Indexes (local/global, b-tree/bitmap) C++ binding: OCCI client library (use SQL bind variables and bulk operations) Example of materialized view: summary table for queries such as - eventsInRun() - physEventsInRun() - calibEventsInRun() DATA VOLUMES FOR EVENT DATA 30 TB raw data on CASTOR tapes 30k partial runs and files (~20k runs) 800M events (rows in Oracle event table) 100 GB event metadata in Oracle In addition: 100 GB Oracle indexes The idea of storing also the DATE binary records as BLOBs in the Oracle database had been initially considered… but what would you gain? The content of the raw records is not meant to be queried using SQL anyway!

Andrea Valassi (CERN IT-DB)CHEP 2004 Poster Session (Thursday, 30 September 2004) 2 Finite State Machine for the two local pools on a migration node (optimize use of input tape drives and local CPU) Control flow Oracle data Bulk data files 4 Castor output pools HARP Oracle server (harpdb1/Linux) 4 Migration nodes (1 local Objectivity federation and 2 Castor input pools on each node) Finite State Machine for 2 local pools Oracle table rows (metadata) Flat binary files (raw DATE records) 1 file = 1 partial run Objectivity database files (metadata and raw DATE records) 1 file = 1 partial run Migration catalog Oracle server (Linux) Read/Update catalog of partial runs that remain to be migrated Control flow via LAM//MPI Migration manager (LAM/MPI master) Finite State Machine for full system ConditionsDB HARP software framework CondDBOracleCondDBObjy ORACLEOBJY ~1% code m_condDBMgr = CondDBOracleDBMgrFactory::createCondDBMgr(); ~1% code m_condDBMgr = CondDBObjyDBMgrFactory::createCondDBMgr(); ~99% code ICondDBMgr m_condDBMgr; m_condDBMgr->xxx(); PHASE 3: CONDITION DATA Data volume: < 10 GB Fast and rather painless: ~ 1 month No design needed, solution available: use Oracle implementation of the same abstract interface that was implemented in Objectivity Data migration: via simple dedicated tools (partly using the API, partly bypassing it) Software for read-only data access: only minor changes with respect to Objectivity SUMMARY Large data migration successfully completed! Hybrid data store with Oracle metadata navigation Data migrations may also be needed at LHC! Many thanks to: IT-DB, IT-DS, IT-ADC, Castor team, S. Giani, I.Papadopoulos CONDITIONS DATABASE API Metadata for condition data: data item name, validity interval [since,till], version Actual condition data encoded in binary records (BLOBs): strings (calibration file names), streamed vectors of numbers (beam or detector control parameters) PHASE 1: RAW EVENT DATA Data migrated: 30 TB raw data in 15 days Reuse tools and hardware from previous COMPASS migration (300 TB raw data) HARP specific preparation, cleanup and consistency checks: 2-3 months PHASE 2: EVENT COLLECTION METADATA Data volume: < 1 GB Complex data model: ~ 4-5 months Relational schema design (see diagram) Migration application for event collections also checks for internal consistency when ‘merging’ two partial runs Implementation of read-only navigation and selection criteria (performance optimization by the use of bind variables, result set caching and materialized views) Re-engineer event selection (move most algorithms to technology-independent PersistentModelAndSelections, tested also in adapted version of Objectivity code) Detailed comparison of Oracle and Objectivity results OracleEventModelSelectionSvc.EventRange = " "; OracleEventModelSelectionSvc.EventTimeRange = " ”; OracleEventModelSelectionSvc.EventAttributes = "0 1 2"; OracleEventModelSelectionSvc.DetectorsInEvent = " "; jobOptions.txt This is translated into an SQL query … WHERE (124 <= EVENT_ID AND EVENT_ID <= 245) AND … The selection is done on the Oracle server (via a b-tree index) rather than by a full scan with C++ event-by-event check as in Objectivity IEventSelection string describeEventSelection ( ) bool applySelection ( IEventHeader header ) OracleEventHeaderFactory void startLoop( IEventSelection selection ) OracleEventHeader* createOracleEventHeader ( ) This instantiates a transient C++ ‘header’ from the Oracle table row that is currently cached in C++ memory as a result of the SQL query uses Abstract interface Implementation implements PersistentModelAndSelections HarpEvent GAUDI OracleEventSelector OracleHarp ORACLE HarpDDCondDBOracle RFIO Old package New package (replaces Objy) New package (no equivalence) CASTOR tapes StorageTek 9940A CASTOR tapes StorageTek 9940B