Data production and virtualisation status

Data production and virtualisation status
Dag Toppe Larsen
Wrocław, 2013-10-07

Outline
- Data production status
- Virtualisation status
- Production database
- Data production script
- Web interface
- Plan forward
- Proposal for new production directory structure

Data production status
- Data production team: Dag, Bartek, Kevin
- So far this year: 17 mass productions, 72 test productions
- Castor sometimes slow and/or unresponsive
  - Typically lasts for a couple of days, then gets better
  - Also a problem for checking produced data, since nsls also hangs
  - Have contacted Castor support, but the problem often goes away before it can be properly diagnosed
  - Consider moving to EOS?
- Irregular batch queue
  - Number of simultaneously running jobs can vary between 10 and 3000
  - Depends on general batch system load and NA61 relative priority
  - Hard to plan/schedule data production
  - Usually not a big problem, but can be if data urgently needs processing
- Tried to exclusively use xRootd, but the nsls equivalent is slow on Castor
  - Have contacted IT: a limitation of the current implementation; a fix might be available on a time scale of ~6 months
  - Not a problem with EOS xRootd

Virtualisation status
- Have requested and obtained new "NA61" project on the final Lxcloud service
  - Same quota (200 VCPUs/instances) as before
  - Access controlled by new e-group "na61-cloud"
- Migration completed
  - A few minor issues had to be worked out with IT
- Latest software versions (13e legacy, v0r5p0) installed on CVMFS
- Mass production of BeBe160 has started
- Next step: compare output to the Lxbatch production
  - If results are comparable, declare "victory"
- Scripts for automated data production in "beta"
- Prototype data production web interface created

Production DB
- The production DB has grown a bit beyond what was originally intended
  - Complicated to access information from Castor and the bookkeeping DB
  - Elog data not always consistent (needs to be standardised)
  - Elog data needed as input for data production (magnetic field)
  - Difficult to work with the production information without a proper SQL database
- Created a sqlite DB with three tables: runs, productions and chunkproductions
  - Contains information about all runs as well as all produced chunks
- After importing information from the bookkeeping DB (elog), productions can be initiated without first querying the bookkeeping DB
- For transferring information back to the bookkeeping DB after production, I propose to run SQL queries from the bookkeeping DB to retrieve the relevant information
- Elog information imported; can (in principle) be used to select data for processing/analysis (e.g. trigger information?)

Production DB schema
- runs
  - All information for a given run
  - Information imported from elog via the bookkeeping DB
  - Primary key: run
  - Fields target, beam, momentum, year obtained from elog
- productions
  - All information for a given production
  - Combination of target, beam, momentum, year, key, legacy, shine, mode, os, source, type should be unique
  - Primary key: production (automatically generated ID)
- chunkproductions
  - All produced chunks; one row per produced chunk
  - Primary key: (production, run, chunk)
  - Table has the potential to contain on the order of ~10^6 rows
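The schema above can be sketched in sqlite as follows. This is an illustrative reconstruction, not the actual production DB: column lists are trimmed to the fields named on these slides, and an in-memory DB stands in for the real prod.db file.

```python
import sqlite3

# Sketch of the three-table schema described above; only fields named on
# the slides are included. "key" is quoted because it is an SQL keyword.
conn = sqlite3.connect(":memory:")  # the real DB lives in a file (prod.db)
conn.executescript("""
CREATE TABLE runs (
    run      INTEGER PRIMARY KEY,
    target   TEXT, beam TEXT, momentum TEXT, year TEXT
);
CREATE TABLE productions (
    production INTEGER PRIMARY KEY AUTOINCREMENT,    -- auto-generated ID
    target TEXT, beam TEXT, momentum TEXT, year TEXT, "key" TEXT,
    legacy TEXT, shine TEXT, mode TEXT, os TEXT, source TEXT, type TEXT,
    UNIQUE (target, beam, momentum, year, "key", legacy,
            shine, mode, os, source, type)
);
CREATE TABLE chunkproductions (
    production INTEGER, run INTEGER, chunk INTEGER,
    rerun INTEGER DEFAULT 0, status INTEGER,
    PRIMARY KEY (production, run, chunk)             -- composite key
);
""")
conn.execute("INSERT INTO runs (run, target, beam, momentum, year) "
             "VALUES (?, ?, ?, ?, ?)", (12345, "Be", "Be", "158", "11"))
print(conn.execute("SELECT count(*) FROM runs").fetchone()[0])
```

The UNIQUE constraint on productions encodes the "combination should be unique" rule directly in the schema, so duplicate productions are rejected by the DB itself.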

runs table
- Contains all information for a given run
- All elog information for the run is imported
- Elog information is used to fill the fields target, beam, momentum, year and magnet
  - Some normalising required to reduce elog entropy
- Separate elog_* fields contain the elog entries as extracted from the elog
  - Can be used with SQL select queries, but the entropy sometimes makes this challenging
  - Could be interesting to add additional fields containing "standardised" elog values, easier to select with SQL
- Are there any further elog entries that are missing in the table?
- Have imported all raw files found on Castor
  - About 369 chunks do not have elog entries (mostly test runs):

    sqlite3 prod.db "select count(*) from runs where target=''"
    369

runs.beam
- Contains the beam type for a given run
- Derived from elog_beam_type
- Not too much entropy (below), but some standardisation required
- Used to determine the "reaction"

sqlite3 prod.db "select elog_beam_type, count(*) from runs group by elog_beam_type"
|369
Be|1209
Be |15
K-|89
No beam|6
None|314
Pb|31
Pb fragment|223
h|8
h |23
h+|168
h-|221
p|3694
pi-|46

runs.target
- Contains the target type of a given run
- Derived from elog_target_type
- Some entropy (below), standardisation required
- Field used to determine the "reaction"

sqlite3 prod.db "select elog_target_type, count(*) from runs group by elog_target_type"
|369
2C target IN|116
2C target OUT|34
Be target IN|1003
Be target IN |11
Be target OUT|203
Be target OUT |4
C target IN|844
C target OUT|68
C_2cm target IN|106
C_2cm target OUT|28
C_2cm_target_IN|12
C_2cm_target_OUT|1
Empty target|9
LH full target|20
LH target EMPTY|80
LH target Empty|2
LH target FULL|173
LH target EMPTY|362
LH target FULL|1638
LH_target_EMPTY|1
LH_target_FULL|2
Long target|82
None|585
Pb|4
Pb brick IN|33
Pb target IN|500
Pb target OUT|121
Target holder|1
targetholder IN|4

runs.momentum
- Contains the beam momentum for a given run
- Derived from elog_beam_momentum
- Some entropy (below), but not too much
  - Assume 30 GeV != 31 GeV
  - Assume 75 GeV != 80 GeV
- Field used to determine the reaction

sqlite3 prod.db "select elog_beam_momentum, count(*) from runs group by elog_beam_momentum"
|372
0 GeV/c|313
10 GeV/c|15
100 GeV/c|14
120 GeV/c|161
13 GeV/c|1137
13 GeV/c |15
158 GeV/c|2387
20 GeV/c|371
30 GeV/c|193
31 GeV/c|558
31GeV/c|345
350 GeV/c|62
40 GeV/c|83
40GeV/c|114
75 GeV/c|13
80 GeV/c|263
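The elog entropy shown in these dumps can be reduced with a small normalisation step. A sketch follows; the alias tables are illustrative, built only from the value dumps on these slides, and which spelling counts as "canonical" is an assumption made here for the example.

```python
def standardise(value, aliases):
    """Collapse whitespace variants, then map known aliases to a
    canonical form; unknown values pass through unchanged."""
    v = " ".join(value.split())          # strips trailing/duplicate spaces
    return aliases.get(v, v)

# Hypothetical alias tables derived from the dumps above.
MOMENTUM_ALIASES = {"31GeV/c": "31 GeV/c", "40GeV/c": "40 GeV/c"}
TARGET_ALIASES = {
    "C_2cm_target_IN":  "C_2cm target IN",
    "C_2cm_target_OUT": "C_2cm target OUT",
    "LH_target_EMPTY":  "LH target EMPTY",
    "LH target Empty":  "LH target EMPTY",
    "LH_target_FULL":   "LH target FULL",
}

print(standardise("Be target IN ", TARGET_ALIASES))   # trailing space removed
print(standardise("31GeV/c", MOMENTUM_ALIASES))       # mapped to "31 GeV/c"
```

Running the standardised values into the extra "standardised" fields proposed on the runs-table slide would make SQL selects straightforward, while the raw elog_* fields stay untouched.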

productions table
- A unique combination of target, beam, momentum, year, key, legacy, shine, mode, os, source, type is a production
- Primary key: production
  - Auto-generated unique number
- Fields (with examples):
  - production: e.g. 1
  - target: e.g. Be
  - beam: e.g. Be
  - momentum: e.g. 158
  - year: e.g. 11
  - key: e.g. 040
  - legacy: e.g. 13c
  - shine: e.g. v0r5p0
  - mode: e.g. pp
  - os: e.g. slc5
  - source: e.g. phys (sim)
  - type: e.g. prod (test)
  - path_in: path of the raw files used
  - path_out: path for output files
  - path_layout: how output files are stored under path_out
  - description: free text
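The "unique combination" rule can be enforced by the DB itself with a UNIQUE constraint, so a duplicate registration fails instead of silently creating a second production. A minimal sqlite demo, with the table trimmed to the key fields (path_* and description omitted):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "key" is quoted because it is an SQL keyword.
conn.execute("""CREATE TABLE productions (
    production INTEGER PRIMARY KEY AUTOINCREMENT,
    target TEXT, beam TEXT, momentum TEXT, year TEXT, "key" TEXT,
    legacy TEXT, shine TEXT, mode TEXT, os TEXT, source TEXT, type TEXT,
    UNIQUE (target, beam, momentum, year, "key", legacy,
            shine, mode, os, source, type))""")

row = ("Be", "Be", "158", "11", "040", "13c", "v0r5p0", "pp",
       "slc5", "phys", "prod")
insert = ('INSERT INTO productions (target, beam, momentum, year, "key",'
          ' legacy, shine, mode, os, source, type)'
          ' VALUES (?,?,?,?,?,?,?,?,?,?,?)')
conn.execute(insert, row)
try:
    conn.execute(insert, row)            # same combination a second time
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False           # rejected, as intended
print(duplicate_accepted)                # False
```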

chunkproductions table
- Stores all chunks produced
- Associated to production, run and chunk
- Has the potential to contain on the order of 10^6 rows
  - By far the largest table in the DB
  - Potential performance issue
  - Only uses numerical values
- Fields (with examples):
  - production: e.g. 1
  - run: e.g. 123456
  - chunk: e.g. 123
  - rerun: number of times the chunk has failed and been reprocessed
  - status: waiting / processing / checking / ok / failed (numeric values)
  - size_*: sizes of output files
  - error_*: number of errors of the given type found in the log file from the latest processing
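Since status is stored numerically, a mapping between state names and codes is needed on the script side. A sketch; the five states come from the slide, but the particular numeric values are assumptions:

```python
# Hypothetical numeric encoding of the chunk states listed above.
STATUS = {"waiting": 0, "processing": 1, "checking": 2, "ok": 3, "failed": 4}
STATUS_NAME = {code: name for name, code in STATUS.items()}

def chunks_to_reprocess(rows):
    """Given (run, chunk, status) tuples, return the chunks whose last
    processing failed and therefore need resubmission (cf. 'reproduce')."""
    return [(run, chunk) for run, chunk, status in rows
            if status == STATUS["failed"]]

rows = [(12345, 1, STATUS["ok"]), (12345, 2, STATUS["failed"])]
print(chunks_to_reprocess(rows))   # [(12345, 2)]
```

Keeping the table purely numeric keeps rows small, which matters for a table that may reach ~10^6 rows.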

Magnetic field
- Originally planned to store this information in a separate field in the runs table (extracted from elog)
- Needed for KEY5 and residual corrections
- However, Seweryn has now added this information to the same database as the global key (though it is not part of the global key)
- Working on integrating this information into the production scripts
- Will make automatic data production much simpler

Database
- Currently using sqlite
  - Pro:
    - DB contained in a single file on the file system
    - No need to set up a database server
    - Everybody can easily access it with custom SQL queries
    - Open format/code, we "really" own the data
  - Con:
    - Not sure if performance will be an issue
    - Backup only via normal file system backup
- Have also tried the central Oracle database (na61_cloud@pdbr1)
  - Pro:
    - Better performance
    - Better backup
    - Better functionality
  - Con:
    - More complicated to access for everybody
    - Did not notice performance differences at the current DB size
    - May be forced to follow Oracle update cycles, a potential data preservation issue
- The SQL used for queries is compatible with both sqlite and Oracle
  - Exception: creation of tables
  - Should be possible to move to Oracle if performance becomes an issue
- All in all I feel sqlite is the best choice until it is proven to have performance issues

Automated data production script commands

./prodna61-produce.sh
Usage: ./prodna61-produce.sh <command>
<command> one of:
  reactions     - list all reactions in database
  productions   - list all productions in database

./prodna61-produce.sh <command> <path_in>
<command> one of:
  regreaction   - register all reactions found at path_in in database

./prodna61-produce.sh <command> <target> <beam> <momentum> <year> [<key> <legacy> <shine> <mode> <source> <path_in> <description>]
<command> one of:
  regproduction - register new production in database
  produce       - start new production
  check         - check production for errors and update database
  summary       - production summary
  reproduce     - reprocess chunks with errors for production
  okchunks      - list all OK chunks for production

Data production command usage

prodna61-produce.sh regreaction /afs/cern.ch/11/Be/Be160
- Registers all runs found at the path in the runs table
- Obtains run information from the bookkeeping database/elog
- Only has to be done once per reaction (path)

prodna61-produce.sh regproduction Be Be 158 11 040 13e v0r5p0 pp phys def "A new prod."
- Creates a new production in the productions table, and inserts a new row in the chunkproductions table for each chunk of the reaction
- "def" means "use the default value"; can be used for all parameters except the first
  - Typically takes the latest known value for the parameter
- Has to be done when a new reaction is to be processed

prodna61-produce.sh reactions
- Lists all reactions registered in the database

prodna61-produce.sh productions
- Lists all productions registered in the database

prodna61-produce.sh produce Be Be 158 11 040 13e v0r5p0 pp phys
- Creates job files and submits jobs for the reaction

prodna61-produce.sh check Be Be 158 11 040 13e v0r5p0 pp phys
- Checks which chunks were processed OK, and which need to be reprocessed

prodna61-produce.sh summary Be Be 158 11 040 13e v0r5p0 pp phys
- Writes a summary of the outcome of the check command

prodna61-produce.sh reproduce Be Be 158 11 040 13e v0r5p0 pp phys
- Resubmits the chunks the check command found to be not OK

prodna61-produce.sh okchunks Be Be 158 11 040 13e v0r5p0 pp phys
- Writes a list of chunks that are OK after the check command

Data production script status
- Can in most cases produce data
- Further work on standardisation of elog data needed
  - Sometimes reactions have "specialities" that have to be taken into account
- Lxbatch and CernVM versions have diverged a bit, need to be (re-)unified
- Could be nice to use key-value pairs for parameters
- Need to add the possibility to process a range of runs (test productions)
  - Not expected to be difficult
- Plan to soon use it for mass productions (both Lxbatch and CernVM)

Web interface (prototype)

Web interface (prototype)
- Web interface to the production DB: http://na61cld.web.cern.ch/na61cld/cgi-bin/start?reaction=Be|Be|158|11
- Experimenting with the best interface/usability for different use cases
  - Make it "intuitive" and easy to use
- Have not put effort into making it "look" good
  - But that should be easy to do; it relies on style sheets (CSS) for the design
- Think "reaction" and "production" are the main entities to build around
- Currently can only display information
  - Will add the ability to log in for starting productions, etc.
  - CERN single sign-on probably the best option
- Trying to create a script that will import information about already existing productions into the database
- Current implementation is slow since the DB is opened/closed for every query, but with proper language bindings the DB can be kept open for multiple queries
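The open/close overhead mentioned in the last point can be avoided by caching one live connection and reusing it for all queries. A sketch of that pattern (module-level cache; `:memory:` stands in here for the real prod.db file):

```python
import sqlite3

_conn = None

def get_conn(path=":memory:"):
    """Open the production DB once and reuse the connection for all
    subsequent queries, instead of opening/closing it per query."""
    global _conn
    if _conn is None:
        _conn = sqlite3.connect(path)
    return _conn

# Both calls return the same live connection object.
first = get_conn()
second = get_conn()
print(first is second)   # True
```

In a long-running web process this removes the per-request file open; in plain CGI, where the process exits after every request, a persistent application server (or connection pooling) would be needed to get the same benefit.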

General plan forward
- Complete CernVM BeBe160 test production
- Finish outstanding issues with the automatic production script
- Finalise web interface for data production
  - Add functionality, improve performance

Proposal for production directory structure (after moving to Shine)
- Preferably, all unique production parameters should be encoded in the path to avoid conflicts
- A deep directory structure is, however, undesirable
- Proposal: divide the directory path into four levels: "type", "reaction", "reconstruction conditions" and "file type":

  /castor/cern.ch/na61/<type>/<target>_<beam>_<momentum>_<year>/<key>_<shine>_<mode>_<os>_<source>/<file_type>/run-<run>x<chunk>.<file_type>

- Examples:

  /castor/cern.ch/na61/prod/Be_Be_158_11/040_v0r5p0_pp_slc5_phys/shoe.root/run-012345x678.shoe.root
  /castor/cern.ch/na61/test/LHT_p_158_11/020_v0r5p0_pp_cvm2_sim/log.bz2/run-987654x321.log.bz2

- Advantages:
  - Separates the test productions from the "real" productions
  - Easier to get an overview:
    - nsls /castor/cern.ch/na61/prod will show all existing reactions
    - nsls /castor/cern.ch/na61/prod/<reaction> will show all productions for the reaction
  - Parameters are organised in order of "importance"
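The proposed four-level layout can be captured in a small helper function; a sketch reproducing the examples above (the zero-padding widths for run and chunk are inferred from the example file names and are assumptions):

```python
def production_path(base, type_, target, beam, momentum, year,
                    key, shine, mode, os_, source, file_type, run, chunk):
    """Build the proposed path: <type>/<reaction>/<conditions>/<file_type>/file."""
    reaction = f"{target}_{beam}_{momentum}_{year}"
    conditions = f"{key}_{shine}_{mode}_{os_}_{source}"
    filename = f"run-{run:06d}x{chunk:03d}.{file_type}"   # assumed padding
    return "/".join([base, type_, reaction, conditions, file_type, filename])

print(production_path("/castor/cern.ch/na61", "prod", "Be", "Be", "158", "11",
                      "040", "v0r5p0", "pp", "slc5", "phys",
                      "shoe.root", 12345, 678))
```

Because every unique production parameter appears exactly once in the path, two productions can only collide if all their parameters agree, which is exactly the uniqueness rule of the productions table.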