Management of the Data in Auger. Jean-Noël Albert, LAL – Orsay, IN2P3 - CNRS. ASPERA – Oct 2010 - Lyon.


Management of the Data in Auger. Jean-Noël Albert, LAL – Orsay, IN2P3 - CNRS. ASPERA – Oct 2010 - Lyon

The Auger Experiment
- The largest Cosmic Rays Observatory
  - km² – in the Argentinian Pampa
  - Surface Detectors (Cherenkov)
  - 4 Fluorescence Telescopes
  - 6 x 4 Fluorescence Cameras
  - 10 % of the time (night – no moon)
  - Hybrid Events: SD + FD
  - Better determination of the energy
- High Energy Cosmic Rays
  - Energy
  - 1 / km² / century
  - Expected: 30 events/year
- Pierre Auger Observatory
  - Experimental Array:
  - Production Area:
  - Full production:
  - Observed UHECR: 50 (GZK effect)
[Photos: Fluorescence Telescope, Surface Detector]

Data Taking
- Local Data Taking
  - Radio transfer from the tanks and telescopes to the CDAS main building (Malargüe)
- Merged into Events
  - Root format
  - Immediate reconstruction of the event parameters
  - Streaming (Level 4 triggers): Hybrid, Gold Events
  - Copied to a data server for export
- Daily transfer to Lyon
  - 1 – 3.5 GB/day (no calibration)
  - Poor Pampa bandwidth, ~ 50 KB/s (see the estimate below)
  - Calibration data sent by disk every 2 months
[Map annotations: 1 hour by road; to Mendoza: 5 hours; to Paris: 15 hours]
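A rough consistency check, not from the original slide: at the quoted ~50 KB/s the Pampa link can carry at most about 4 GB per day, so the 1 – 3.5 GB/day event stream already nearly saturates it, which is why the bulkier calibration data travels by disk.

$$ 50\ \mathrm{KB/s} \times 86\,400\ \mathrm{s/day} \approx 4.3\ \mathrm{GB/day} $$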

Data Transfer
- The IN2P3 Computer Center is the main repository for the Auger data (Tier 0)
- The Pierre Auger Observatory data files are copied every night from the Malargüe server to CC-IN2P3 by batches running at Lyon
- The files are imported onto a large GPFS disk (20 TB)
- The files are duplicated in SRB (Storage Resource Broker) for external access by the collaboration sites
- Double backup by the CC team (one copy kept in a separate building)
[Diagram: Malargüe area (CDAS builder files, data server, daily event building and export batch, access reserved to the data transfer) and Lyon area (GPFS disk, SRB, HPSS for large files > 100 MB, Oracle database, external users); transfer runs every night]

Import Manager
- Manages the daily transfers of the data
  - Runs the import batches every night
  - Duplicates the imported files to SRB
  - Checks the MD5 keys of the imported files (see the sketch below)
  - Registers the status of the files in the Oracle database
  - Registers the status of the import batches in the database
- Xpir - expert system supervisor
  - Checks the submitted transfer batches
  - Starts transfers if new files are available at Malargüe
  - Fixes usual errors (stopped batches, files not saved, …)
- Implementation
  - Part of the AugerDb project
  - Java
  - Hibernate framework to access the database (object-relational mapping)
  - Jargon – SRB Java API
  - CLIPS: a rule-oriented language from NASA
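A minimal sketch of the MD5 verification step in plain Java, not taken from the AugerDb code: the file path, the command-line usage and the place the expected key comes from are assumptions for illustration; the real Import Manager records the outcome in Oracle through Hibernate.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

/** Recomputes the MD5 of an imported file and compares it to the expected key. */
public class Md5Check {

    /** Streams the file through MessageDigest so large raw-data files are never fully loaded in memory. */
    static String md5Of(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[1 << 20];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage: java Md5Check <imported-file> <expected-md5-recorded-at-Malargue>
        Path imported = Path.of(args[0]);
        String expected = args[1];
        boolean ok = md5Of(imported).equalsIgnoreCase(expected);
        // The real Import Manager would register this result in the Oracle database via Hibernate.
        System.out.println((ok ? "MD5 OK       " : "MD5 MISMATCH ") + imported);
    }
}
```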

SRB
- A Data Grid distributed file manager
  - Developed by the San Diego Supercomputer Center (SDSC)
  - Semi-transparent integration of the HPSS tape robot and of the disk pools
  - Transparent for the readers
  - Driven by resources for the writers (small files vs large files)
- Gives network-wide access to the data files
  - Stable, reliable, quite fast
  - Protected by password
  - Cross-platform implementation: Linux, Windows, Mac OS X
  - VERY simple to install (only a few executables to download)
- Auger usage
  - Read-only access for the Auger collaborators
  - A CC account is needed to get the authorization parameters

Web Pages for the Data Transfers
- Features
  - List of the most recent imports
  - Status of the imports
  - Status of the data files
  - Details of each import
- Implementation
  - Dynamic web pages built from database queries (see the sketch below)
  - AugerDb core
  - Java, JSP, Struts
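A minimal illustration of the kind of query behind the "most recent imports" page. The table and column names are invented for the example, the Oracle JDBC driver is assumed to be on the classpath, and the real AugerDb code goes through Hibernate and a Struts/JSP layer rather than raw JDBC.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/** Lists the most recent import batches, as the transfer-status web page would. */
public class RecentImports {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection parameters and schema (an import_batch table with date/status columns).
        String url = "jdbc:oracle:thin:@//dbhost.example.org:1521/AUGERDB";
        String sql =
            "SELECT batch_id, import_date, nb_files, status FROM (" +
            "  SELECT batch_id, import_date, nb_files, status" +
            "  FROM import_batch ORDER BY import_date DESC" +
            ") WHERE ROWNUM <= 20";  // show the 20 most recent imports
        try (Connection con = DriverManager.getConnection(url, "auger_ro", "secret");
             PreparedStatement ps = con.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                System.out.printf("%d  %s  %d files  %s%n",
                    rs.getLong("batch_id"), rs.getDate("import_date"),
                    rs.getInt("nb_files"), rs.getString("status"));
            }
        }
    }
}
```

In the actual pages a Struts action would run the equivalent query and a JSP would render the rows as the status table.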

Calibrations
- Calibration data transferred on an IDE hard disk every 2 months
  - Too large for the network
  - The link in the Pampa is limited to 50 KB/s (or less)
  - Experimental limit: 3.5 GB/day
  - Sent to LAL by UPS
  - Copied to Lyon over the network
  - Copied to SRB, checked, and registered in Oracle by the Import Manager
- Data size
  - 9.3 TB of data since 2001
  - 8 TB since 2006 (full activity of the observatory)
  - 1.2 TB of reconstructed data
  - Average: reconstructed data 850 MB/day; raw data + calibration 4.8 GB/day (see the check below)
  - Fluctuations related to the fluorescence activity (Moon periods)
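A quick cross-check of the quoted average, not in the original slide and assuming roughly 4.6 years between the start of full activity in 2006 and this talk:

$$ \frac{8\ \mathrm{TB}}{4.6\ \mathrm{yr} \times 365\ \mathrm{d/yr}} \approx \frac{8000\ \mathrm{GB}}{1700\ \mathrm{d}} \approx 4.7\ \mathrm{GB/day} $$

which is consistent with the quoted 4.8 GB/day for raw data plus calibration.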

Simulations
- 2 simulation activities
  - Simulation of the atmospheric showers: 2 Fortran programs, Aires and Corsika
  - Simulation of the response of the detector to a simulated shower: Auger analysis framework, C++ & Root
- Initially performed at Lyon
  - Still used by user productions
- Official libraries simulated on the Grid since 2008
- Simulated showers
  - user shower simulations
  - Corsika Grid showers (official libraries)
  - Large files – total of 20 TB (~ 250 MB/file)
  - Mainly Iron and Proton primaries
  - Ranges of energy and angles
- Simulated events [3 events / shower] (only for the official libraries)
  - Small files – total of 0.85 TB (~ 2 x 3 MB / file [simulation + reconstruction])
  - Support for the Hybrid events since 2009

Grid Activity
[Plot: Julio Lozano Bahilo, Granada (Malargüe, March 2010)]

Grid Activity
[Plot: Julio Lozano Bahilo, Granada (Malargüe, March 2010)]

Distribution of the Grid Jobs
[Plot: Jiri Chudoba, Prague (Lecce, June 2010)]
- Main contributor: Karlsruhe

Production Efficiency
[Plot: Jiri Chudoba, Prague (Lecce, June 2010)]
- Poor efficiency of the simulation jobs
- Too many I/O operations (?)

Simulation Issues
- Production issues
  - Difficulties in the deployment of the analysis environment
  - Data management (lost files, ownership [DN changes])
  - Manpower needed to follow the productions and perform quality checking
- Usage issues
  - Difficulties in accessing the files
  - A Grid certificate is needed
  - Access to a UI is needed (User Interface: a machine with the Grid environment)
  - Complexity of the commands to check and get the files
- Workaround
  - Mirror the Grid files to SRB

SRB Simulation Mirror
- Copy of the simulation files to SRB at Lyon
  - Performed by batches running at Lyon (see the sketch below)
  - Allows regulating the transfer flow using BQS resources
- Indexing of the simulations (Showers and Events) in the database
- Easy network-wide access using the SRB clients
- Duplication of the Simulated Event files to the GPFS disk
  - Immediate access for the CC users performing analyses
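A sketch of one iteration of such a mirror batch, not taken from the actual production scripts: it assumes the standard LCG and SRB command-line clients (lcg-cp, Sput) are installed and initialised, and every path and file name below is made up for illustration.

```java
import java.nio.file.Path;
import java.util.List;

/**
 * Mirrors one shower file from Grid storage into the SRB collection at Lyon.
 * The real batch would also verify the copy (MD5) and register the shower in the Oracle database.
 */
public class MirrorOneFile {

    /** Runs an external command and fails if it returns a non-zero exit code. */
    static void run(List<String> cmd) throws Exception {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        if (p.waitFor() != 0) {
            throw new RuntimeException("command failed: " + String.join(" ", cmd));
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical logical file name on the Grid and target SRB collection.
        String lfn = "lfn:/grid/auger/prod/corsika/shower_000001.part";
        Path local = Path.of("/scratch/mirror/shower_000001.part");
        String srbCollection = "/auger/home/simulations.auger/corsika";

        // 1. Copy the file from Grid storage to a local scratch area (LCG client, assumed available).
        run(List.of("lcg-cp", "--vo", "auger", lfn, "file://" + local));

        // 2. Push the local copy into SRB (Scommand client, assumed already configured with Sinit).
        run(List.of("Sput", local.toString(), srbCollection));

        // 3. The real batch would then check the MD5 and index the shower in the database.
        System.out.println("mirrored " + lfn + " -> " + srbCollection);
    }
}
```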

Simulation Library Pages
- List of the libraries
- Details of the simulation libraries
  - Number of simulations
  - Program and model
  - Date
  - Location
- Loaded from the database
  - Updated every 2 hours
- Implementation
  - AugerDb core
  - Java, JSP, Struts

Simulation Search Pages
- Searches by simulation parameters from the database (see the sketch below)
  - Simulation program
  - High energy model
  - Energy
  - Primary
  - Angles
- Support for shower simulations and simulated events
- Selection between the Grid simulations and the SRB mirrors
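A minimal sketch of the kind of parameterised query behind the search page. Again the table and column names (shower_library, he_model, primary_particle, …) and the example values are invented, and the real pages use the Hibernate mapping of the AugerDb core rather than raw JDBC.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/** Searches the simulation catalogue by program, model, primary and energy range. */
public class SimulationSearch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//dbhost.example.org:1521/AUGERDB";  // hypothetical
        String sql =
            "SELECT shower_id, program, he_model, primary_particle, log_energy, zenith, location " +
            "FROM shower_library " +
            "WHERE program = ? AND he_model = ? AND primary_particle = ? " +
            "AND log_energy BETWEEN ? AND ? ORDER BY log_energy";
        try (Connection con = DriverManager.getConnection(url, "auger_ro", "secret");
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, "CORSIKA");
            ps.setString(2, "QGSJET-II");
            ps.setString(3, "proton");
            ps.setDouble(4, 19.0);   // log10(E/eV) lower bound
            ps.setDouble(5, 19.5);   // log10(E/eV) upper bound
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%d %s %s %s logE=%.2f zenith=%.1f %s%n",
                        rs.getLong("shower_id"), rs.getString("program"),
                        rs.getString("he_model"), rs.getString("primary_particle"),
                        rs.getDouble("log_energy"), rs.getDouble("zenith"),
                        rs.getString("location"));
                }
            }
        }
    }
}
```

The "location" column stands in for the choice the page offers between the Grid copy and the SRB mirror of each simulation.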
