INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org AMGA Metadata Server - Metadata Services in gLite (+ ARDA DB Deployment Plans with Experiments)

INFSO-RI Enabling Grids for E-sciencE
AMGA Metadata Server - Metadata Services in gLite (+ ARDA DB Deployment Plans with Experiments)
Birger Koblitz, NA4
3D Workshop, Oct 17th 2005, CERN

Slide 2: Grid Metadata Services
Metadata services on the Grid come in 2 flavours:
  - File metadata
  - Simple, generalized relational DB services
[Diagram: example from the EGEE-BioMed community. Recoverable labels: Files, LFN, Production, Images, GUID, Date, Patient ID, Doctor, Name, Hospital, Patient]

Slide 3: What is AMGA?
AMGA is the Metadata Catalogue for gLite:
- AMGA started out as ARDA's tool to investigate metadata access on the Grid
- AMGA will be in gLite release 1.5
- AMGA works in 2 modes:
  - Side by side with a File Catalogue (LFC): file metadata
  - Standalone: general relational data on the Grid
- AMGA has 2 front ends:
  - SOAP with PTF standardised interface
  - Text-based TCP streaming protocol (proprietary, documented)
- AMGA has ideas from many people: UK GridPP Metadata Group, GAG (HEP), gLite DM-team, PTF, LHCb

Slide 4: A Common Interface
- AMGA implements a common interface designed in close collaboration between the gLite and ARDA teams (P. Kunszt, R. Rocha, N. Santos, B.K.)
- Again: many ideas from UK GridPP Metadata group, LHCb (Bookkeeping, GANGA), GAG, PTF...
- Endorsed by PTF
- Design ideas:
  - Versatility: usable for HEP as well as BioMed (security)
  - Modular: interface for entry manipulation, schemas, security
    - Possible add-on to File Catalogue
  - Allows stateless & stateful implementations
  - Few requirements on back end, can be SQL-DB, XML...
- Description of WSDL at

Slide 5: DB Access on the Grid
Traditional DB access doesn't work on the Grid:
- "Traditional" way (ODBC, JDBC, ...): Client Application -> API -> SQL via ODBC, JDBC, proprietary protocols -> Server -> SQL-DB
  + Performance
  + Simple implementation
  − Security, monitoring
  − Authentication, resource management??
- "Service" (LFC, AMI, RefDB, ...): Client Application -> API -> SOAP, XML-RPC, Text -> DB-Service -> SQL -> Server -> SQL-DB
  + Lightweight client
  + Security: GSI, x509
  − Performance
  − Implementation: state
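The "Service" pattern on this slide can be illustrated with a minimal sketch: the client sends named text commands together with an authenticated identity, and only the service talks SQL to the database. This is not AMGA's implementation or protocol; the class `MetadataService`, its command names, the reply strings and the table layout are all hypothetical, and an in-memory SQLite database stands in for the back end.

```python
import sqlite3


class MetadataService:
    """Toy DB service: clients send named requests, never raw SQL or DB credentials.

    Hypothetical illustration only; not the AMGA server or its protocol.
    """

    def __init__(self, authorized_subjects):
        self.authorized = set(authorized_subjects)
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE entries (name TEXT PRIMARY KEY, i INTEGER)")

    def handle(self, subject, command, *args):
        # Authorization is decided in the service from the caller's identity
        # (here a certificate subject string), not by the database.
        if subject not in self.authorized:
            return "ERROR permission denied"
        if command == "addentry":
            name, i = args
            self.db.execute("INSERT INTO entries VALUES (?, ?)", (name, int(i)))
            return "OK"
        if command == "find":
            (i,) = args
            rows = self.db.execute(
                "SELECT name FROM entries WHERE i = ? ORDER BY name", (int(i),)
            )
            return "OK " + " ".join(name for (name,) in rows)
        return "ERROR unknown command"


service = MetadataService({"/DC=ch/CN=alice"})
print(service.handle("/DC=ch/CN=alice", "addentry", "lfn-2.dat", "2"))
print(service.handle("/DC=ch/CN=alice", "find", "2"))
print(service.handle("/DC=ch/CN=mallory", "find", "2"))
```

The trade-off listed on the slide is visible even at this scale: the client stays lightweight and security is enforced in one place, at the cost of an extra hop and of state kept in the service.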

Slide 6: Access Control on the Grid
Access control to resources on the Grid is done via a Virtual Organization Membership Service (VOMS):
[Diagram labels: VOMS; Authenticate with X509 Cert; VOMS-Cert with Group & Role information; VOMS-Cert; Resource management; AMGA; Oracle]

Slide 7: Security Concepts
- Security very important for BioMed, less for HEP
- Security ↔ Speed
- Standalone catalogue has:
  - ACLs for dirs and Unix permissions for dirs/entries
  - Built-in group management as in AFS
- AMGA + LFC back end:
  - Posix ACLs + Unix permissions for dirs/entries (ACLs currently not checked: slow!)
  - Users/groups via VOMS
- Currently no security on attribute basis
  - AMGA allows views to be created: safer, faster, similar to RDBMS
- Security tested by GILDA team for standalone catalogue; they liked the built-in group management & ACLs, but we need feedback from BioMed!
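How Unix permissions and group ACLs might combine for a directory or entry can be sketched as follows. This is an assumption-laden illustration, not AMGA's actual checking logic: the `Node` class, the `allowed` function and the rule that ACL entries only add rights are all hypothetical.

```python
from dataclasses import dataclass, field

READ, WRITE = 4, 2  # Unix-style permission bits


@dataclass
class Node:
    """A directory or entry with owner, group, Unix mode bits and an optional ACL."""
    owner: str
    group: str
    mode: int                                   # e.g. 0o640
    acl: dict = field(default_factory=dict)     # group name -> permission bits


def allowed(node, user, groups, wanted):
    """Return True if `user` (a member of `groups`) holds all `wanted` bits on `node`."""
    if user == node.owner:
        bits = (node.mode >> 6) & 7
    elif node.group in groups:
        bits = (node.mode >> 3) & 7
    else:
        bits = node.mode & 7
    # Assumption of this sketch: ACL entries can only grant additional rights.
    for group in groups:
        bits |= node.acl.get(group, 0)
    return (bits & wanted) == wanted


image_meta = Node(owner="alice", group="biomed", mode=0o640, acl={"doctors": READ})
print(allowed(image_meta, "alice", set(), READ | WRITE))
print(allowed(image_meta, "carol", {"doctors"}, READ))
print(allowed(image_meta, "eve", set(), READ))
```

Checks of this kind run per directory or entry, which is consistent with the slide's remark that security trades off against speed.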

Slide 8: AMGA in Use
AMGA in preproduction within several projects:
- LHCb and ATLAS: GANGA
- LHCb Logging and Bookkeeping
- EGEE BioMed applications
  - Highly secure access to medical images metadata
- Generic applications:
  - Metadata for EGEE-GILDA media library
  - UNOSAT project: used side by side with LFC catalogue for file metadata of satellite images

Slide 9: AMGA Implementation
AMGA Implementation:
  - SOAP and Text frontends
  - Supports single calls, sessions & connections
  - SSL security with grid certs
  - PostgreSQL, Oracle, MySQL, SQLite backends
  - Works alongside LFC
  - C++, Java, Python clients
See & download at project-arda-dev/metadata/
[Architecture diagram. Recoverable labels: Client Application, C++-API, JavaAPI, Security wrapper (GSI, SSL), TEXT, SOAP, XML, Firewall, MD-Server, Command, Asynchr. Buffer, SQL, ODBC, Server, PostgreSQL, File]

Slide 10: Ongoing Work: Replication
PhD (N. Santos) focuses on replication of AMGA at application level:
- Goal
  - Fault-tolerance, scalability, reliability with minimum administration overhead
- Ideas
  - Replication
    - Partial, only collections that are needed locally
    - Master-slave, asynchronous
  - P2P techniques for managing dynamic network of nodes
  - Discovery and location of metadata collections using P2P techniques
- Current status:
  - Initial work on AMGA
  - Prototype using PostgreSQL replication working
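The combination of partial and asynchronous master-slave replication listed above can be sketched in a few lines. This is only an illustration of the idea, not the design pursued in the PhD work or in AMGA; the `Master` and `Slave` classes, the update log and the pull-based `sync` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    data: dict = field(default_factory=dict)   # (collection, key) -> attributes
    log: list = field(default_factory=list)    # ordered update log

    def put(self, collection, key, attrs):
        # The master commits locally and only records the update for later shipping.
        self.data[(collection, key)] = dict(attrs)
        self.log.append((collection, key, dict(attrs)))


@dataclass
class Slave:
    collections: set                            # collections replicated locally
    data: dict = field(default_factory=dict)
    applied: int = 0                            # position reached in the master log

    def sync(self, master):
        """Pull pending log records, apply the subscribed ones, return how many."""
        count = 0
        for collection, key, attrs in master.log[self.applied:]:
            if collection in self.collections:  # partial replication
                self.data[(collection, key)] = dict(attrs)
                count += 1
        self.applied = len(master.log)
        return count


master = Master()
tier1 = Slave(collections={"/lhcb/jobs"})
master.put("/lhcb/jobs", "job-1", {"status": "done"})
master.put("/biomed/images", "img-1", {"patient": 7})
print(tier1.data)            # still empty: replication is asynchronous
print(tier1.sync(master))    # applies only the /lhcb/jobs update
```

Because the slave applies updates only when it syncs, reads on the slave can be stale between syncs; that is the usual price of asynchronous replication.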

Slide 11: Plans
ARDA Database Deployment Plans and Replication Test Plans

Slide 12: ARDA Deployment Plans
Analysis prototypes developed by HEP experiments & ARDA currently foresee:
- LHCb:
  - Logging & Bookkeeping DB replicated >=2x
    - AMGA + DB: Oracle
    - Could use Oracle Streams or AMGA native replication
  - GANGA analysis front-end plans to deploy AMGA + Postgres back-end once per Tier1
    - Replication scheme unclear, data not persistent
- ATLAS:
  - GANGA analysis front-end using AMGA, DB back-end unknown (currently PostgreSQL)
- CMS:
  - Central dashboard DB at CERN for monitoring data aggregation
    - Currently PostgreSQL, PHP access

Slide 13: Deployment Plan Continued
- Generic AMGA for BioMed and other communities will be deployed by EGEE deployment teams
- Rough overview with numbers:

Slide 14: Testing Plans
ARDA would like to test Oracle Streams replication in a simple setup at CERN for AMGA:
  - Test replication functionality, in particular to replicate schema changes
  - Understand performance requirements
➔ Verify Oracle Streams as replication solution for AMGA
[Diagram labels: AMGA, ORACLE, O-Streams]

Slide 15: Summary
- With AMGA, gLite will get a generic metadata service
- Implements most of the PTF Metadata Interface
- AMGA has seen heavy performance/stability testing
- AMGA is currently evaluated / in preproduction for LHCb (GANGA, bookkeeping), BioMed, UNOSAT, ESR
- ARDA would like to test AMGA replication via Oracle Streams
- Replication of AMGA at application level planned

Slide 16: Basic Concepts
- Entry
  - Has key (unique string) and attributes
- Attribute
  - Has name (string), type (depends on backend, support for basic types)
  - Belongs to schema
  - An entry in a schema has a value for each attribute
- Schema (in AMGA: directory)
  - Has name and list of attributes
  - In AMGA: every entry belongs to one schema, schemas are hierarchical: /collaboration1/jobs
- Query
  - SELECT ... WHERE ... clause in SQL-like query language
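The relationship between entries, attributes, schemas and queries can be made concrete with a small in-memory model. It is a sketch under stated assumptions rather than AMGA's data structures: the `Schema` class and its methods are hypothetical, Python types stand in for back-end types, a Python predicate stands in for the SQL-like WHERE clause, and unset attributes are stored as `None`.

```python
class Schema:
    """A schema (in AMGA terms, a directory): a name plus a list of typed attributes."""

    def __init__(self, name, attributes):
        self.name = name
        self.attributes = dict(attributes)   # attribute name -> Python type
        self.entries = {}                     # entry key -> {attribute name: value}

    def add_attr(self, name, type_):
        # Every existing entry gets a (still unset) value for the new attribute.
        self.attributes[name] = type_
        for values in self.entries.values():
            values[name] = None

    def add_entry(self, key, **values):
        if key in self.entries:
            raise KeyError(f"duplicate entry key: {key}")
        unknown = set(values) - set(self.attributes)
        if unknown:
            raise ValueError(f"unknown attributes: {sorted(unknown)}")
        row = {}
        for name, type_ in self.attributes.items():
            value = values.get(name)
            row[name] = None if value is None else type_(value)
        self.entries[key] = row

    def find(self, predicate):
        """Return the keys of all entries whose attribute values satisfy `predicate`."""
        return sorted(key for key, row in self.entries.items() if predicate(row))


jobs = Schema("/collaboration1/jobs", {"i": int, "t": str})
jobs.add_entry("lfn-0.dat")
jobs.add_entry("lfn-2.dat", i=2, t="A test")
jobs.add_attr("f", float)
print(jobs.find(lambda row: row["i"] == 2))
```

Keeping the attribute list on the schema rather than on each entry is what guarantees that every entry in a schema has a value slot for each attribute.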

Slide 17: Example
Example command line session with LFC back end:

    mdclient -p8822 lxb0709
    Connected to lxb0709:8822
    ARDA Metadata Server
    Query> dir /
    >> grid
    >> collection
    Query> dir /grid/arda
    >> lfn-0.dat
    [... rest of LFC entries]
    Query> schema_create /grid/arda i int t text
    Query> listattr /grid/arda
    >> i
    >> int
    >> t
    >> text
    Query> addentries /grid/arda/lfn-0.dat /grid/arda/lfn-1.dat
    Query> listentries /grid//arda
    >> lfn-0.dat
    >> lfn-1.dat
    Query> addentry /grid/arda/lfn-4.dat i 2 t 'A test'
    Query> listentries /grid/arda
    >> lfn-0.dat
    >> lfn-1.dat
    >> lfn-2.dat
    Query> addattr /grid/arda f float
    Query> find /grid/arda/* 'i=2'
    >> lfn-2.dat
    Query> addentries /grid/dteam/arda/bla.dat
    Error 1: File or directory not found: /grid/arda/bla.dat
    Query>