FP6−2004−Infrastructures−6-SSA-026409 www.eu-eela.org E-infrastructure shared between Europe and Latin America The AMGA metadata catalog with use cases.


FP6−2004−Infrastructures−6-SSA
E-infrastructure shared between Europe and Latin America

The AMGA metadata catalog with use cases
Javier Perez-Griffo
Extremadura Advanced Research Center (CETA-CIEMAT)
Seventh EELA Tutorial for Users
Merida, 9 November 2006

E-infrastructure shared between Europe and Latin America
Seventh EELA Tutorial for Users, Merida, 9 November 2006

Contents
Background and Motivation for AMGA
Interface, Architecture and Implementation
Metadata Replication on AMGA
Deployment Examples
GILDA Use cases

Metadata on the GRID
Metadata is data about data
On the Grid: information about files
  – Describe files
  – Locate files based on their contents
But also simplified DB access on the Grid
  – Many Grid applications need structured data
  – Many applications require only simple schemas
      • Can be modelled as metadata
  – Main advantage: better integration with the Grid environment
      • Metadata Service is a Grid component
      • Grid security
      • Hide DB heterogeneity

ARDA/gLite Metadata Interface
ARDA evaluated existing Metadata Services from HEP experiments
  – AMI (ATLAS), RefDB (CMS), AliEn Metadata Catalogue (ALICE)
  – Similar goals, similar concepts
  – Each designed for a particular application domain
      • Reuse outside intended domain difficult
  – Several technical limitations: large answers, scalability, speed, lack of flexibility
ARDA proposed an interface for Metadata access on the GRID
  – Based on requirements of LHC experiments
  – But generic: not bound to a particular application domain
  – Designed jointly with the gLite/EGEE team
  – Incorporates feedback from GridPP
Adopted as the official EGEE Metadata Interface
  – Endorsed by PTF (Project Technical Forum of EGEE)

AMGA Implementation
ARDA developed an implementation of the PTF interface
  – AMGA: ARDA Metadata Grid Application
Began as a prototype to evaluate the Metadata Interface
  – Evaluated by the community since the beginning:
      • LHCb and Ganga were early testers (more on this later)
  – Matured quickly thanks to user feedback
Now part of gLite middleware
  – Official Metadata Service for EGEE
  – First release with gLite 1.5
  – Planned for inclusion in gLite 3.1 (not present in gLite 3.0)
  – Also available as a standalone component
Expanding user community
  – HEP, Biomed, UNOSAT…

Metadata Concepts
Some Concepts
  – Metadata: list of attributes associated with entries
  – Attribute: key/value pair with type information
      • Type: the type (int, float, string, …)
      • Name/Key: the name of the attribute
      • Value: value of an entry's attribute
  – Schema: a set of attributes
  – Collection: a set of entries associated with a schema
  – Think of schemas as tables, attributes as columns, entries as rows
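
The concepts above can be pictured with a small in-memory model. This is only an illustrative sketch, not the AMGA API: the class names `Schema` and `Collection`, their methods, and the sample attribute names are all hypothetical, and the type list is limited to the three types named on the slide.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical mapping from the slide's type names to Python types.
TYPES = {"int": int, "float": float, "string": str}


@dataclass
class Schema:
    """A set of typed attributes (the 'columns' in the table analogy)."""
    attributes: Dict[str, str] = field(default_factory=dict)  # name -> type name

    def add_attribute(self, name: str, type_name: str) -> None:
        if type_name not in TYPES:
            raise ValueError(f"unknown type: {type_name}")
        self.attributes[name] = type_name


@dataclass
class Collection:
    """A set of entries (the 'rows') that all share one schema."""
    schema: Schema
    entries: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def add_entry(self, entry_name: str, **values: Any) -> None:
        row = {}
        for key, value in values.items():
            type_name = self.schema.attributes.get(key)
            if type_name is None:
                raise KeyError(f"attribute not in schema: {key}")
            # Coerce each value to the type declared in the schema.
            row[key] = TYPES[type_name](value)
        self.entries[entry_name] = row


schema = Schema()
schema.add_attribute("Author", "string")
schema.add_attribute("Year", "int")

songs = Collection(schema)
songs.add_entry("track01.mp3", Author="Some Author", Year="2006")
```

Here the entry name plays the role of the row key, and each attribute value is checked against the type declared in the schema before it is stored.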

AMGA DEMO with AMGA Web Interface

AMGA Features
Dynamic Schemas
  – Schemas can be modified at runtime by the client
      • Create, delete schemas
      • Add, remove attributes
Metadata organised as a hierarchy
  – Collections can contain sub-collections
  – Analogy to a file system:
      • Collection ↔ Directory; Entry ↔ File
Flexible Queries
  – SQL-like query language
  – Joins between schemas
  – Example:
      selectattr /gLibrary:FileName /gLAudio:Author /gLAudio:Album '/gLibrary:FILE=/gLAudio:FILE and like(/gLibrary:FileName, "%.mp3")'
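
For readers more used to SQL, the join in the example query can be approximated with two relational tables. The sketch below uses Python's standard sqlite3 module and invented sample rows. It assumes that each collection maps to one table and that FILE identifies the entry; neither assumption is stated on the slides, and this is not how AMGA itself necessarily translates the query.

```python
import sqlite3

# In-memory database standing in for the two collections in the example.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE gLibrary (FILE TEXT PRIMARY KEY, FileName TEXT);
    CREATE TABLE gLAudio  (FILE TEXT PRIMARY KEY, Author TEXT, Album TEXT);
    """
)

# Invented sample data, for illustration only.
conn.executemany(
    "INSERT INTO gLibrary VALUES (?, ?)",
    [("f1", "song.mp3"), ("f2", "notes.txt")],
)
conn.executemany(
    "INSERT INTO gLAudio VALUES (?, ?, ?)",
    [("f1", "Some Author", "Some Album"), ("f2", "Other Author", "Other Album")],
)

# Join the two tables on FILE and keep only names ending in .mp3,
# mirroring the condition in the selectattr example.
rows = conn.execute(
    """
    SELECT l.FileName, a.Author, a.Album
    FROM gLibrary AS l
    JOIN gLAudio AS a ON l.FILE = a.FILE
    WHERE l.FileName LIKE '%.mp3'
    """
).fetchall()
conn.close()
```

With this sample data only the first entry satisfies both the join condition and the file-name filter.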

Security
Unix style permissions
ACLs: per-collection or per-entry
Secure connections: SSL
Client authentication based on
  – Username/password
  – General X509 certificates
  – Grid-proxy certificates
Access control via a Virtual Organization Management System (VOMS)

AMGA Implementation
C++ multiprocess server
  – Runs on any Linux flavour
Backends
  – Oracle, MySQL, PostgreSQL, SQLite
Two frontends
  – TCP Streaming
      • High performance
      • Client API for C++, Java, Python, Perl, Ruby
  – SOAP
      • Interoperability
Also implemented as a standalone Python library
  – Data stored on the filesystem

Architecture
TCP-Streaming frontend
Designed for scalability
  – Asynchronous operation
      • Reading from DB and sending data to client
  – Response sent to client in chunks
      • No limit on the maximum response size
Example: TCP Streaming
  – Text based protocol (like SMTP, POP3, …)
  – Response streamed to client
      Client: listattr entry
      Server: 0 entry value1 value2 …
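
The benefit of a chunked, text-based response is that a client can process results as they arrive instead of buffering the whole answer. The following sketch shows the idea with plain Python generators. The framing is an assumption made for illustration: it treats the response as newline-separated fields and the leading 0 as a success status, which the slides do not spell out, and the function names are hypothetical rather than part of any AMGA client library.

```python
from typing import Iterable, Iterator


def iter_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Reassemble complete text lines from arbitrarily split byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode("utf-8")
    if buffer:
        # Trailing data without a final newline.
        yield buffer.decode("utf-8")


def stream_response(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield response fields one at a time after checking the status line.

    Assumption: the first line is a numeric status and 0 means success.
    """
    lines = iter_lines(chunks)
    status = int(next(lines))
    if status != 0:
        raise RuntimeError(f"server returned status {status}")
    yield from lines


# Chunk boundaries deliberately fall in the middle of fields.
chunks = [b"0\nent", b"ry\nvalue1\nval", b"ue2\n"]
fields = list(stream_response(chunks))
```

Because both functions are generators, memory use depends on the size of one line rather than on the size of the whole response.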

Accessing AMGA
TCP Streaming Front-end
  – mdcli & mdclient and C++ API (md_cli.h, MD_Client.h)
  – Java Client API and command line mdjavaclient.sh & mdjavacli.sh (also under Windows)
  – Python Client API
  – AMGA Web Interface (new)
      • Developed entirely by the GILDA team (INFN CT)
      • Based on the standard AMGA Java APIs
      • Web application using standards such as JSP custom tags and servlets
SOAP Frontend (WSDL)
  – C++ gSOAP
  – AXIS (Java)
  – ZSI (Python)

AMGA Web Interface

Metadata Schema Management

Entry Management

ACL Management

QBE like Query Engine

Query Result

AMGA WI Deployment Scenario
AMGA WI can be deployed on a dedicated server, located either inside or outside the GRID network.
Currently the GILDA AMGA Server machine also hosts the web interface.
Users access the catalog through the functionality provided by the web interface, using an ordinary web browser.

Metadata Replication
Motivation
  – Scalability: support hundreds/thousands of concurrent users
  – Geographical distribution: hide network latency
  – Reliability: no single point of failure
  – DB independent replication: heterogeneous DB systems
  – Disconnected computing: off-line access (laptops)
Architecture
  – Asynchronous replication
  – Master-slave: writes only allowed on the master
  – Replication at the application level
      • Replicate Metadata commands, not SQL → DB independence
  – Partial replication: supports replication of only sub-trees of the metadata hierarchy
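
The architecture above can be sketched as a command log that the master appends to and that slaves replay later. This is a toy model under stated assumptions, not AMGA's replication code: the command names ("add", "set", "remove"), the `Master` and `Slave` classes, and the path-prefix test used for partial replication are all hypothetical.

```python
from typing import Dict, List, Tuple

# (operation, entry path, attributes); the operation names are invented.
Command = Tuple[str, str, Dict[str, str]]
Store = Dict[str, Dict[str, str]]


def apply_command(store: Store, command: Command) -> None:
    """Apply one metadata command to a store, whatever its backend would be."""
    op, path, attrs = command
    if op == "add":
        store[path] = dict(attrs)
    elif op == "set":
        store[path].update(attrs)
    elif op == "remove":
        store.pop(path, None)
    else:
        raise ValueError(f"unknown operation: {op}")


class Master:
    """Only the master accepts writes; each one is logged for the slaves."""

    def __init__(self) -> None:
        self.store: Store = {}
        self.log: List[Command] = []

    def execute(self, command: Command) -> None:
        apply_command(self.store, command)
        self.log.append(command)


class Slave:
    """Replays logged commands, optionally for one sub-tree only."""

    def __init__(self, subtree: str = "/") -> None:
        self.subtree = subtree  # end with "/" to avoid matching sibling names
        self.store: Store = {}
        self.position = 0  # index of the next log record to replay

    def sync(self, master: Master) -> None:
        for command in master.log[self.position:]:
            if command[1].startswith(self.subtree):
                apply_command(self.store, command)
        self.position = len(master.log)


master = Master()
master.execute(("add", "/audio/a", {"Author": "X"}))
master.execute(("add", "/video/v", {"Title": "T"}))

audio_slave = Slave("/audio/")
audio_slave.sync(master)

# A later write is invisible to the slave until its next sync (asynchronous).
master.execute(("set", "/audio/a", {"Album": "Y"}))
stale_copy = dict(audio_slave.store["/audio/a"])
audio_slave.sync(master)
```

Because the log holds metadata commands rather than SQL statements, a slave could replay it against a different database backend than the master uses, which is the DB independence the slide refers to.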

Metadata Replication: some use cases
  – Full replication
  – Partial replication
  – Federation
  – Proxy

Early adopters of AMGA
LHCb-bookkeeping
  – Migrated bookkeeping metadata to ARDA prototype
      • 20M entries, 15 GB
      • Large amount of static metadata
  – Feedback valuable in improving interface and fixing bugs
  – AMGA showing good scalability
Ganga
  – Job management system
      • Developed jointly by ATLAS and LHCb
  – Uses AMGA for storing information about job status
      • Small amount of highly dynamic metadata

Biomed
Medical Data Manager (MDM)
  – Store and access medical images and associated metadata on the Grid
  – Built on top of gLite 1.5 data management system
  – Demonstrated at last EGEE conference (October 05, Pisa)
Strong security requirements
  – Patient data is sensitive
  – Data must be encrypted
  – Metadata access must be restricted to authorized users
AMGA used as metadata server
  – Demonstrates authentication and encrypted access
  – Used as a simplified DB
More details at –

Conclusion
AMGA: Metadata Service of gLite
  – Part of gLite (not yet certified in gLite 3.0; certification is planned for the 3.1 release)
  – Useful for simplified DB access
  – Integrated into the Grid environment (Security)
Replication/Federation features
Tests show good performance/scalability
Already deployed by several Grid Applications
  – LHCb, ATLAS, Biomed, …
  – AMGA WI, gMOD, gLibrary (presented next)
AMGA Web Site /

Any questions?
Thanks for your attention.