Practicals on R-GMA Valeria Ardizzone INFN

Slides:



Advertisements
Similar presentations
21 Sep 2005LCG's R-GMA Applications R-GMA and LCG Steve Fisher & Antony Wilson.
Advertisements

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification.
1 Introduction to XML. XML eXtensible implies that users define tag content Markup implies it is a coded document Language implies it is a metalanguage.
Database Systems More SQL Database Design -- More SQL1.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Introduction to R-GMA: Relational Grid Monitoring Architecture.
Fourth EELA Tutorial for Managers and Users E-infrastructure shared between Europe and Latin America Hands-on on Information System (R-GMA)
LSC Segment Database Duncan Brown Caltech LIGO-G Z.
CSE314 Database Systems More SQL: Complex Queries, Triggers, Views, and Schema Modification Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E Pearson.
Introduction on R-GMA Shi Jingyan Computing Center IHEP.
By Lecturer / Aisha Dawood 1.  You can control the number of dispatcher processes in the instance. Unlike the number of shared servers, the number of.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America Information System (IS) Valeria Ardizzone.
Application code Registry 1 Alignment of R-GMA with developments in the Open Grid Services Architecture (OGSA) is advancing. The existing Servlets and.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks R-GMA Now With Added Authorization Steve.
6 1 Lecture 8: Introduction to Structured Query Language (SQL) J. S. Chou, P.E., Ph.D.
An information and monitoring system for static and dynamic information about grid resources, applications, networks … RDBMS Servlet aware of API during.
Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.
EGEE is a project funded by the European Union under contract IST R-GMA: Production Services for Information and Monitoring in the Grid John.
Chapter 4: SQL Complex Queries Complex Queries Views Views Modification of the Database Modification of the Database Joined Relations Joined Relations.
WP3 RGMA Deployment Laurence Field / RAL Steve Fisher / RAL.
Grid Computing Environment Shell By Mehmet Nacar Las Vegas, June 2003.
INFSO-RI Enabling Grids for E-sciencE
E-infrastructure shared between Europe and Latin America FP6−2004−Infrastructures−6-SSA gLite Information System Pedro Rausch IF.
20 Copyright © 2008, Oracle. All rights reserved. Cache Management.
INFSO-RI Enabling Grids for E-sciencE Information System Valeria Ardizzone INFN EGEE NA4 Generic Applications Meeting Catania,
Module 6: Administering Reporting Services. Overview Server Administration Performance and Reliability Monitoring Database Administration Security Administration.
Aggregator  Performs aggregate calculations  Components of the Aggregator Transformation Aggregate expression Group by port Sorted Input option Aggregate.
EGEE is a project funded by the European Union under contract IST Information and Monitoring Services within a Grid R-GMA (Relational Grid.
FESR Trinacria Grid Virtual Laboratory Relational Grid Monitoring Architecture (R-GMA) Valeria Ardizzone INFN Catania Tutorial per Insegnanti.
INFSO-RI Enabling Grids for E-sciencE R-GMA Gergely Sipos and Péter Kacsuk MTA SZTAKI Credit to Valeria Ardizzone.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Practical using R-GMA.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America R-GMA Practicals Claudio Cherubino INFN.
INFSO-RI Enabling Grids for E-sciencE gLite Information System: R-GMA Tony Calanducci INFN Catania gLite tutorial at the EGEE User.
CERN 21 January 2005Piotr Nyczyk, CERN1 R-GMA Basics and key concepts Monitoring framework for computing Grids – developed by EGEE-JRA1-UK, currently used.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Introduction to R-GMA: Relational Grid Monitoring Architecture.
 CONACT UC:  Magnific training   
Relational Grid Monitoring Architecture (R-GMA)
2 Copyright © 2008, Oracle. All rights reserved. Building the Physical Layer of a Repository.
Introduction to Algorithm. What is Algorithm? an algorithm is any well-defined computational procedure that takes some value, or set of values, as input.
More SQL: Complex Queries, Triggers, Views, and Schema Modification
Top 10 Entity Framework Features Every Developer Should Know
Doron Orbach UCMDB Product Manager
Fundamental of Databases
Running a Forms Developer Application
More SQL: Complex Queries,
Grid Event Management Using R-GMA Monitoring Framework
R-GMA Command Line Tool
Information System Valeria Ardizzone INFN
The Information System
Working in the Forms Developer Environment
What are they? The Package Repository Client is a set of Tcl scripts that are capable of locating, downloading, and installing packages for both Tcl and.
Hands-on on R-GMA Tony Calanducci INFN Catania
Tutorial 5: Working with Excel Tables, PivotTables, and PivotCharts
Prepared by : Moshira M. Ali CS490 Coordinator Arab Open University
R-GMA as an example of a generic framework for information exchange
Introduction to Triggers
Quiz Questions Q.1 An entity set that does not have sufficient attributes to form a primary key is a (A) strong entity set. (B) weak entity set. (C) simple.
SQL: Advanced Options, Updates and Views Lecturer: Dr Pavle Mogin
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Database System Concepts and Architecture.
System And Application Software
Chapter 2: System Structures
SDMX Reference Infrastructure Introduction
More SQL: Complex Queries, Triggers, Views, and Schema Modification
R-GMA (Relational Grid Monitoring Architecture) for monitoring applications “s” gLite and LCG.
Introduction of Week 11 Return assignment 9-1 Collect assignment 10-1
Information and Monitoring System
Chapter 10 ADO.
Contents Preface I Introduction Lesson Objectives I-2
RELATIONAL GRID MONITORING ARCHITECHTURE
Overview Multimedia: The Role of WINS in the Network Infrastructure
SDMX IT Tools SDMX Registry
Presentation transcript:

Practicals on R-GMA Valeria Ardizzone INFN First gLite tutorial on GILDA, Catania, 13-15.06.2005

Overview First gLite tutorial on GILDA, Catania, 13-15.06.2005

R-GMA Virtual Database R-GMA models the information infrastructure of a Grid as a set of Consumers, who request information Producers, who provide information and a single Registry, which mediates the communication between producers and consumers. R-GMA presents the information resources of a Virtual Organisation (VO) as a single virtual database containing a set of virtual tables. As the picture above shows, a single schema contains the name and structure (column names, types and settings) of each virtual table in the system. A single registry contains a list, for each table, of producers who have oered to publish (provide data for) rows for the table. A consumer runs an SQL query against a table, and the registry selects the best producers to answer the query in a process called mediation. The consumer then contacts each producer directly, combines the information, and returns a set of tuples. The mediation process is hidden from the user. Note that there is no central repository holding the contents of the virtual table; it is in this sense, that the database is virtual. Each virtual table has a key column (or group of columns) declared in the schema. First gLite tutorial on GILDA, Catania, 13-15.06.2005

FUNCTIONALITY NOT YET IMPLEMENTED The API’s in the current release of R-GMA are wrappers on top of the old EDG API’s and servlets, so not all new R-GMA functionality is yet available. Specifically, the following functions are not available and will raise an appropriate exception if you try to use them: FILE storage for producers LATEST/HISTORY producers with non-DATABASE storage Combined LATEST and HISTORY producers HISTORY and LATEST retention periods (functionality is approximated) Multiple VOs (just pass NULL for the VO lists) TableAuthorization (just pass NULL) Schema replication is also not yet implemented. First gLite tutorial on GILDA, Catania, 13-15.06.2005

R-GMA Command Line Tool The R-GMA command line tool provides simple shell-like access to the R-GMA distribuited information and monitoring system. To Start the R-GMA command line tool, run the simple command “rgma” on your $home: [glite-tutor] /home/vardizzo > rgma Welcome to the R-GMA virtual database for Virtual Organisations. You are connected to the following R-GMA registry services: https://rgmasrv.ct.infn.it:8443/R-GMA/RegistryServlet Type "help" for a list of commands. rgma> First gLite tutorial on GILDA, Catania, 13-15.06.2005

The command line tool can be used in batch mode in two ways: rgma -c <command> Executes <command> and exits. The -c option may be specified more than once in which case the commands are executed in the order they appear on the command line. rgma -f <file> Executes commands in <file> sequentially then exits. Each line should contain one command. First gLite tutorial on GILDA, Catania, 13-15.06.2005

INFORMATION COMMANDS Displays help for a specific command rgma>help <command> Exit the R-GMA command line rgma>exit or quit To show a list of all table names: rgma> SHOW TABLES To show information about a table MyTable: rgma> DESCRIBE MyTable To show a table of properties for the current session: rgma> SHOW PROPERTIES To show a list of all R-GMA producers that produce the table MyTable: rgma> SHOW PRODUCERS OF MyTable First gLite tutorial on GILDA, Catania, 13-15.06.2005

Query Consumer The subset of SQL understood by R-GMA is that defined by the SQL92 Entry Level standard. The algorithm used by the mediator makes a distinction between simple and complex queries, as defined below. Continuous queries must always be simple. Latest and history queries may be complex, and any SQL command processor used by an R-GMA producer that supports latest or history queries, must support complex queries. First gLite tutorial on GILDA, Catania, 13-15.06.2005

Simple Query A simple query is defined SQL SELECT statement without the following: Joins: only one table is allowed DISTINCT, HAVING, GROUP BY, ORDER BY, UNION, INTERSECT, MINUS Aggregation operators such as AVG, MIN, MAX, COUNT, SUM Calculations, other than numerical ones involving only the operators: +, -, *, / and ** WHERE clauses involving operators other than: =, >, <, >=, <=, and AND First gLite tutorial on GILDA, Catania, 13-15.06.2005

Complex Query A complex query is anything permitted by the SQL92 Entry Level standard. In the short term, R-GMA will not support this type of features. First gLite tutorial on GILDA, Catania, 13-15.06.2005

QUERYING DATA Querying data uses the standard SQL SELECT statement, e.g. rgma> SELECT * FROM ServiceStatus The behaviour of SELECT varies according to the type of query being executed. In R-GMA there are three basic types of query: LATEST Queries only the most recent tuple for each primary key HISTORY Queries all historical tuples for each primary key CONTINUOUS Returns tuples continuously as they are inserted The type of query can be changed using the SET QUERY command: rgma> SET QUERY LATEST rgma> SET QUERY CONTINUOUS The current query type can be displayed using rgma> SHOW QUERY First gLite tutorial on GILDA, Catania, 13-15.06.2005

Maximum AGE of tuples The maximum age of tuples to return can also be controlled. By default there is no limit. To limit the age of latest or historical tuples use the SET MAXAGE command which takes a value and a time unit (seconds, minutes, hours, days - default is seconds). The following are equivalent: rgma> SET MAXAGE 2 minutes rgma> SET MAXAGE 120 The current maximum tuple age can be displayed using rgma> SHOW MAXAGE To disable the maximum age, set it to none: rgma> SET MAXAGE none In the case of a continuous query, the MAXAGE parameter has a slightly different effect. If a maximum age is specified, the query will initially return a history of matching tuples up to the specified maximum age. It will then return new tuples as they are inserted. First gLite tutorial on GILDA, Catania, 13-15.06.2005

Query Timeout The final property affecting queries is the timeout. This controls how long the query will execute for before exiting automatically. For a latest or history query the timeout exists to prevent a problem (e.g. network failure) from stopping the query from completing. For a continuous query, the timeout indicates how long the query will continue to return new tuples. The default timeout is 1 minute and it can be changed using rgma> SET TIMEOUT 3 minutes rgma> SET TIMEOUT 180 The current timeout can be displayed using rgma>SHOW TIMEOUT First gLite tutorial on GILDA, Catania, 13-15.06.2005

INSERTING DATA The SQL INSERT statement may be used to add data to the system: rgma> INSERT INTO Table VALUES (’a’, ’b’, ’c’, ’d’) In R-GMA, data is inserted into the system using a Producer component which handles the INSERT statement. Using the command line tool you may work with one producer at a time. If you change the properties of the producer, a new one is created and all INSERT statements will go to the new producer. Producers may be configured to answer the three different types of query. All producers can answer continuous queries, but ability to answer latest and history queries is optional and is changed using : rgma> SET PRODUCER latest rgma> SET PRODUCER latest history rgma> SET PRODUCER continuous First gLite tutorial on GILDA, Catania, 13-15.06.2005

Producer The current producer type can be displayed using: rgma>show producer For a producer that can answer latest and/or history queries, tuples must be deleted after some time interval to prevent the system becoming full of old data. This is controlled by the latest and history retention periods for a producer. The producer will return tuples to Latest queries provided they are younger than the latest retention period (LRP), and similarly for history queries and the history retention period (HRP). rgma> SET PRODUCER latestretentionperiod 30 minutes rgma> SET PRODUCER historyretentionperiod 2 hours To show the current producer HRP/LRP use: rgma>SHOW PRODUCER HRP rgma> SHOW PRODUCER LRP. First gLite tutorial on GILDA, Catania, 13-15.06.2005

Secondary Producer Using the R-GMA command line you can work with one secondary producer at a time, which can consume from multiple tables. If the properties of the secondary producer are changed a new one is created, and is automatically set up to consume from the same tables as the previous one. To instruct the secondary producer to consume from the table MyTable: rgma> SECONDARYPRODUCER MyTable Like the producer, the secondary producer may be configured to answer latest and/or history queries: rgma> SET SECONDARYPRODUCER latest (By default the secondary producer can answer both latest and history queries. ) The current secondary producer type can be displayed using: rgma> SHOW SECONDARYPRODUCER First gLite tutorial on GILDA, Catania, 13-15.06.2005

Some Esercises Show all tables that are present in the Schema: show tables Choose one table where you want insert your information and see the its attribute: describe <NameTable> Show which producer is enable at this moment for your type query and set it to latest query: show producer Some examples of query: select * from Site select SysAdminContact,UserSupportContact from Site select Name from Site where Latitude > 0 select Endpoint, Type from Site,Service where Site.Name = Service.Site_Name and Latitude > 0 First gLite tutorial on GILDA, Catania, 13-15.06.2005

Producer The producer service is a process running on a server on behalf of the user code; Each tuple published by a primary producer carries a time-stamp, added by the producer, which, together with the key columns, is similar to a primary key for the table. In a Primary Producer, the user code periodically inserts tuples into storage maintained internally by the Primary Producer service. The producer service autonomously answers consumer queries from this storage. The Secondary producer service also answers queries from its internal storage, but it populates this storage itself by running its own query against the virtual table: the user code only sets the process running; the tuples come from other producers. Each virtual table has a key column (or group of columns) declared in the schema. Each tuple published by a primary producer also carries a time-stamp, added by the producer, which, together with the key columns, is similar to a primary key for the table. Tuples with the same key, but dierent values for the time-stamp, can also be thought of as dierent versions of the same tuple. First gLite tutorial on GILDA, Catania, 13-15.06.2005

Consumer In R-GMA, each consumer represents a single SQL SELECT query on the virtual database. In R-GMA, each consumer represents a single SQL SELECT query on the virtual database. The request is initiated by user code, but the consumer service carries out all of the work on its behalf. The query is first passed to the Registry to identify which producers, for each virtual table in the query, must be contacted to answer it. This process is called mediation, and the algorithm used in R-GMA is explained in section 6.2. The query is then passed by the consumer service to each relevant producer, to obtain the answer tuples directly. First gLite tutorial on GILDA, Catania, 13-15.06.2005

Web Services Architecture R-GMA provides APIs for Java, C++, C and Python languages, to make it easier for user applications to interact with the R-GMA services. Each operation of each service is represented by a method (or function) in the API, and that method simply packages up its parameters into a SOAP message and sends it to the Web Service for execution. Any return values or errors (exceptions) are passed back to the caller. The API transparently manages any authentication required by the server, and looks after the resource identifier. Direct interaction between a user application and the R-GMA Web Service (by-passing the API) is also supported. R-GMA conforms to the Web Services Architecture [3]. It consists of six principal services (Primary Producer, Secondary Producer, On-demand Producer, Consumer, Registry and Schema), running on one or more servers. Each service has a well defined set of operations (such as a producer’s startStreaming operation) which it can carry out, as specified in a machine-readable XML document conforming to the Web Services Description Language (WSDL [4]). There is one WSDL document for each service. All operations are requested by applications through an exchange of messages with the service. The sequence and format of messages required for each operation, including any parameters, is also specified in the WSDL. R-GMA uses “SOAP messaging over http/s” [5], in a request/response pattern, for all user-to-service and service-to-service communications apart, from streaming. As the picture shows, if the user chooses to use the API (described below), the SOAP interface is hidden from them. Apache AXIS software implements the SOAP interface on the server. Streaming tuples to continuous consumers uses a lower lever communication, for eciency. First gLite tutorial on GILDA, Catania, 13-15.06.2005