Published by Maria Sutton. Modified over 9 years ago.
1 / 18
Federal University of Rio de Janeiro, COPPE/UFRJ
Author: Wladimir S. Meyer, Doctorate Student
Advisors: Jano Moreira de Souza, Ph.D.; Milton Ramos Ramirez, D.Sc.
2 / 18
Summary
- Introduction
  - Motivation
  - Objectives
  - Related Works
- Framework Description
  - Structure
  - Functioning
  - New functionalities added to Secondo
- The Case Study
- Final Considerations
3 / 18
Introduction
Motivation
- The challenge of integrating spatial databases spread across a computational grid
Objectives
- Add new functionalities to an extensible SDBMS so that it can act as a platform for studying distributed spatial databases in computational grids. This platform should:
  - Be capable of interacting (by itself) with other analogous platforms in a grid
  - Offer some levels of transparency [Özsu and Valduriez 1999]:
    - Data independence
    - Network transparency
    - Replication transparency
  - Be modular, so that work can focus only on the experiments being developed
  - Be capable of exchanging “specialized skills” (algebras in this case)
4 / 18
Introduction
Related Works
- The GGF Data Access and Integration Services Working Group (GGF-DAIS-WG) produces many recommendations related to databases in grids [OGSA-DAI-WSRF 05].
  - They are a set of interfaces and services to be implemented outside the DBMS environment
  - Only relational, XML and file system data models are supported
- The OGSA-DAI project implements many of the DAIS-WG recommendations and offers a Java toolkit for clients
- The OGSA-DQP project [Smith et al. 2002] uses OGSA-DAI to support distributed queries over a grid. Only relational databases benefit, and it does not support the newer WSRF-based release of OGSA-DAI.
5 / 18
Framework Description - Structure
The framework is composed of:
- A spatial DBMS (*): Secondo [Dieker and Güting 2000] was adopted because of its modularity, formalism and extensibility. It was originally intended for experimental purposes with spatial and spatio-temporal data models [Güting et al. 2004].
- A grid middleware: it offers several services that are used by the SDBMS [Foster 2005]:
  - Job Manager Service (GRAM)
  - Reliable File Transfer Service (RFT)
  - Index Service (MDS)
  Globus Toolkit 4 was chosen because of its web service approach and its set of powerful components.
- A set of tools, added to provide extra functionalities such as:
  - Submitting queries to a set of servers
  - Discovering an algebra in another Secondo, based on algebra description files
  - Importing an algebra
(*) when used with its spatial algebra
6 / 18
Framework Description - Functioning
[Diagram: Central Index Service (MDS); Secondo #1, #2, #3 and #4. Labels: QUERY; Request; Response; Global Schema & Fragments’ map; Algebras’ Description file; Global Schema; Fragments’ map.]
7 / 18
Framework Description - Functioning
[Diagram: Central Index Service (MDS); Secondo #1, #2, #3 and #4. Labels: QUERY; Request; Servers’ status; Same fragments; Global Schema; Fragments’ map.]
8 / 18
Framework Description - Functioning
[Diagram: Central Index Service (MDS); Secondo #1, #2, #3 and #4. Labels: QUERY; Responses; Global Schema; Fragments’ map.]
Each of the two responses shown lists:
- CPU load
- Total amount of memory
- Total amount of free memory
- Number of running processes
- Number of active processes
- Number of users logged in
- Total amount of free space on the hard disk
9 / 18
Framework Description - Functioning
[Diagram: Central Index Service (MDS); Secondo #1, #2, #3 and #4. Labels: QUERY; Send subqueries; Global Schema; Fragments’ map.]
- Secondo #1 generates a job description file and a Secondo command file and submits them to the selected nodes using GRAM
- The job description file can express a multijob, meaning for example that the result of one query must be transferred to another server to be used in a second step.
10 / 18
Framework Description - Functioning
[Diagram: Central Index Service (MDS); Secondo #1, #2, #3 and #4. Labels: QUERY; Results as nested lists (RFT); Global Schema; Fragments’ map.]
11 / 18
Framework Description - Functioning
[Diagram: Central Index Service (MDS); Secondo #1, #2, #3 and #4. Labels: Result; Global Schema; Fragments’ map.]
- The returned results are aggregated to form a global result
12 / 18
Framework Description - New functionalities
Modified Secondo:
- Submit activities (jobs) to the grid
- Discover and monitor registered resources
[Diagram adapted from [Ramirez 2001]. Label: subqueries.]
13 / 18
Framework Description - New functionalities
Files generated automatically during a job submission:
- Job description file: specifies details about where and how a job must be executed
- Secondo command file: specifies a set of commands to be run in a Secondo server
Spatial select example (slide callouts: constructed with spatial algebra; R-tree algebra operators):
    open database 28433;
    create tempBox : rect;
    update tempBox := [const rect value (-48.775 -48.771 -25.331 -25.339)];
    let temp = drain_line creatertree [shape];
    query temp drain_line windowintersects [tempBox] consume;
    delete temp;
    delete tempBox;
    close database 28433;
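The automatic generation of a Secondo command file could look roughly like the sketch below. It only reproduces the command sequence of the spatial select example; the function name and parameters are hypothetical, and the operator is spelled `windowintersects` as in the algebras’ description file shown later in the deck.

```python
def build_window_query(database, relation, attribute, box):
    """Return the text of a Secondo command file for one window query.

    `box` holds four coordinates, written in the same order as the
    `[const rect value (...)]` literal of the slide example.
    This helper is illustrative; it is not part of Secondo.
    """
    coords = " ".join(str(c) for c in box)
    commands = [
        f"open database {database};",
        "create tempBox : rect;",
        f"update tempBox := [const rect value ({coords})];",
        # Build a temporary R-tree over the geometry attribute.
        f"let temp = {relation} creatertree [{attribute}];",
        # Use the R-tree to fetch the tuples intersecting the box.
        f"query temp {relation} windowintersects [tempBox] consume;",
        # Remove the temporary objects again.
        "delete temp;",
        "delete tempBox;",
        f"close database {database};",
    ]
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    print(build_window_query("28433", "drain_line", "shape",
                             (-48.775, -48.771, -25.331, -25.339)))
```

Keeping the generator a pure function that returns text makes it easy to write the result to a file such as commands.txt before the job is submitted.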
14 / 18
The Case Study
To validate the proposed framework, a geographic database prototype is being built as follows:
Composition:
- 4 computers running Fedora Linux as grid nodes
- All machines running GT4 with the GRAM, MDS and RFT services
- All machines running a modified Secondo (Secondo-grid)
Distributed spatial database design:
- The fragments can be replicated
- All themes belong to the same region
- Federated architecture with a Global Schema
- Thematic fragmentation
15 / 18
The Case Study
- Autonomy: moderate, because each Secondo must update the global schema and the fragments’ map when necessary
- Nature of data: cartographic data supplied by the Directory of Geographic Service (Brazilian Army)
- Queries being implemented: spatial select and spatial join
16 / 18
Final Considerations
- This framework is being developed as a platform for experimental purposes: performance is not its main focus
- Many issues are outside the scope of the present work and will be covered in future work: transaction control, an optimizer for distributed queries, security, etc.
- Modules of the framework that are running now:
  - Registering and monitoring modules: based on the global schema, the fragments’ map, the servers’ status monitor and the algebras’ description file
  - Automatic generation of files: job description file and Secondo command file
  - Submission of single queries with GRAM clients
17 / 18
Final Considerations
Next steps:
- Complete the data transfer module using RFT
- Implement multijob submission with complex queries
- Complete the infrastructure to import algebras
18 / 18
Thank you!
19 / 18
The Case Study
20 / 18
Framework Description - New functionalities
Files that are generated automatically during a job submission:
- Job description file: a file that specifies details about where and how a job must be executed
[XML listing; the element tags were not preserved. Values in order of appearance:]
    SecondoTTYBDB
    ${GLOBUS_USER_HOME}/secondo/bin
    commands.txt
    ${GLOBUS_USER_HOME}/secondo/bin/results.txt
    ${GLOBUS_USER_HOME}/stderr
    gsiftp://brasilia.gridbd.cos.ufrj.br:2888/${GLOBUS_USER_HOME}/secondo/bin/commands.txt
    file:///${GLOBUS_USER_HOME}/secondo/bin/commands.txt
    file://${GLOBUS_USER_HOME}/secondo/bin/results.txt
    gsiftp://submit.host:2888/${GLOBUS_USER_HOME}/secondo/bin/results-srv1.txt
    file://${GLOBUS_USER_HOME}/secondo/bin/results.txt
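Because the slide’s XML tags did not survive, the following is only a sketch of how such a job description might be generated. The element names (`job`, `executable`, `directory`, `stdin`, `stdout`, `stderr`, `fileStageIn`, `fileStageOut`, `fileCleanUp`, `transfer`, `sourceUrl`, `destinationUrl`, `deletion`, `file`) follow the GT4 WS-GRAM job description format as generally documented; mapping commands.txt to `stdin` and the assignment of each URL to a staging section are assumptions, not taken from the slide.

```python
import xml.etree.ElementTree as ET


def build_job_description(executable, directory, command_file, stdout, stderr,
                          stage_in, stage_out, cleanup):
    """Build a GRAM-style job description as an XML string.

    stage_in / stage_out are lists of (source_url, destination_url) pairs;
    cleanup is a list of file URLs to delete after the job.
    Feeding the command file through <stdin> is an assumption.
    """
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    ET.SubElement(job, "directory").text = directory
    ET.SubElement(job, "stdin").text = command_file
    ET.SubElement(job, "stdout").text = stdout
    ET.SubElement(job, "stderr").text = stderr

    for tag, pairs in (("fileStageIn", stage_in), ("fileStageOut", stage_out)):
        section = ET.SubElement(job, tag)
        for source, destination in pairs:
            transfer = ET.SubElement(section, "transfer")
            ET.SubElement(transfer, "sourceUrl").text = source
            ET.SubElement(transfer, "destinationUrl").text = destination

    clean = ET.SubElement(job, "fileCleanUp")
    for url in cleanup:
        deletion = ET.SubElement(clean, "deletion")
        ET.SubElement(deletion, "file").text = url

    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    home = "${GLOBUS_USER_HOME}"
    print(build_job_description(
        executable="SecondoTTYBDB",
        directory=f"{home}/secondo/bin",
        command_file="commands.txt",
        stdout=f"{home}/secondo/bin/results.txt",
        stderr=f"{home}/stderr",
        stage_in=[(f"gsiftp://brasilia.gridbd.cos.ufrj.br:2888/{home}/secondo/bin/commands.txt",
                   f"file:///{home}/secondo/bin/commands.txt")],
        stage_out=[(f"file://{home}/secondo/bin/results.txt",
                    f"gsiftp://submit.host:2888/{home}/secondo/bin/results-srv1.txt")],
        cleanup=[f"file://{home}/secondo/bin/results.txt"],
    ))
```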
21 / 18
Framework Description - Resources registered
There are two resources registered, as XML files, in a Central MDS service:
- A Global Schema:
[XML listing; the element tags were not preserved. Values in order of appearance: 28433NE, 1, drainage_line, geoData, line, 1, nome, string, 1.]
22 / 18
Framework Description - Resources registered
- A map of fragments’ locations
[XML listing; the element tags were not preserved. Values in order of appearance:]
    28433NE
    hydrography: rio.cos.ufrj.br, recife.cos.ufrj.br
    vegetation: brasilia.cos.ufrj.br
    edification: vitoria.cos.ufrj.br, brasilia.cos.ufrj.br
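Once parsed, the fragments’ map can be treated as a plain lookup table. The sketch below assumes that each theme name in the listing is followed by the hosts that store that fragment, which is an interpretation of the flattened slide text; the function names are hypothetical.

```python
# Fragments' map for database 28433NE, as read from the slide.
# Assumption: every theme is followed by the hosts holding a replica of it.
FRAGMENTS_MAP = {
    "28433NE": {
        "hydrography": ["rio.cos.ufrj.br", "recife.cos.ufrj.br"],
        "vegetation": ["brasilia.cos.ufrj.br"],
        "edification": ["vitoria.cos.ufrj.br", "brasilia.cos.ufrj.br"],
    }
}


def hosts_for(fragments_map, database, theme):
    """Return the hosts that store the given thematic fragment."""
    return list(fragments_map.get(database, {}).get(theme, []))


def replicated_themes(fragments_map, database):
    """Return the themes stored on more than one host, sorted by name."""
    themes = fragments_map.get(database, {})
    return sorted(name for name, hosts in themes.items() if len(hosts) > 1)


if __name__ == "__main__":
    print(hosts_for(FRAGMENTS_MAP, "28433NE", "hydrography"))
    print(replicated_themes(FRAGMENTS_MAP, "28433NE"))
```

Only the replicated themes require a choice between servers, which is where the monitored server status comes into play.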
23 / 18
Framework Description - Resources registered
Each SDBMS server should register an “algebras’ description file” that specifies all its algebras. This is an XML file with the following format:
[XML listing; the element tags were not preserved. Values in order of appearance:]
    rio.cos.ufrj.br
    spatial
    /usr/local/secondo/Algebras/Spatial
    SpatialAlgebra.h, SpatialAlgebra.cpp, SpatialAlgebra.spec, makefile
    point, points, region, line
    intersects, inside, touches, atached, overlaps, ininterior, intersection
    rtree
    data structure r-tree b-tree
    /usr/local/secondo/Algebras/RTree
    RTreeAlgebra.h, RTreeAlgebra.cpp, RTreeAlgebra.spec, makefile
    rtree
    creatertree, windowintersects, insertrtree, deletertree, updatertree
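Algebra discovery based on these description files can be pictured as a search over the registered records. The sketch below is one possible shape for it; the record type, field names and function are hypothetical, and the sample data only repeats values visible on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgebraDescription:
    """One algebra entry of a server's description file (hypothetical layout)."""
    host: str
    name: str
    path: str
    types: tuple
    operators: tuple


# Values copied from the slide for the server rio.cos.ufrj.br.
REGISTERED = [
    AlgebraDescription(
        host="rio.cos.ufrj.br",
        name="spatial",
        path="/usr/local/secondo/Algebras/Spatial",
        types=("point", "points", "region", "line"),
        operators=("intersects", "inside", "touches", "atached",
                   "overlaps", "ininterior", "intersection"),
    ),
    AlgebraDescription(
        host="rio.cos.ufrj.br",
        name="rtree",
        path="/usr/local/secondo/Algebras/RTree",
        types=("rtree",),
        operators=("creatertree", "windowintersects", "insertrtree",
                   "deletertree", "updatertree"),
    ),
]


def find_algebras_with_operator(descriptions, operator):
    """Return (host, algebra name, path) for every algebra offering the operator."""
    return [(d.host, d.name, d.path)
            for d in descriptions if operator in d.operators]


if __name__ == "__main__":
    print(find_algebras_with_operator(REGISTERED, "windowintersects"))
```

A server that lacks an operator could use such a lookup to find out which remote Secondo holds the source files of the algebra it would need to import.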
24 / 18
Framework Description - Monitoring status
It is possible to use MDS to provide any kind of information related to a resource. In this framework all servers should be monitored (as resources) to permit a better choice among the machines that contain replicas of a fragment. A script was developed to collect the following information:
- CPU load
- Total amount of memory
- Total amount of free memory
- Number of running processes
- Number of active processes
- Number of users logged in
- Total amount of free space on the hard disk
The results are exposed by MDS as an XML file
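The original monitoring script is not shown in the deck. As a rough illustration of the idea, the sketch below gathers a subset of the listed metrics (CPU load, total and free memory, free disk space) with the Python standard library and serialises them as XML. The element names, the dictionary keys and the use of /proc/meminfo (Linux only) are assumptions made for this example.

```python
import os
import shutil
import xml.etree.ElementTree as ET


def collect_status(path="/"):
    """Collect a few server metrics; entries that are unavailable are skipped."""
    status = {}

    # 1-minute load average (Unix only).
    if hasattr(os, "getloadavg"):
        try:
            status["cpu_load"] = os.getloadavg()[0]
        except OSError:
            pass

    # Free space on the file system that contains `path`.
    status["disk_free_bytes"] = shutil.disk_usage(path).free

    # Total and free memory from /proc/meminfo (Linux only).
    try:
        with open("/proc/meminfo") as meminfo:
            for line in meminfo:
                key, _, rest = line.partition(":")
                if key == "MemTotal":
                    status["mem_total_kb"] = int(rest.split()[0])
                elif key == "MemFree":
                    status["mem_free_kb"] = int(rest.split()[0])
    except (OSError, ValueError, IndexError):
        pass

    return status


def status_to_xml(host, status):
    """Serialise the metrics as a small XML document (hypothetical layout)."""
    root = ET.Element("serverStatus", host=host)
    for key, value in sorted(status.items()):
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(status_to_xml("rio.cos.ufrj.br", collect_status()))
```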
25 / 18
Framework Description - Resources registered
There are two resources registered, as XML files, in a Central MDS service:
- A Global Schema
- A map of fragments’ locations
Each SDBMS server should register an “algebras’ description file” that specifies all its algebras.
26 / 18
The Case Study
Join
1. Read the global schema
2. Read the fragments’ locations map
3. Read the resource status of the nodes holding fragments involved in the query
4. Select the nodes with the best conditions in the case of a replicated fragment
5. Break the global query into sub-queries
6. Estimate the cardinality of the sub-queries
7. Build a job description file that runs the sub-queries in an adequate order: sub-queries with smaller cardinality first
8. Submit the job to GRAM
9. Transfer the results of these first sub-queries to the nodes where the last stage of the query should be executed as a local query in an SDBMS environment (naive approach).
10. Transfer the final results to the original node and delete all temporary files.
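Steps 6, 7 and 9 can be sketched as a small planning function. The sketch assumes that cardinality estimates are already available and, as a simplification that the slides do not state, runs the last stage on the node that produced the largest intermediate result, so that only the smaller results are moved. All names and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubQuery:
    host: str                   # node that executes the sub-query
    command: str                # placeholder for the Secondo command text
    estimated_cardinality: int  # estimated number of result tuples


def plan_join(subqueries):
    """Order sub-queries by estimated cardinality and plan result transfers.

    Returns (ordered sub-queries, host of the final stage, transfers),
    where transfers is a list of (source host, destination host) pairs.
    Choosing the host of the largest result for the final stage is an
    assumption of this sketch.
    """
    if not subqueries:
        raise ValueError("at least one sub-query is required")
    # Step 7: sub-queries with smaller cardinality come first.
    ordered = sorted(subqueries, key=lambda q: q.estimated_cardinality)
    final_host = ordered[-1].host
    # Step 9: ship the earlier results to the node running the last stage.
    transfers = [(q.host, final_host)
                 for q in ordered[:-1] if q.host != final_host]
    return ordered, final_host, transfers


if __name__ == "__main__":
    plan = plan_join([
        SubQuery("brasilia.cos.ufrj.br", "select on vegetation", 5000),
        SubQuery("rio.cos.ufrj.br", "select on hydrography", 120),
    ])
    print(plan)
```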
27 / 18
The Case Study
Select
1. Read the global schema
2. Read the fragments’ locations map
3. Read the resource status of the nodes holding fragments involved in the query
4. Select the nodes with the best conditions in the case of a replicated fragment
5. Break the global query into sub-queries
6. Generate a job description file and submit the job to GRAM
7. Receive and integrate the results to generate a global result
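Step 4 leaves open what “best conditions” means. One simple interpretation, used here only for illustration, is to prefer the replica with the lowest CPU load and to break ties by the largest amount of free memory. The function name, the status keys and the sample figures are hypothetical.

```python
def choose_nodes(fragment_hosts, status):
    """Pick one host per fragment among its replicas.

    fragment_hosts maps a fragment name to the hosts that store it;
    status maps a host to a dict with 'cpu_load' and 'free_memory'.
    Ranking by CPU load, then by free memory, is an assumed policy.
    """
    chosen = {}
    for fragment, hosts in fragment_hosts.items():
        # Only hosts with a known status can be compared.
        candidates = [h for h in hosts if h in status]
        if not candidates:
            raise LookupError(f"no monitored host stores fragment {fragment!r}")
        chosen[fragment] = min(
            candidates,
            key=lambda h: (status[h]["cpu_load"], -status[h]["free_memory"]),
        )
    return chosen


if __name__ == "__main__":
    hosts = {
        "hydrography": ["rio.cos.ufrj.br", "recife.cos.ufrj.br"],
        "vegetation": ["brasilia.cos.ufrj.br"],
    }
    # Made-up status figures, for illustration only.
    monitored = {
        "rio.cos.ufrj.br": {"cpu_load": 1.8, "free_memory": 512},
        "recife.cos.ufrj.br": {"cpu_load": 0.3, "free_memory": 256},
        "brasilia.cos.ufrj.br": {"cpu_load": 0.9, "free_memory": 1024},
    }
    print(choose_nodes(hosts, monitored))
```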