© Geodise Project, University of Southampton, 2003. Data Management in Geodise Zhuoan Jiao, Jasmin Wason & Marc Molinari { z.jiao,

Presentation transcript:

© Geodise Project, University of Southampton, Data Management in Geodise Zhuoan Jiao, Jasmin Wason & Marc Molinari { z.jiao, j.l.wason, m.molinari

Providing Data Management Services for Engineering
• Engineering design and optimisation is a computationally intensive process.
• Large quantities of data may be generated at different locations with different characteristics.
• Engineering data is traditionally stored in flat files with little descriptive metadata provided by the file system.
• Our focus is on leveraging existing database tools not commonly used in engineering …
• … and making them accessible to users of the system.

Tools and Services (1)
• File storage
  • Applications can archive data sent over GridFTP in file systems, with the benefits of:
    • Accessibility by a larger community (via authorisation)
    • Storage capacity
    • Additional metadata storage and query facilities
• Metadata management service
  • The data can be stored with additional descriptive information detailing standard metadata (e.g. file format, description) and application-domain-specific metadata (e.g. grids, flux_order).
  • An XML database is used as it is flexible enough to store nested, complex engineering data.

Tools and Services (2)
• Query service
  • Queries can be performed over the metadata database to help the user locate required data intuitively and efficiently.
• Authorisation service
  • Access rights to data can be granted to an authenticated user based on information stored in an authorisation database.
• Location service
  • Files are referenced with a unique handle.
  • The location service provides access to a database of file locations mapped to handles.
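The idea behind the location service can be illustrated with a minimal Python sketch (not the Geodise implementation). It assumes an in-memory dictionary in place of the database, a handle built from the file name plus a random UUID (a format inferred from the example handles shown later in the slides), and a hypothetical `gsiftp://example.org/...` location.

```python
import uuid


class LocationService:
    """Toy stand-in for a database of file locations keyed by handle."""

    def __init__(self):
        self._locations = {}

    def register(self, filename, location):
        # Assumed handle format: file name with dots replaced, plus a UUID.
        handle = f"{filename.replace('.', '_')}_{uuid.uuid4()}"
        self._locations[handle] = location
        return handle

    def locate(self, handle):
        # Raises KeyError for an unknown handle.
        return self._locations[handle]


service = LocationService()
# Hypothetical storage location, for illustration only.
handle = service.register("input.dat", "gsiftp://example.org/archive/input.dat")
print(handle)
print(service.locate(handle))
```

Because the handle, not the path, identifies the file, the stored location could change without affecting anything that refers to the file by handle.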

Data Management Implementation for MATLAB
To increase the usability of file and metadata management services for engineers, we have implemented a MATLAB toolkit for archiving, querying and retrieving data to and from a Geodise repository.

Geodise Database Toolkit for MATLAB – Archive
• gd_archive – Store a file with some metadata.
• gd_datagroup – A datagroup is a collection of related files that may be logically grouped together; it can also have associated metadata.
• Syntax:
    groupID = gd_datagroup(, [ ])
    fileID = gd_archive(,[ ],[ ])
• Examples:
    m.dimension = '2D';
    m.component.gamma = 1.4;
    groupID = gd_datagroup('2D-LP turbine rotor job9', m)
    meta.grids = 1
    meta.flux_order = 2
    fileID = gd_archive('input.dat', meta, groupID)
    fileID = gd_archive('mesh_ns.grid.1.adf', [], groupID)
    fileID = gd_archive('airfoil.msh')

XML Toolbox for MATLAB
• Marc Molinari – GEM project.
• xml_format(): Convert a MATLAB variable into an XML string.
• xml_parse(): Convert an XML string into a MATLAB variable.
Example:
    >> A.b = 'Hello World';
    >> A.c.aa = [1 2; 3 4; 5 6];
    >> X = xml_format(A)
    X = Hello World
    >> Y = xml_parse(X);
    >> str = Y.b
    str = Hello World
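The round trip between a structure and an XML string can be sketched in Python as follows. This is only an analogue of `xml_format()` and `xml_parse()`, not the XML Toolbox itself: the element layout (`struct`, `char`, `double` tags with a `name` attribute) is an assumption made for illustration, and matrices are left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET


def _to_element(value, name):
    # Assumed layout: the tag records the type, the attribute records the name.
    if isinstance(value, dict):
        elem = ET.Element("struct", name=name)
        for key, child in value.items():
            elem.append(_to_element(child, key))
    elif isinstance(value, str):
        elem = ET.Element("char", name=name)
        elem.text = value
    else:
        elem = ET.Element("double", name=name)
        elem.text = repr(float(value))
    return elem


def format_xml(value, name="root"):
    """Serialise nested dicts, strings and numbers to an XML string."""
    return ET.tostring(_to_element(value, name), encoding="unicode")


def _from_element(elem):
    if elem.tag == "struct":
        return {child.get("name"): _from_element(child) for child in elem}
    if elem.tag == "char":
        return elem.text or ""
    return float(elem.text)


def parse_xml(xml_string):
    """Rebuild the nested structure from the XML string."""
    return _from_element(ET.fromstring(xml_string))


a = {"b": "Hello World", "c": {"gamma": 1.4}}
x = format_xml(a)
y = parse_xml(x)
print(x)
print(y["b"])  # Hello World
```

Recording the type alongside each value is what lets the parser return numbers as numbers and strings as strings.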

Application of XML Toolbox for MATLAB
• Metadata set by user as a MATLAB structure.
  • More natural format for MATLAB user.
• MATLAB structure → Type-based XML
  • Element names = variable types (e.g., )
  • Easier for conversion to and from structures.
• Type-based XML → Name-based XML
  • Element names = variable names (e.g., )
  • Easier for database query.
[Diagram labels: MATLAB; xml_format.m; xml_parse.m; Type-based XML; Name-based XML; XSLT type2name; XSLT name2type]
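The difference between the two XML forms can be sketched in Python. The slides perform this conversion with XSLT stylesheets (type2name and name2type); the sketch below swaps in plain Python functions and assumes a hypothetical type-based layout in which the tag is the variable type and a `name` attribute holds the variable name. The real XML Toolbox format may differ.

```python
import xml.etree.ElementTree as ET


def type_to_name(elem):
    """Tag becomes the variable name; the type moves into an attribute."""
    renamed = ET.Element(elem.get("name"), type=elem.tag)
    renamed.text = elem.text
    for child in elem:
        renamed.append(type_to_name(child))
    return renamed


def name_to_type(elem):
    """Inverse transform: tag becomes the type again."""
    restored = ET.Element(elem.get("type"), name=elem.tag)
    restored.text = elem.text
    for child in elem:
        restored.append(name_to_type(child))
    return restored


# Assumed type-based input, for illustration only.
type_based = ('<struct name="meta"><double name="grids">1</double>'
              '<double name="flux_order">2</double></struct>')

name_based = type_to_name(ET.fromstring(type_based))
print(ET.tostring(name_based, encoding="unicode"))
# In name-based form a variable can be addressed directly by its name.
print(name_based.find("grids").text)  # 1
```

In the name-based form a path such as `grids` selects the variable directly, which is why that form suits database queries, while the type-based form keeps the information needed to rebuild the structure.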

Geodise Database Toolkit for MATLAB – Query
• gd_query
  • Text-based query expressed over MATLAB variables for use in MATLAB scripts.
  • Converted to XPath to query XML database.
  • XML Toolbox used to convert results into a list of metadata structures.
• Syntax:
    Results = gd_query(,['file'|'datagroup'] )
• Example 1: datagroup
    Results = gd_query('dimension = 2D', 'datagroup')
    Results{1}.standard.files.fileID
    ans = input_dat_632d05be-ba26-479b-9607-d1845f3c78ff
    ans = mesh_ns_cs_adf_ce b7-4e25-a5f7-9a8adf8f21b6
• Example 2: file
    r = gd_query('standard.userID = me & grids < 2');
    r{1}.grids
    ans = 1
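How a text query might be turned into XPath can be sketched in Python. This is a guess at the general approach, not the toolkit's actual translation: the supported operators, the `//file[...]` and `//datagroup[...]` root elements, and the rule that non-numeric values are quoted are all assumptions.

```python
import re

# One condition: dotted field name, comparison operator, value.
_CONDITION = re.compile(r"^\s*([\w.]+)\s*(<=|>=|=|<|>)\s*(.+?)\s*$")


def query_to_xpath(query, root="file"):
    """Translate 'a.b = x & c < 2' into an XPath expression string."""
    predicates = []
    for condition in query.split("&"):
        match = _CONDITION.match(condition)
        if match is None:
            raise ValueError(f"cannot parse condition: {condition!r}")
        field, operator, value = match.groups()
        path = field.replace(".", "/")  # nested fields become child steps
        try:
            float(value)
            literal = value             # numeric values stay unquoted
        except ValueError:
            literal = f"'{value}'"      # no escaping of embedded quotes here
        predicates.append(f"{path} {operator} {literal}")
    return f"//{root}[{' and '.join(predicates)}]"


print(query_to_xpath("standard.userID = me & grids < 2"))
# //file[standard/userID = 'me' and grids < 2]
print(query_to_xpath("dimension = 2D", root="datagroup"))
# //datagroup[dimension = '2D']
```

The dotted field names map naturally onto XPath child steps because the metadata is stored as name-based XML.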

Geodise Database Toolkit for MATLAB – Retrieve
• gd_retrieve
  • Retrieve a file from the repository using its unique handle.
  • Asks the Authorisation service whether the user has permission to retrieve the file.
  • Asks the Location service where the file is.
  • The file is transferred back to the local file system using GridFTP.
• Syntax
    newFileLocation = gd_retrieve(, )
• Examples
    gd_retrieve('input_dat_632d05be-ba26-479b-960…', 'E:\tmp')
    ans = E:\tmp\input.dat
    gd_retrieve('input_dat_632d05be-ba26-479b-960…', 'E:\tmp\control42.dat')
    ans = E:\tmp\control42.dat
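The three steps of a retrieval (authorise, locate, transfer) can be sketched in Python. Everything here is a stand-in: dictionaries replace the authorisation and location services, a local file copy replaces GridFTP, and the handle `input_dat_example` is hypothetical. The destination rule (a directory keeps the original file name, a full path renames the file) is inferred from the two examples above.

```python
import os
import shutil
import tempfile


def retrieve(handle, destination, user, permissions, locations):
    # 1. Authorisation: is this user allowed to read the file?
    if user not in permissions.get(handle, set()):
        raise PermissionError(f"{user} may not retrieve {handle}")
    # 2. Location: where is the file stored?
    source = locations[handle]
    # 3. Transfer: a local copy stands in for GridFTP.
    if os.path.isdir(destination):
        destination = os.path.join(destination, os.path.basename(source))
    shutil.copyfile(source, destination)
    return destination


with tempfile.TemporaryDirectory() as repo, tempfile.TemporaryDirectory() as local:
    stored = os.path.join(repo, "input.dat")
    with open(stored, "w") as f:
        f.write("grids 1\n")
    locations = {"input_dat_example": stored}    # hypothetical handle
    permissions = {"input_dat_example": {"me"}}
    into_dir = retrieve("input_dat_example", local, "me", permissions, locations)
    renamed = retrieve("input_dat_example", os.path.join(local, "control42.dat"),
                       "me", permissions, locations)
    copied = [os.path.basename(p) for p in (into_dir, renamed) if os.path.exists(p)]

print(copied)  # ['input.dat', 'control42.dat']
```

Checking permission before asking for the location means an unauthorised user learns nothing about where the file is stored.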

Authorisation
• Data Authorisation
  • Globus certificate subject mapped to user ID.
  • User sets access rights for the data they archive, so it can be queried and retrieved by others.
  • Access rights stored in a relational database, accessed through Authorisation web service.
  • Grant users and groups access rights by including their user ID or group ID in the metadata structure.
• Example
    m.grids = 1
    m.access.users = {'userA','userB'}
    m.access.groups = {'groupC'}
    gd_archive('input.dat', m)
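The access rule described above can be sketched in Python. This is an illustration under stated assumptions, not the Authorisation web service: dictionaries replace the relational database, the certificate subject string and the membership of `groupC` are made up, and the rule that the owner always has access is an assumption.

```python
def resolve_user(subject, subject_map):
    """Map a certificate subject to a local user ID."""
    return subject_map[subject]


def can_access(user_id, record, group_members):
    """Allow the owner, any listed user, or any member of a listed group."""
    access = record.get("access", {})
    if user_id == record["owner"] or user_id in access.get("users", []):
        return True
    return any(user_id in group_members.get(group, set())
               for group in access.get("groups", []))


# Hypothetical certificate subject and group membership.
subject_map = {"/O=Grid/OU=example/CN=User A": "userA"}
group_members = {"groupC": {"userD"}}

# Mirrors the metadata structure from the slide, plus an assumed owner field.
record = {
    "owner": "me",
    "grids": 1,
    "access": {"users": ["userA", "userB"], "groups": ["groupC"]},
}

user = resolve_user("/O=Grid/OU=example/CN=User A", subject_map)
print(can_access(user, record, group_members))     # True, listed user
print(can_access("userD", record, group_members))  # True, member of groupC
print(can_access("userE", record, group_members))  # False
```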

Future Work
• Archive structures as XML
  • Cannot query inside archived files.
  • Archive MATLAB structures as XML and query them.
• OGSA-DAI integration
  • Replace and enhance some of our functionality with that provided by OGSA-DAI.
  • E.g. name mapping interface for authenticating Grid credentials to local IDs (system and relational database IDs).
• Change database system
  • Xindice XML database – flexible and good for prototyping but not scalable and no security.
  • Will choose a relational database with XML capabilities – Oracle, DB2, SQL Server.