Published by Dorothy Patrick. Modified over 9 years ago.
1
Introduction to OGSA-DAI
Neil Chue Hong, OGSA-DAI Project Manager
14th February 2006
GGF16, Athens
2
Data Services: challenges
  Scale
    Many sites, large collections, many uses
  Longevity
    Research requirements outlive technical decisions
  Diversity
    No “one size fits all” solutions will work
    Primary data, data products, metadata, administrative data, …
  Many data resources
    Independently owned and managed
    Geographically distributed
  And I haven’t even mentioned security yet!
3
Use Cases for Data Services
  Data Filtering: single source producing large amounts of data, distributed to many sites downstream
  Data Discovery: many sources, many query entry points in a linked system
  Data Translation: source to sink, conversion of data model / structure
  Data Federation: many sources, linked to provide a view as a single source
  Data Replication: full or partial copies to improve throughput
  Data Integration (model aggregation): e.g. integration of time-variant data, streams, files
  Data Integration (knowledge expansion): forming links between databases to increase knowledge
4
Trade Offs
  Speed vs completeness
    do you require the exact answer or an answer?
  Application specific vs language specific queries
    how will users interrogate a data service?
  Static system vs dynamic discovery
    can you actually have dynamic resources?
  Static vs dynamic data
    READ only, INSERT only, UPDATE permitted
  Static vs dynamic queries
    optimisation over flexibility
  Intranet vs Internet
    speed over security
  Single data model vs mixed data models
    ease/speed over integration
  Queries vs questions
    assume that we know the structure when we form the query
5
Requirements on Data Services?
  Common Data Model
    e.g. RowSet
  Common Query Language(s)
    e.g. XQuery, SQL
  Standard access to
    data resource schema information
    physical data resource information for optimisation purposes
    data resource descriptive information for discovery / integration
  Single, seamless security model
  Dynamic publication and discovery
  Multiple, efficient delivery methods
  Move computation towards data
  Data aggregation functionality
  Replication information
6
OGSA-DAI In One Slide
  An engineered, extensible framework for data access and integration.
  Expose heterogeneous data resources to a grid through web services.
  Interact with data resources:
    Queries and updates
    Data transformation / compression
    Data delivery
  Customise for your project using
    Additional Activities
    Client Toolkit APIs
    Data Resource handlers
  A base for higher-level services
    federation, mining, visualisation, …
7
[Diagram: OGSA-DAI architecture. Labels: Application, Client Toolkit, OGSA-DAI service, Engine, Activities (SQLQuery, XPath, readFile, XSLT, GZip, GridFTP), JDBC, Data Resources: Databases (MySQL, DB2, SQL Server), XMLDB (Xindice), File (SWISS-PROT).]
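The diagram above shows an engine running activities such as SQLQuery, XSLT and GZip against a data resource. A minimal sketch of that chaining idea follows. It is not the OGSA-DAI API: every function name here is hypothetical, an in-memory SQLite database stands in for a MySQL resource reached over JDBC, a CSV serialisation stands in for an XSLT transformation, and the delivery step (e.g. GridFTP) is omitted.

```python
import csv
import gzip
import io
import sqlite3


def sql_query_activity(connection, sql):
    """Run a query and return (column names, rows). Hypothetical stand-in for SQLQuery."""
    cursor = connection.execute(sql)
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


def csv_transform_activity(result):
    """Serialise a row set as CSV text. Stand-in for a transformation step such as XSLT."""
    columns, rows = result
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


def gzip_activity(text):
    """Compress text to gzip bytes. Stand-in for a compression activity."""
    return gzip.compress(text.encode("utf-8"))


def run_pipeline(first_input, activities):
    """Feed the output of each activity into the next one, in order."""
    data = first_input
    for activity in activities:
        data = activity(data)
    return data


def demo():
    """Build a toy data resource and push one query through the pipeline."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE protein (id INTEGER, name TEXT)")
    connection.executemany(
        "INSERT INTO protein VALUES (?, ?)", [(1, "P53"), (2, "BRCA1")]
    )
    compressed = run_pipeline(
        "SELECT id, name FROM protein ORDER BY id",
        [
            lambda sql: sql_query_activity(connection, sql),
            csv_transform_activity,
            gzip_activity,
        ],
    )
    connection.close()
    return compressed
```

Calling `demo()` returns gzip-compressed bytes; decompressing them yields the CSV form of the query result. Treating each activity as a plain function makes it easy to add a project-specific step to the list, in the spirit of the extensibility the talk describes.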
8
[Diagram: Multiple SQL GDS. Labels: OGSA-DAI service, Engine, SQLQuery, JDBC, MySQL, and a "Multiple SQL GDS" whose SQLQuery reaches several SQL resources, each over JDBC.]
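The diagram above suggests a single SQLQuery being applied to several SQL resources. A rough sketch of that fan-out, under stated assumptions: the helper names are hypothetical, in-memory SQLite databases stand in for the JDBC-connected resources, and the merge is a simple concatenation tagged with the source name rather than whatever the real service does.

```python
import sqlite3


def make_resource(rows):
    """Create a toy SQL resource holding one 'sample' table (hypothetical schema)."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE sample (id INTEGER, value TEXT)")
    connection.executemany("INSERT INTO sample VALUES (?, ?)", rows)
    return connection


def multi_sql_query(resources, sql):
    """Run the same query on every resource and tag each row with its source."""
    merged = []
    for name, connection in resources.items():
        for row in connection.execute(sql):
            merged.append((name,) + tuple(row))
    return merged
```

For example, with `resources = {"site_a": make_resource([(1, "x")]), "site_b": make_resource([(2, "y")])}`, the call `multi_sql_query(resources, "SELECT id, value FROM sample ORDER BY id")` returns one list in which every row begins with the name of the resource it came from.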
9
Distributed Query Processing
  Higher level services building on OGSA-DAI
  Queries mapped to algebraic expressions for evaluation
  Parallelism represented by partitioning queries
    Use exchange operators
  [Diagram: example query plan. Operators: table_scan (protein), table_scan termID=S92 (proteinTerm), reduce, hash_join (proteinId), op_call (Blast), reduce, exchange. Partition labels: "3,4" and "12".]
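The operators named on this slide (table_scan, reduce, hash_join, exchange) can be illustrated with a small in-memory sketch. This is not OGSA-DQP code: the function names, row format and sample data are assumptions, `reduce_columns` treats "reduce" as a column projection, the exchange step is modelled as hash partitioning on the join key, and the op_call (Blast) step is left out.

```python
def table_scan(table, predicate=None):
    """Return the rows of a table, optionally filtered by a predicate."""
    return [row for row in table if predicate is None or predicate(row)]


def reduce_columns(rows, columns):
    """Keep only the named columns of each row (projection)."""
    return [{column: row[column] for column in columns} for row in rows]


def exchange(rows, key, partitions):
    """Split rows into partitions by hashing the join key."""
    buckets = [[] for _ in range(partitions)]
    for row in rows:
        buckets[hash(row[key]) % partitions].append(row)
    return buckets


def hash_join(left, right, key):
    """Equi-join: build a hash index on the left input, probe it with the right."""
    index = {}
    for row in left:
        index.setdefault(row[key], []).append(row)
    return [
        {**left_row, **right_row}
        for right_row in right
        for left_row in index.get(right_row[key], [])
    ]


def partitioned_join(left, right, key, partitions=2):
    """Partition both inputs the same way, join each pair, then merge the results."""
    joined = []
    for left_part, right_part in zip(
        exchange(left, key, partitions), exchange(right, key, partitions)
    ):
        joined.extend(hash_join(left_part, right_part, key))
    return joined
```

Because both inputs are partitioned with the same hash function, rows that share a join key always land in the same partition, so each partition pair could in principle be joined on a different node and the results merged. The order of the merged rows depends on the partitioning, so callers should not rely on it.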
10
DQP architecture
11
Contributing to OGSA-DAI
  Additional functionality:
    Provide activities which implement specific functionality
    Provide extra client functionality
    Provide different security mechanisms
    Provide higher level components and applications
  Different levels of contributions
    Based on OGSA-DAI?
    Works with OGSA-DAI?
    Part of OGSA-DAI?
12
Future plans
  A new version of the OGSA-DAI Engine
    better support for concurrency, sessions, monitoring and notification
  Implementing new DAIS specifications
  Key things that we will be addressing:
    Performance (particularly format representation and transport)
    Security model which can be applied across platforms
    Transactions provision
    More data integration facilities
  Integration with other components
    registries (e.g. GRIMOIRES)
    workflow editors (e.g. Taverna)
  Working with new projects
    e.g. CancerGrid, iSpider, GEODE
13
Further information
  The OGSA-DAI Project Site: http://www.ogsadai.org.uk
  The DAIS-WG site: http://forge.gridforum.org/projects/dais-wg/
  OGSA-DAI Users Mailing list: users@ogsadai.org.uk
    General discussion on grid DAI matters
  Formal support for OGSA-DAI releases: http://bugs.ogsadai.org.uk
  OGSA-DAI training courses
14
OMII-UK Context
  [Diagram labels: e-Science Reclamation Yard; Collaborative Development; Users; OGSA-DAI; Collaborative Development; OMII; OMII-UK]
15
The OGSA-DAI Team
  IBM Development Team, Hursley
  NEReSC, Newcastle
  NeSC, Edinburgh
  ESNW, Manchester
  IBM Dissemination Team
  EPCC Team, Edinburgh
16
Software Process
  [Diagram labels: Testing, Reqs., Prototype, Prioritisation, Fix Bugs, Use Cases, Requests, Design, Implement, QA, Release, Support, Test Cases, Programme Board, Technical Review Board, Technical Reviewer, DEVELOPERS, USERS, REVIEW, Contribs, Ingest, Dissem., Training, Users’ Group]
  Nightly unit + system tests
  Additional test cases
  System tests based on reqs
  Continual process → Deep track features
  Peer Review and Inspection
17
International Cooperation and Recognition
  USA: Globus Alliance, IBM Corporation, caBIG, BIRN, Indiana University, GridSphere, GEON, LEAD, MCS, NCSA, Secure Data Grid, UNC
  Japan: AIST, BioGrid, NAREGI
  Europe: CERN, DataMiningGrid, GridMiner, GridSphere, inteligrid, N2Grid, OntoGrid, Provenance, SIMDAT
  UK: OMII, NGS, NCeSS, NIeeS, AstroGrid, BioSimGrid, BRIDGES, CancerGrid, ConvertGrid, eDiaMoND, EDINA, First Group plc, Fujitsu Labs Europe, GEDDM, GeneGrid, Genomic Technology and Informatics, GOLD, Human Genetics Unit, IBM UK, myGrid, Oracle UK
  China: CAS, ChinaGrid, cnGrid, INWA
  Australia: Curtin Business School, INWA
  South Korea: KISTI
  Tutorials: Boston, Cambridge, CERN, Chicago, Edinburgh, London, San Francisco, Seattle, Seoul, Singapore, Tokyo, ISSGC 03 to 05
  DIALOGUE workshops: Columbus, Edinburgh, Indiana, Vienna, Chicago, Manchester, San Diego
  1485 registered users
  5250+ downloads
18
Meeting User Requirements
  Projects: LEAD, GeneGrid, caBIG, BRIDGES, OGSA-WebDB, FirstDIG, ConvertGrid, eDiaMoND, OGSA-DQP, GridMiner
19
Summary
  Experienced team delivering quality software
    mature software and process
  Engaging with international community
    understanding and reacting to user requirements
  Complementary to other nodes
    delivering a coordinated roadmap of software
  De facto standard software for DAI
    driving refinement of standards
    used by large and small scale projects
20
Comments and Questions Please
21
Meeting User Requirements
  Number of users
    1485 registered
    5250+ downloads
  3 Users’ Group Meetings
    Edinburgh, Brussels, Edinburgh
  Contributors
    Austria, China, Finland, Poland, Spain, UK, USA
  Release Statistics
    985 downloads of latest release
      Actual user downloads, not search engine crawlers
      Does not include downloads as part of GT3.2 and GT4 releases

    Release (date)    Downloads
    R1.0 (Jan 03)     109
    R1.5 (Feb 03)     110
    R2.0 (Apr 03)     254
    R2.5 (Jun 03)     294
    R3.0 (Jul 03)     792
    R3.1 (Feb 04)     686
    R4.0 (May 04)     1124
    R5.0 (Dec 04)     766
    R6.0 (May 05)     985
22
Core features of OGSA-DAI
  A framework for building data clients
    Client toolkit library for application developers
    Seamless abstraction across WSI and WSRF services
  Highly extensible
    Customise out-of-the-box product
  A framework for developing functionality
    Compose existing activities with application specific activities
  Data service concurrency and sessions
  Comprehensive documentation and tutorials
  Shipped to run on OMII_2, GT4.0 and Axis 1.2
23
Functionality of OGSA-DAI
  A framework for data applications
  Data access, insert and update
    Relational: MySQL, Oracle, DB2, SQL Server, Postgres, …
    XML: Xindice, eXist
    Files: CSV, EMBL, OMIM, SWISSPROT, …
  Data delivery
    SOAP over HTTP
    FTP, GridFTP
    E-mail
    Inter-service
  Data transformation
    XSLT
    ZIP, GZIP
  Security
    X.509 certificates
    Message Level
    Transport Level