EdSkyQuery-G Overview Brian Hills, December 2004 www.edikt.org.

Presentation transcript:

EdSkyQuery-G Overview Brian Hills, December

Contents
• Edikt
• Motivation & Aims
• Architecture
• Current Status
• Results
• Future Outlook

Edikt
• “e-Science Data, Information and Knowledge Transformation”
  - Bridge the gap between applications and computer science:
    • Produce robust tools…
    • …for real application science problems…
    • …test them under extreme science conditions…
    • …and keep an eye on the commercial possibilities.
  - Projects which may be of interest to astronomers:
    • BinX, Eldas & EdSkyQuery-G.
  - Visit:

Astronomy Requirements
• Sky surveys collect masses of data to be managed:
  - For example, Sloan Digital Sky Survey: 15TB.
  - 2 × 10GB databases to be used for the project.
  - Edikt will have access to a 155TB SAN.
• Further research by leveraging data from different surveys:
  - Must identify the same object in different catalogues.
• Require a “federated” view:
  - Data is distributed, heterogeneous, large scale.
  - Building one big data warehouse isn’t feasible.
  - Interoperable services to combine disparate data sources.
Is the middleware up to the task?
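Identifying the same object in different catalogues is usually done by positional cross-matching. The sketch below is only an illustration of that idea and is not taken from the slides: it assumes each catalogue is a list of hypothetical `(id, ra_deg, dec_deg)` tuples and uses a brute-force nearest-neighbour search within a matching radius. The function names and the radius parameter are assumptions.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def nearest_neighbour_match(cat_a, cat_b, radius_arcsec):
    """For each object in cat_a, keep the closest cat_b object within the radius.

    Catalogues are lists of (id, ra_deg, dec_deg) tuples (a hypothetical layout).
    Returns a list of (id_a, id_b, separation_arcsec).
    """
    matches = []
    for id_a, ra_a, dec_a in cat_a:
        best = None
        for id_b, ra_b, dec_b in cat_b:
            sep = angular_separation_arcsec(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (id_b, sep)
        if best is not None:
            matches.append((id_a, best[0], best[1]))
    return matches

# Toy data: b1 lies one arcsecond north of a1; b2 is far away.
cat_a = [("a1", 10.0, 20.0)]
cat_b = [("b1", 10.0, 20.0 + 1.0 / 3600.0), ("b2", 50.0, -5.0)]
matches = nearest_neighbour_match(cat_a, cat_b, radius_arcsec=2.0)
```

The double loop compares every pair of objects, so it only suits small samples; survey-scale matching would rely on spatial indexing inside the database.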

EdSkyQuery-G: Motivation & Aims
• Support the “Open SkyQuery Initiative”:
  - Move from a .NET-specific implementation.
  - Enable similar functionality on other platforms.
• Extensible framework for e-science:
  - Handle heterogeneous archives.
  - ‘Plug in’ algorithms, e.g. Nearest Neighbour.
  - Interact with AstroGrid components & VO.
  - Leveraged for the BRIDGES project to perform simple joins.
• Apply Eldas to large-scale e-science problems:
  - Test: functionality, scalability & performance.
• Cross-team collaboration:
  - Dr. Bob Mann (UoE, ROE), Edikt, EPCC, NeSC.

SkyQuery: High Level Architecture
[Diagram components: Client, Portal, SkyNode 1, SkyNode 2, SkyNode 3]
1. User submits query.
2. Client sends query to Portal.
3. Portal invokes SkyNode with an execution plan.
4. Recursive call to next SkyNode in the execution plan.
5. Recursive call to last SkyNode in the execution plan.
6. Runs query and returns results.
7. Performs cross-match and returns results.
8. Performs cross-match and returns final set of results.
9. Returns results for display.
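The recursive call chain in the steps above can be sketched in a few lines. This is a toy model under stated assumptions, not the SkyQuery implementation: each node is a hypothetical `(name, catalogue)` pair, the catalogue is a dict keyed by a shared object key, and the "cross-match" is reduced to a key lookup rather than a positional match.

```python
def execute_plan(nodes, index=0):
    """Walk the execution plan recursively.

    nodes: list of (name, catalogue) pairs, where catalogue maps a
    hypothetical shared object key to that node's local row.
    The last node runs its query first; each earlier node then matches
    its own rows against the partial results on the way back up.
    """
    name, catalogue = nodes[index]
    if index == len(nodes) - 1:
        # Last node in the plan: run the query and return its results.
        return [{"key": key, name: row} for key, row in sorted(catalogue.items())]
    # Recursive call to the next node in the plan.
    partial = execute_plan(nodes, index + 1)
    # Toy cross-match: keep only rows this node also knows about.
    return [dict(row, **{name: catalogue[row["key"]]})
            for row in partial if row["key"] in catalogue]

plan = [
    ("SkyNode1", {1: "a", 2: "b"}),
    ("SkyNode2", {2: "c", 3: "d"}),
    ("SkyNode3", {2: "e", 3: "f", 4: "g"}),
]
final_results = execute_plan(plan)
```

In this toy plan only key 2 is present on all three nodes, so the first node returns a single combined row to the caller.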

EdSkyQuery-G: Architecture
• Inspired by Greg Riccardi’s paper for DAIS-WG.
• Discusses two approaches:
  1. Access recipes for service interactions.
  2. Retained state for service interactions.
• Potential benefits of #2:
  - Scalability
  - Robustness
  - Usability

EdSkyQuery-G: Architecture
[Diagram components: Query Builder, Service Manager, SkyNode 1 (DB2 - XP), SkyNode 2 (DB2 - Linux)]
1. User submits query.
2. Query Builder sends query to Service Manager (SM).
3. SM invokes SkyNode 1 with an execution plan.
4. Handle to results.
5. SM invokes SkyNode 1 with an execution plan + results handle.
6. Transfer results file.
7. Handle to results.
8. SM returns a handle to the results.
9. Client may pull results back from SkyNode.
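The handle-passing flow above, where results stay on the node and only a handle travels back, can be illustrated with a small in-memory model. Everything here is a hypothetical sketch: the class names, the `do_query`, `execute` and `fetch` methods, the handle format, and the list-membership "join" are assumptions for illustration and do not reflect the Eldas or EdSkyQuery-G interfaces.

```python
import uuid

class SkyNode:
    """Toy node that retains its results and hands out handles to them."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows
        self._results = {}  # retained state: handle -> result rows

    def execute(self, upstream=None):
        """Run the local step and return a handle, not the data itself."""
        if upstream is None:
            result = list(self.rows)
        else:
            source_node, source_handle = upstream
            # Stand-in for transferring the results file between nodes.
            transferred = source_node.fetch(source_handle)
            result = [row for row in self.rows if row in transferred]
        handle = f"{self.name}/{uuid.uuid4().hex}"
        self._results[handle] = result
        return handle

    def fetch(self, handle):
        """Pull the retained results for a handle."""
        return self._results[handle]

class ServiceManager:
    """Toy manager that invokes each node in turn, passing handles along."""

    def __init__(self, nodes):
        self.nodes = nodes

    def do_query(self):
        upstream = None
        for node in self.nodes:
            upstream = (node, node.execute(upstream))
        return upstream  # (node holding the final results, handle)

first = SkyNode("first", [1, 2, 3])
second = SkyNode("second", [2, 3, 4])
holder, handle = ServiceManager([first, second]).do_query()
rows = holder.fetch(handle)  # the client pulls results only when it wants them
```

Because the manager only ever sees handles, large result sets move directly between nodes and to the client, which is one way to read the scalability benefit claimed for retained state.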

EdSkyQuery-G: Service Manager
[Diagram labels: Client (uses service); Service Manager on JBoss; Grid Service Layer (doQuery, getMetadata); Generic Parser; Distributed Query Processor (Split, Optimize, Execute); calls to the Query, XMatch and Info services]

EdSkyQuery-G: SkyNode Architecture
[Diagram labels: SkyNode (invoked via the Service Manager); Eldas; SkyNode Grid Services (Query, XMatch, Info); query & stored procedure invocations; DB2]

EdSkyQuery-G: Stored Procedures
[Diagram labels: EDR; SSA; Export; Transport; Import; Results (.csv, .xml)]

Current Status
• Pre-Alpha (internal release), November ‘04.
• End-to-end invocation of all components:
  - Client->ServiceManager->SkyNode->Database
• 2 × 10GB DB2 test databases:
  - SSA from SuperCosmos, hosted by NeSC.
  - EDR from SDSS, hosted by EPCC.
• Limitations:
  - SM: No query parser/splitting.
  - Simple cross-database join: not yet using XMatch.
  - No data transport, other than JDBC between databases.
  - Final results reside on database server.

EdSkyQuery-G: Pre-Alpha Stored Procedures
[Diagram labels: EDR; SSA; Export; JDBC; Query Results; Import; Results (.csv, .xml)]

Results
• Tested with 3 queries from ROE cookbook:
  - Queries #16, #17, #19.
• Results show:
  - Exporting data is quick.
  - Importing data is >10× slower:
    • Should we use native DB calls rather than JDBC for import only?
  - Queries slow:
    • Need more indexes and database tuning?
Table columns: Query No. | #Rows selected | Export Time (secs) | Import Time (secs) | Join Query Time (secs) | Total Time (secs)
16488, , ,

Deliverables
• Software (internal):
  - Prototype client & GUI client.
  - Service Manager.
  - Eldas + SkyNode interface.
  - Database stored procedures (Java).
• Documentation (some on NeSCForge):
  - Use Cases, Requirements, Design.
  - Performance Testing, Installation.
• Papers:
  - AHM 04 Poster & Paper.
  - ADASS 04 Paper.

Short Term Focus
• Enable science:
  - Compare different cross-match algorithms.
  - Incorporate XMatch, CSIRO, ROE as stored procedures.
• Data Transfer (SCP):
  - Between databases.
  - Pull results back to client.
  - Deliver to a third party.
• Client & Service Manager enhancements:
  - Handle broader range of queries.
• Performance:
  - Compare with pre-alpha benchmarks.
• Improve test infrastructure.

Longer Term Focus: AstroGrid
[Diagram components: Client, Registry, MySpace, Parser, Query Builder, Service Manager, SkyNode 1 (DB2 - XP), SkyNode 2 (DB2 - Linux)]

Longer Term Focus
• Interaction with AstroGrid components:
  - Clients, Registry, ADQL Parser, MySpace.
• OpenSkyQuery:
  - Compliance with interfaces.
  - Test with other DBMS and catalogues.
• Query Builder GUI/Data Integration tool:
  - Lead user through choosing fields from different datasets.
• GridFTP:
  - Currently unsuitable…
  - NeSC course in Jan 05 may reveal more.

Thank you. Questions?