July 2010, Cospar10 Bremen, Slide 1: SDO Data Access and Distribution in Europe and the WisSDOm Data Centre in ROB, Brussels. Authors: David Boyes, Benjamin Mampaey,


July 2010, Cospar10 Bremen
Slide 1: SDO Data Access and Distribution in Europe and the WisSDOm Data Centre in ROB, Brussels
Authors: David Boyes, Benjamin Mampaey, Cis Verbeeck, Veronique Delouille, Jean-François Hochedez
STCE + ROB

Slide 2: What will be covered
Where the data is, and the access architecture for the users
Some basic terms
User access methods
  – modules
  – basic web access
  – virtual observatories
  – simplified web access
  – pseudo files and other developments
Interesting issues
  – Retention
  – Saved searches
  – Evolving calibration
Neat stuff to come
  – Cutouts
  – Helioviewer
  – Grid integration

Slide 3: Where is the data for the users
Data is available from one or more data centres; all are networked
Some users are "close", some are "far": distance matters
All data is available somewhere
Users can get data (an "export")
  – from the nearest centre directly
  – via the nearest centre from a remote centre
  – directly from another centre
Most of this is automatic; you will see differences in, e.g., delays

Slide 4: How the data is accessed (a bit technical)
The system is netDRMS
  – created by the JSOC at Stanford
Files are generated by content
The system holds data files + metadata
  – SUMS + DRMS
The mediator is an "export" module, which makes your very own file
  – FITS, tar of FITS, etc.
SQL etc. is hidden from the user

Slide 5: Access summary...
No files until you ask for them
Data is referenced by content and provided as one or more files with whatever name you want
The exported files are built from stored elements, so e.g. FITS with Rice compression is quite direct, as AIA data is stored internally in this format
Can get anything but...
  – you may as well ask for all metadata
  – the files can be large, so best not to ask for hundreds

Slide 6: Some basic terms
series
  – basic collection of data items with shared properties
  – by convention named.
  – all series records share a metadata format (i.e. keywords)
keywords
  – FITS-style keywords plus added metadata-only keywords
  – correspond to columns in the metadata (DRMS) database
online means
  – available from a disk at the site
  – so offline means: not yet arrived/available, or deleted but can be fetched
data format
  – whatever is stored is native (FITS, JPEG2000); conversion is post-processing
  – characterised by resolution and cadence (e.g. 4K x 4K at 10s, 1K x 1K at 90s)
  – naturally can't do better, but can reduce by "cutouts" in time or space
data records
  – can be several items as a group (e.g. image + bad pixel map + alternative format)
  – data is in SUMS plus metadata, referenced by metadata tables (DRMS), usually one to one
  – each is self-contained; for example, cadence is not part of the data
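The two example formats in the "data format" entry can be compared with a little arithmetic. This is only a sketch: it assumes "4K" means 4096 pixels per side and "1K" means 1024, and the function name is made up for illustration.

```python
def pixels_per_second(side_px: int, cadence_s: float) -> float:
    """Average pixel rate of a square image stream at a given cadence."""
    return side_px * side_px / cadence_s


full = pixels_per_second(4096, 10)      # "4K x 4K at 10s" (4K = 4096 assumed)
synoptic = pixels_per_second(1024, 90)  # "1K x 1K at 90s" (1K = 1024 assumed)

reduction = full / synoptic
print(f"reduced stream carries about 1/{reduction:.0f} of the pixels")
```

Under those assumptions the reduced stream is 16 times smaller in space and 9 times sparser in time, a factor of 144 overall.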

Slide 7: Example series
aia_test.lev1: AIA images, 4Kx4K, full disk, full cadence
aia_test.synoptic2: AIA images reduced to 1Kx1K, full disk, 90s cadence
hmi_test.M_45s: magnetograms, 45s cadence
hmi_test.v_45s: dopplergrams, 45s cadence
jpeg2K: to come, for browsing and forecasting

Slide 8: User access methods
Direct via “modules”
  – on site of data centre
Query based
  – precursor to full data access
  – checks a part of the data (metadata) without having to retrieve the very large part
Indirect via network
  – web/http based
  – delivers data somewhere, maybe to fetch immediately or later
Direct via wrapper
  – on site, e.g. IDL (Matlab on the way)

Slide 9: A practical pause - limitations
Sheer size of request: even if you have a 2TB USB stick, that's only 2 days
Network speed: at about 200Mb/s it takes a day to get a day's worth
Search/database speed: millions of records
Raw data access/retrieval speed: the basic image data takes time to get from disk
Retention time: you can get anything, but you will probably have to wait for a full day from 2 years ago that nobody else has ever used
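The size and network figures can be sanity-checked with a short calculation. This sketch assumes about 1 TB of data per day (inferred from the "2TB is only 2 days" remark), decimal units, and a link that sustains its full nominal rate; the function name is illustrative only.

```python
def transfer_hours(volume_bytes: float, rate_bits_per_s: float) -> float:
    """Hours needed to move volume_bytes over a link at rate_bits_per_s."""
    return volume_bytes * 8 / rate_bits_per_s / 3600


TB = 1e12    # decimal terabyte (assumption)
MBIT = 1e6   # decimal megabit

one_day_of_data = 1 * TB  # assumed daily volume
hours = transfer_hours(one_day_of_data, 200 * MBIT)
print(f"{hours:.1f} hours at a sustained 200 Mb/s")
```

Under these idealised assumptions one day of data needs about 11 hours; a link that sustains only part of its nominal rate moves towards the full day quoted on the slide.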

Slide 10: Access by : modules - the basic bricks
At the data centres, for example
  – show_series
  – show_info
  – jsoc_export_as_fits

~]$ show_info -s ds=aia_test.synoptic2
First Record: aia_test.synoptic2[ T15:00:00.57Z][171] is first of 6 records matching first keyword, Recnum = 1
Last Record: aia_test.synoptic2[ T11:58:41.07Z][335] is first of 2 records matching first keyword, Recnum =
Last Recnum:
~]$ show_series
aia_test.lev1
aia_test.synoptic2
drms.sites
hmi.doptest
hmi_test.m_45s
hmi_test.s_720s
lm_jps.lev1_test4k10s
~]$ jsoc_export_as_fits reqid=REQ_FTP expversion=0.5 rsquery=aia_test.lev1[:#209866] path=tmp method=url protocol=FITS
' ' bytes exported.
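The record-set strings in the session above follow a `series[filter][filter]` pattern. The sketch below splits such a string into its parts. It is not the DRMS parser: it handles only the simple form seen on the slide, assumes filters contain no nested brackets, and `split_record_set` is a hypothetical helper name.

```python
import re


def split_record_set(spec: str) -> tuple[str, list[str]]:
    """Split 'series[a][b]' into ('series', ['a', 'b']).

    Simplified: filters must not contain nested brackets.
    """
    match = re.fullmatch(r"([A-Za-z0-9_.]+)((?:\[[^\[\]]*\])*)", spec)
    if match is None:
        raise ValueError(f"not a simple record-set spec: {spec!r}")
    series, filters = match.groups()
    # Pull out the text between each pair of brackets, in order.
    return series, re.findall(r"\[([^\[\]]*)\]", filters)


series, filters = split_record_set("aia_test.lev1[:#209866]")
print(series, filters)
```

Here the series name is `aia_test.lev1` and the single filter is `:#209866`, the record-number selection used in the export command.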

Slide 11: Access by : basic web access
System developed by JSOC: lookdata.html
Online via the JSOC web site, but heavily loaded
Being tested at ROB
Provides easy access to an overview of all the available data
Formulating a selection query does require knowledge of the query syntax
Provides for a wide variety of data packaging
  – normal user FITS or internal format (FITS with no keywords)
  – via web for immediate or later access, as one or more individual files or as tar
  – ROB working on fewer packaging options

Slide 12: Access by : basic web access

Slide 13: Access by : Virtual Observatories
VSO
  – development of existing VSO
  – prototype for SDO running and definitive version in preparation
Soteria
  – demo provider made for ROB/USET, SDO provider being coded now
Uniform search paradigm
  – infrastructure hides efficient searches with complex syntax, e.g. SQL in various flavours

Slide 14: Access by : Soteria Virtual Observatory
One part of an EU project
Based on current web access technology
The example is for the ROB USET telescope as a data provider; each SDO site will be able to act as a provider

Slide 15: Access by : simplified web access
Work in progress
Limited offer: direct request of tar files or individual FITS format files, front end for PFS
Simplified, enquiry based, such as:
  – aia.lev1 + time + period + cadence + wavelengths
Preparation is actually more complex than for basic access; for example, it requires decisions as to which keys are useful for which series
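A simplified enquiry of this kind can be turned into a record-set query string mechanically. The sketch below assumes the JSOC-style `[start/period@cadence]` time filter and a comma-separated wavelength list; both should be checked against the syntax a given series accepts. `build_record_set` is a hypothetical helper and the timestamp is arbitrary.

```python
def build_record_set(series, start, period, cadence, wavelengths):
    """Assemble 'series[start/period@cadence][w1,w2,...]' from simple fields."""
    time_filter = f"[{start}/{period}@{cadence}]"
    wave_filter = "[" + ",".join(str(w) for w in wavelengths) + "]"
    return series + time_filter + wave_filter


query = build_record_set(
    "aia.lev1",
    "2010-07-01T00:00:00Z",  # arbitrary example start time
    "1h",                    # period
    "15m",                   # cadence
    [171, 193],              # wavelengths
)
print(query)
```

The front end then only needs to know, per series, which fields are meaningful, which is the preparation work the slide mentions.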

Slide 16: Access by : pseudo files (PFS)
Systematically named files in a directory tree, with no real files until you access them
Typically based on a query covering a much wider range than you really need (or could use)
Real files are kept in a cache, so further access is very cheap
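The pseudo-file idea can be sketched in a few lines: the names exist up front, the content is produced on first access, and later reads come from a cache. This is an illustration of the principle only, not the ROB prototype; `PseudoFileStore` and `fake_export` are invented names, and a real system would call an export module instead of the stand-in function.

```python
from typing import Callable


class PseudoFileStore:
    """Names are listed immediately; content is fetched on first read and cached."""

    def __init__(self, names: list[str], fetch: Callable[[str], bytes]):
        self._names = list(names)
        self._fetch = fetch               # stand-in for a real export call
        self._cache: dict[str, bytes] = {}
        self.fetch_count = 0              # how many real fetches happened

    def listdir(self) -> list[str]:
        return list(self._names)          # listing touches no data

    def read(self, name: str) -> bytes:
        if name not in self._names:
            raise FileNotFoundError(name)
        if name not in self._cache:       # first access: build the real file
            self._cache[name] = self._fetch(name)
            self.fetch_count += 1
        return self._cache[name]          # later accesses are cheap


def fake_export(name: str) -> bytes:
    """Hypothetical exporter used only for this example."""
    return f"contents of {name}".encode()


store = PseudoFileStore(["a.fits", "b.fits"], fake_export)
store.read("a.fits")
store.read("a.fits")
print(store.fetch_count)
```

Listing the directory is free, and reading the same name twice triggers only one fetch.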

Slide 17: Access by : pseudo files (PFS)
Example with 160 file names, all AIA wavelengths, 15min cadence
In prototype at ROB, source downloadable

mnt
`-- aia_test.lev1
    `
        `-- 06
            `-- 17
                |-- H0000
                |   |-- AIA _ _0171.fits
                |   |-- AIA _ _0304.fits
                |   |-- AIA _ _94.fits
                |   |-- AIA _ _1600.fits
                |   |-- AIA _ _211.fits
                |   |-- AIA _ _335.fits
                |   |-- AIA _ _193.fits
                |   |-- AIA _ _335.fits
                |   |-- AIA _ _1600.fits
                |   |-- AIA _ _193.fits
                |   |-- AIA _ _94.fits
                |   `-- AIA _ _131.fits
                |-- H0100
                |   |-- AIA _ _0171.fits
                |   |-- AIA _ _211.f
                    |-- AIA _ _193.fits
                    |-- AIA _ _94.fits
                    |-- AIA _ _131.fits
                    |-- AIA _ _1600.fits
                    |-- AIA _ _0171.fits
                    |-- AIA _ _211.fits
                    |-- AIA _ _0304.fits
                    |-- AIA _ _335.fits
                    |-- AIA _ _1600.fits
                    |-- AIA _ _193.fits
                    |-- AIA _ _94.fits
                    `-- AIA _ _131.fits

9 directories, 160 files
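One way to arrive at the counts in this listing (160 files, 9 directories) is five hours of data at 15-minute cadence in the eight wavelengths that appear in it; that breakdown is inferred, not stated on the slide. The sketch below enumerates such a tree. The file-name pattern and the year are placeholders, since the listing does not show them.

```python
from datetime import datetime, timedelta

# The eight wavelengths visible in the listing.
WAVELENGTHS = [94, 131, 171, 193, 211, 304, 335, 1600]


def pseudo_paths(series, start, hours, cadence_min, wavelengths):
    """List series/YYYY/MM/DD/HhhMM/<file> paths; the file naming is hypothetical."""
    paths = []
    t = start
    end = start + timedelta(hours=hours)
    while t < end:
        hour_dir = f"{series}/{t:%Y/%m/%d}/H{t:%H}00"
        for w in wavelengths:
            paths.append(f"{hour_dir}/AIA{t:%Y%m%d_%H%M%S}_{w:04d}.fits")
        t += timedelta(minutes=cadence_min)
    return paths


def directories(paths):
    """Collect every ancestor directory of the given file paths."""
    dirs = set()
    for p in paths:
        parts = p.split("/")[:-1]
        for i in range(1, len(parts) + 1):
            dirs.add("/".join(parts[:i]))
    return dirs


# The year is a placeholder; the month and day follow the 06/17 directories shown.
paths = pseudo_paths("aia_test.lev1", datetime(2010, 6, 17), 5, 15, WAVELENGTHS)
print(len(directories(paths)), "directories,", len(paths), "files")
```

None of these names needs a real file behind it until something opens it.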

Slide 18: Access by : useful methods in development
Order and notify via for manual fetch
Order and automatic delivery (e.g. sftp)

Slide 19: Interesting issue - Retention
All netDRMS sites have full information for selected series, their “subscribed” series
But is it online?
  – sites keep the latest, but must selectively discard
Enquiry modules can tell if data is online, but what are the implications (delay...) if not?
You can request it, but it can take some time to obtain
  – for now quick, but after a year or so a record nobody has looked at will come from tape

Slide 20: Interesting issue - Saved searches
How to describe a selection of data
Can save the result as a record list for a reasonable number of records, but this does not save the query
  – save both query and result?
For both your own use and publication
A saved query might give different results (e.g. online only)
Relates to the issue of calibration
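Saving both the query and its result could look like the sketch below. It is one possible layout, not something the slides specify: the field names, the checksum, and the example record numbers are all assumptions made for illustration.

```python
import hashlib
import json


def save_search(query: str, recnums: list[int], saved_at: str) -> dict:
    """Bundle the query text with the record numbers it returned."""
    body = {"query": query, "saved_at": saved_at, "recnums": sorted(recnums)}
    # A checksum over the bundle lets a publication cite one exact selection.
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "sha256": digest}


def result_changed(saved: dict, new_recnums: list[int]) -> bool:
    """True if re-running the query no longer returns the same records."""
    return sorted(new_recnums) != saved["recnums"]


saved = save_search(
    "aia_test.synoptic2[171]",   # example query text
    [3, 1, 2],                   # made-up record numbers
    "2010-07-01T00:00:00Z",      # arbitrary timestamp
)
print(result_changed(saved, [1, 2, 3]), result_changed(saved, [1, 2]))
```

Keeping the record list alongside the query means a later run that returns something different (for example, online-only records) can be detected rather than silently replacing the original selection.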

Slide 21: Interesting issue - Evolving calibration and which data did I use?
More accurate calibration will become available as time goes on and more calibration points are acquired
So the newest and best data can change
This is done, for the most part, by applying a calibration series, e.g. via Solarsoft
But there can also be metadata changes
The raw data is unlikely to change

Slide 22: Neat stuff to come - cutouts
This is well on the way, again being developed by JSOC and LMSAL, for those who don't need the full 4Kx4K
Very much reduced data storage requirements
Closely related to event tracking and the HEK

Slide 23: Neat stuff to come - Helioviewer
Existing project now being directed towards use with SDO data
JPEG2000 based viewer with event marker overlay
Integration with JPEG2000 series
Rapid browsing with links to full data
ROB is CoI in requested next stage

Slide 24: Neat stuff to come - grid integration
The data element size (tens of MB) is natural for use in a high performance grid
The data is already geographically distributed, with a variety of access routes
Distributed variety of resources: large clusters, pipelines, GPUs
Sites are on high performance research networks

Slide 25: Thanks to
JSOC at Stanford
LMSAL
Belnet and Geant2 for networking
The enthusiastic cooperation from the partner data centres
Our sister institutes at the ROB site for hosting the data centre and infrastructure

Slide 26: Web addresses
The main source: JSOC at jsoc.stanford.edu
HEK:
ROB: wissdom.oma.be
SAO:
GDS:
UCLan:
IAS: idc-medoc.ias.u-psud.fr