Interoperability meeting S. Derriere, Cambridge, 2003 May 12-16 UCD - lessons learned What was learned from trying to assign UCDs to: - large catalogues/databases.

Slides:



Advertisements
Similar presentations
Data Documentation Initiative (DDI) Workshop Carol Perry Ernie Boyko April 2005 Kingston Ontario.
Advertisements

History Data Service1 Good Design for Historical source based Databases History Data Service Hamish James.
Configuration management
Configuration management
Chapter 10: Designing Databases
UCD inheritance in VOTables François Ochsenbein. 11 May 2003 François Ochsenbein Summary 1. Alternative propositions 2. impact of the notion of column.
E-Science Data Information and Knowledge Transformation The BinX Language.
Dynamic Ontologies on the Web Jeff Heflin, James Hendler.
File Systems and Databases
Geographic Information Systems
1 Database Systems (Part I) Introduction to Databases I Overview  Objectives of this lecture.  History and Evolution of Databases.  Basic Terms in Database.
1 Lecture 31 Introduction to Databases I Overview  Objectives of this lecture  History and Evolution of Databases  Basic Terms in Database and definitions.
IMT530- Organization of Information Resources1 Feedback Like exercises –But want more instructions and feedback on them –Wondering about grading on these.
Data Management: Documentation & Metadata Types of Documentation.
Configuration Management
Chapter 1: The Database Environment
Michael F. Price College of Business Chapter 6: Logical database design and the relational model.
Page 1 ISMT E-120 Introduction to Microsoft Access & Relational Databases The Influence of Software and Hardware Technologies on Business Productivity.
Trisha Cummings.  Most people involved in application development follow some kind of methodology.  A methodology is a prescribed set of processes through.
10 December, 2013 Katrin Heinze, Bundesbank CEN/WS XBRL CWA1: DPM Meta model CWA1Page 1.
Improving Data Discovery in Metadata Repositories through Semantic Search Chad Berkley 1, Shawn Bowers 2, Matt Jones 1, Mark Schildhauer 1, Josh Madin.
Page 1 ISMT E-120 Desktop Applications for Managers Introduction to Microsoft Access.
This chapter is extracted from Sommerville’s slides. Text book chapter
MDC Open Information Model West Virginia University CS486 Presentation Feb 18, 2000 Lijian Liu (OIM:
Carlos Lamsfus. ISWDS 2005 Galway, November 7th 2005 CENTRO DE TECNOLOGÍAS DE INTERACCIÓN VISUAL Y COMUNICACIONES VISUAL INTERACTION AND COMMUNICATIONS.
DAT602 Database Application Development Lecture 14 HTML.
The Data Attribution Abdul Saboor PhD Research Student Model Base Development and Software Quality Assurance Research Group Freie.
S. Derriere et al., ESSW03 Budapest, 2003 May 20 UCDs - metadata for astronomy Sébastien Derriere François Ochsenbein Thomas Boch CDS, Observatoire astronomique.
Introduction to XML cs3505. References –I got most of this presentation from this site –O’reilly tutorials.
A Domain-Specific RM&IG Solution Designed to Support the Implementation of ISAD(G) Arian Rajh, PhD, Assist. Prof., FFZG Krešimir Meze, Omega software d.o.o.
XML Extensible Markup Language. Markup Languages u What does this number (100) mean? –Actually, it’s just a string of characters! –A markup language can.
Using XML technologies to implement complex tables in short- term statistics Francesco Rizzo
 To explain the importance of software configuration management (CM)  To describe key CM activities namely CM planning, change management, version management.
Eurotrace Hands-On The Eurotrace File System. 2 The Eurotrace file system Under MS ACCESS EUROTRACE generates several different files when you create.
Database structure for the European Integrated Tokamak Modelling Task Force F. Imbeaux On behalf of the Data Coordination Project.
Scalable Metadata Definition Frameworks Raymond Plante NCSA/NVO Toward an International Virtual Observatory How do we encourage a smooth evolution of metadata.
Exchange Design Best Practices Tools for Successful Flow Design and Implementation 1.
SE: CHAPTER 7 Writing The Program
Database Design and Management CPTG /23/2015Chapter 12 of 38 Functions of a Database Store data Store data School: student records, class schedules,
MIS 327 Database Management system 1 MIS 327: DBMS Dr. Monther Tarawneh Dr. Monther Tarawneh Week 2: Basic Concepts.
The european ITM Task Force data structure F. Imbeaux.
1 Metadata –Information about information – Different objects, different forms – e.g. Library catalogue record Property:Value: Author Ian Beardwell Publisher.
PHP and MySQL Code Reuse, OO, Error Handling and MySQL (Intro)
IVOA Interoperability MeetingBoston, 2004/05/23-28 IVOA plenary session UCD R. Williams, S. Derriere, and the UCD folks.
Advanced Database Course (ESED5204) Eng. Hanan Alyazji University of Palestine Software Engineering Department.
IVOA Interoperability MeetingUCD Session – 2004/05/26 S. Derriere UCD services.
SIMO Python/XML Simulator Current situation 28/10/2005 SIMO Seminar Antti Mäkinen Dept. of Forest Resource Management / University of Helsinki.
COS PIPELINE CDR Jim Rose July 23, 2001OPUS Science Data Processing Space Telescope Science Institute 1 of 12 Science Data Processing
Lection №4 Development of the Relational Databases.
F. Genova, VO as a Data Grid, 2003/06/301 Interoperability of astronomy data bases Françoise Genova, CDS.
A facilitator to discover and compose services Oussama Kassem Zein Yvon Kermarrec ENST Bretagne.
1 Chapter 12 Configuration management This chapter is extracted from Sommerville’s slides. Text book chapter 29 1.
Requirements Engineering Requirements Validation and Management Lecture-24.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 4 Slide 1 Software Processes.
Chapter 18 Object Database Management Systems. Outline Motivation for object database management Object-oriented principles Architectures for object database.
Introduction to Core Database Concepts Getting started with Databases and Structure Query Language (SQL)
The Database Project a starting work by Arnauld Albert, Cristiano Bozza.
Understanding Core Database Concepts Lesson 1. Objectives.
Geographic Information Systems
Chapter 2 Database Environment.
Chapter 6 System and Application Software
File Systems and Databases
Transportation Research Thesaurus:
Metadata in Digital Preservation: Setting the Scene
Metadata The metadata contains
Chapter 6 System and Application Software
Chapter 6 System and Application Software
Understanding Core Database Concepts
Chapter 6 System and Application Software
Presentation transcript:

Interoperability meeting S. Derriere, Cambridge, 2003 May UCD - lessons learned What was learned from trying to assign UCDs to: - large catalogues/databases - specific domain in astronomy - data models - FITS headers

Interoperability meeting S. Derriere, Cambridge, 2003 May ● « Structure » of UCD ● Assignation ● Application

Interoperability meeting S. Derriere, Cambridge, 2003 May The « structure » of UCDs The presentation of UCDs is misleading: The tree structure is not mandatory! It is only: - a convenient way of grouping similar elements from a given point of view - a specification of the context (make implicit information explicit)

Interoperability meeting S. Derriere, Cambridge, 2003 May The « structure » of UCDs UCDs are (standard, unique) names for concepts e.g. we find a new(!) concept : « temperature » we name it temperature We forgot to mention that it was the « effective temperature of a star », because it sounded obvious in our context. What if we find a new concept : « temperature of an instrument » ???

Interoperability meeting S. Derriere, Cambridge, 2003 May The « structure » of UCDs 1. We call the 2 concepts temperature 2. We add a little something to dinstinguish the 2 kind of temperatures, defining more elaborated words: - effective-temperature-of-a-star - temperature-of-an-instrument In the process of elaborating UCDs, it was just convenient to group concepts relative to physical quantities, or instrument, together.. that's just how UCDs were defined... PHYS_TEMP_EFFEC and INST_TEMP_SYST The structure could be different.

Interoperability meeting S. Derriere, Cambridge, 2003 May The « structure » of UCDs ● The UCDs are NOT a universal data model, they are not an ontology, they do not impose a structured view of the universe ● But UCDs can be used to name attributes of data models ● The data model is structured, hierarchical -- not UCDs ● The data model carries the structure and describes links between its components -- UCDs are used to name the components

Interoperability meeting S. Derriere, Cambridge, 2003 May Assignation of UCDs Given a dataset: how to describe it, how to assign UCDs ? (translate my own description into something more standard that can be understood by others...) This requires : ● A list of existing UCDs, with their definitions ● A set of decision rules for assignation ● Re-use the knowledge of already assigned data ● Build new terms following standard syntax AT LEAST IMPROVES EFFICIENCY

Interoperability meeting S. Derriere, Cambridge, 2003 May Assignation of UCDs The original descriptions of elements can consist of: ● a name ● a description ● a unit RAdeg deg alpha, degrees (ICRS, Epoch=J ) DEdeg deg delta, degrees (ICRS, Epoch=J ) Plx mas Trigonometric parallax pmRA mas/yr Proper motion mu_alpha.cos(delta) pmDE mas/yr Proper motion mu_delta, ICRS e_RAdegmas Standard error in RA*cos(DEdeg) e_DEdegmas Standard error in DE e_Plx mas Standard error in Plx e_pmRA mas/yr Standard error in pmRA e_pmDE mas/yr Standard error in pmDE

Interoperability meeting S. Derriere, Cambridge, 2003 May Assignation of UCDs

Interoperability meeting S. Derriere, Cambridge, 2003 May Application 1: SDSS (A. Szalay) 1300 columns for the complete SDSS database. Input file for assignation built from SQL DB schema. Results: - Need for manual verification in all cases (not automatic 1-to-1 assignation). - Relatively few new concepts (not described by existing UCDs): STAT_STDEV, _VARIANCE, _COVARIANCE FIT_PARAM_COVARIANCE ID_VERSION CODE_HTM INST_SKY_SIGMA PHOT_TRANS_PARAM POS_EQ_CART_X, _Y, _Z POS_SDSS_MU, _NU, _LAMBDA, _ETA METADATA_ID, _DESCRIPTION, _VERSION, _TABLE, _COLUMN, _UNIT, _NAME, _COMMENT

Interoperability meeting S. Derriere, Cambridge, 2003 May Application 2: Radio Data (A. Richards) MERLIN database. Application to a specific domain (Radio). Results: Concepts specific to the radio/interferometry domains : CHANNELWIDTH, VISIBILITY, BASELINE, deconvolution, beam,... Same words with other definitions: « extension », FoV, position (source / field)

Interoperability meeting S. Derriere, Cambridge, 2003 May Application 3: Data model (M. Louys) The IDHA data model: - ~120 model attributes with their definitions Results: Except for INST quantities, direct assignation is rare. But very often, a proper UCD exists. Descriptions of data model attributes and UCDs have to be checked/improved Missing UCDs are related to: ● image format ● pixel coding ● software description ● data reduction process

Interoperability meeting S. Derriere, Cambridge, 2003 May Note: SIAP and VOX elements The assignation program found some relevant already existing UCDs for some VOX elements. (without exploring the whole UCD list !) To be continued...

Interoperability meeting S. Derriere, Cambridge, 2003 May Application 4: FITS keywords (A. Preite Martinez) FITS headers from different surveys: - list all keywords. Results: In most cases, some relevant UCDs are suggested. If not: - the FITS keyword definition is not accurate - the FITS keyword definition is cryptic (abbreviations, even human assignation of UCD is very difficult) - it is a very specific parameter (a given instrument configuration) - it is related to software domain

Interoperability meeting S. Derriere, Cambridge, 2003 May Conclusions (1) Automatic assignation of UCDs is not easy, because... The UCD list is not complete: - there are missing UCDs in specific domains: missing terms must be defined by small representative groups (not only one project to keep it general -- distinguish specific parameters from « core » ones) - UCDs are missing to describe software-related parameters, and pipeline processing

Interoperability meeting S. Derriere, Cambridge, 2003 May Conclusions (2) The UCD list is not complete: - No UCD for very specific parameters Which is the level of granularity for « core » UCDs ? How specific should the UCD description itself be (use of GROUP/parameters and atoms)? Transforming language into UCD for assignation is not easy ! - We must be flexible on the input (allow to describe things in natural language) - But we must have enough information to guess what we're talking about (column name + unit + description)

Interoperability meeting S. Derriere, Cambridge, 2003 May Conclusions (3) ● Understand how to provide « efficient » descriptions on the assignation side, and on the data provider side. ● Provide examples How do we do this in an evolving world ? - define a « core » UCD list - share the UCD list and definitions / update the list (curator) ? - keep track of version of the UCD list, of the assignation tool version, of deprecated UCDs, etc... - distribute the list of existing implementations (parameter descriptions and already assigned UCDs) ?