Measurement Data Archive GEC10 March 2011 Larry Lannom Corporation for National Research Initiatives


1 Measurement Data Archive
GEC10, March 2011
Larry Lannom
Corporation for National Research Initiatives
http://www.cnri.reston.va.us/

2 Why Archive Experimental Results?
The obvious: for use by others or by yourself in the future
The Fourth Paradigm
  Data-intensive science
  Emergent phenomena
Funding bodies increasingly asking for data plans
Citations from journal articles to data sets on the rise
Consistent archiving standards enhance the use of data over time and within a domain

3 What is Metadata and Why Do I Need It?
Lots of miscommunication because
  Metadata is not a type of data
  Metadata is a type of relationship between two pieces of data
Needed for Understanding and Finding
  Understanding (sometimes called Descriptive MD)
    How do I parse this?
    How do I interpret this?
  Finding (sometimes called Subject MD)
    Finding one item in a population of 10 is easy
    Finding one item in a population of 1M is impossible w/o some way to distinguish them
Generally requires a human in the loop at some level
  Sometimes the object is self-describing (journal article)
  Automatic indexing/classification works for some domains
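The point that metadata is a relationship rather than a type of data can be sketched in a few lines of Python. This is only an illustration: the class names (`DataObject`, `MetadataLink`), the helper `metadata_for`, the identifiers, and the sample contents are all hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    """Any piece of data; nothing marks it as 'metadata' on its own."""
    identifier: str
    content: str


@dataclass(frozen=True)
class MetadataLink:
    """The relationship that makes one object metadata for another."""
    describes: DataObject    # the object being described
    description: DataObject  # the object acting as metadata
    role: str                # e.g. "descriptive" or "subject"


def metadata_for(obj, links):
    """Return every object that acts as metadata for obj."""
    return [link.description for link in links if link.describes == obj]


# Hypothetical example objects
measurement = DataObject("obj-1", "0.12,0.15,0.11")
format_note = DataObject("obj-2", "CSV of round-trip times in seconds")

links = [MetadataLink(describes=measurement, description=format_note,
                      role="descriptive")]

print(metadata_for(measurement, links))  # the format note describes the measurement
print(metadata_for(format_note, links))  # nothing describes the format note
```

In this sketch `format_note` is ordinary data; it only counts as metadata because a link says it describes `measurement`.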

4 Why is Metadata Hard?
To be effective it must be consistent, and consistently applied, within a given domain
  What is the scope of the domain?
  What aspects of the object need to be described?
  What is the vocabulary, and is it open or closed?
Even within a defined domain, there are many points of view
  Especially true for any sort of subject description
  May have to allow for multiple metadata objects for a single described object
Spending time on creating good metadata is Good For You
  The best sources for good metadata are the creators/owners of the described object, but they may lack interest and training
  Some types of metadata are difficult to automate, e.g., a good title
Keep it simple – favor consistency and coverage over depth

5 Misc Points
Precision and Recall are useful concepts in searching
  Precision: what % of my search results are on target?
  Recall: what % of the correct result set did my search retrieve?
  Desirable tradeoff is situational
Consider University Libraries as reliable archive holders
Variety of approaches to managing a useful vocabulary of terms
  Controlled vocabulary: set of terms – use these instead of slight variations
  Taxonomy: parent-child relationships
  Ontologies: introduce other types of relationships
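The precision and recall definitions above can be worked through with a small Python sketch. The function name `precision_recall` and the data set identifiers are made up for illustration; they are not part of the presentation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as fractions between 0 and 1."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # results that are both returned and correct
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 results returned, 5 items actually on target
retrieved = ["ds1", "ds2", "ds3", "ds4"]
relevant = ["ds2", "ds4", "ds7", "ds8", "ds9"]

p, r = precision_recall(retrieved, relevant)
print(f"precision = {p:.0%}, recall = {r:.0%}")  # precision = 50%, recall = 40%
```

Here 2 of the 4 returned items are on target (precision 50%), and those 2 are out of 5 correct items in total (recall 40%). Broadening the search would typically raise recall at the cost of precision, which is why the desirable tradeoff depends on the situation.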

