Presentation transcript:

“Scientists seeking data should be able to efficiently and reliably locate LTER datasets through searching, browsing …”
• Get feedback on general direction of working group activities
• Resolve some specific issues
• Decide on “Next Steps”
• Products
  • Comments to be acted on
  • White paper concerning specific issues and “next steps”

Time | Activity
9:00 AM | Introductions, Review of Agenda
9:15 AM | Introduction to the LTER Controlled Vocabulary – Past and Future
10:00 AM | Break
10:15 AM | Discussion: Locating LTER Data – around-the-room experiences
  • What are your experiences with finding LTER data?
  • What would be most helpful in finding data in the future?
  • Review of “use cases”
11:15 AM | Tour of draft LTER Controlled Vocabulary
12 noon | Lunch
1:30 PM | Feedback to entire group on things in the controlled vocabulary that need improvement
  • Things to be removed
  • Things to be added
  • Things to be reorganized
2:30 PM | Break
2:45 PM | Discussion of specific issues
  • Core areas
  • Are related-terms needed, or is a hierarchy sufficient?
  • Management of the vocabulary – role of researchers
3:00 PM | Next Steps
  • How do we engage larger LTER community?
  • How much, and what sort of engagement is needed?
4:00 PM | Adjourn

• Eclectic use of terms for discovering LTER data makes it difficult to perform reliable or efficient searches
• Often several terms for one concept
  • One site uses CO2, another Carbon Dioxide, another Carbon-dioxide
  • Carbon to Nitrogen Ratio, C:N, C:N Ratio, Carbon-to-nitrogen Ratio
• No way to relate broader terms with narrower terms
  • Searching on “Landscape Change” doesn’t find data sets related to “desertification” even though desertification is a kind of landscape change
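A minimal sketch of the two problems on this slide (several spellings for one concept, and no broader/narrower links). The terms come from the examples above; the dictionary layout and the function names `preferred` and `is_kind_of` are hypothetical and not part of any LTER tool.

```python
# Hypothetical synonym ring: every variant spelling maps to one preferred term.
SYNONYMS = {
    "co2": "carbon dioxide",
    "carbon-dioxide": "carbon dioxide",
    "c:n": "carbon to nitrogen ratio",
    "c:n ratio": "carbon to nitrogen ratio",
    "carbon-to-nitrogen ratio": "carbon to nitrogen ratio",
}

# Hypothetical broader-term links (narrower term -> broader term).
BROADER = {"desertification": "landscape change"}


def preferred(term: str) -> str:
    """Return the preferred form of a keyword, ignoring case and outer spaces."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)


def is_kind_of(term: str, ancestor: str) -> bool:
    """True if `term` equals `ancestor` or sits below it in the hierarchy."""
    current = preferred(term)
    target = preferred(ancestor)
    while current is not None:
        if current == target:
            return True
        current = BROADER.get(current)  # None once the top is reached
    return False
```

With these two tables, "CO2" and "Carbon-dioxide" resolve to the same preferred term, and a dataset keyworded "desertification" can be recognized as a kind of "landscape change".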

Source | Number of Terms | Number used at 5 or more sites | Most frequently used
EML Keywords* | 2,711 | 86 | LTER (1002), Temperature (701)
EML Titles | 2,480 | 921 | And (768), Data (394), LTER (350)
DTOC Keywords* | 2,774 | 103 | ARC (1645), Temperature (732)
Bibliography Titles | 13,538 | 1,855 | Of (12,611), Forest (2,050)
* Allows multi-word terms
Only 3.2%!

• We started off by surveying what terms were already being used in a variety of LTER documents
• Our goal was to see if there were any existing lexical resources that we could simply adopt

58% of LTER terms were not found in the NBII Thesaurus
Results suggested that we needed to develop our own resource

• Identify a list of preferred terms that would be used by sites in creating metadata documents
• Focus on LTER-wide searches
  • Want to facilitate cross-site synthesis
  • People searching LTER Metacat rather than individual sites are interested in relevant data from multiple sites
• Want to hit the “sweet spot” for the number of terms
  • Too many terms make keywording documents difficult, and result in searches that return too few datasets
  • Too few terms make it hard to locate usably small numbers of datasets

• Assembled list of words already in LTER Metadata (EML documents)
• Selected using criteria:
  • Keywords shared with GCMD and NBII, or
  • Keywords used at more than one LTER site
• Reviewed by Information Managers
  • Removals and additions were suggested
  • Edited based on voting
• Created a draft set of taxonomies
  • Included some additions and deletions
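The selection criteria above can be sketched as a small filter. This is an illustration only: it assumes "shared with GCMD and NBII" means the keyword appears in both external vocabularies, and the function name `select_candidates` and the input layout are hypothetical.

```python
from collections import Counter


def select_candidates(site_keywords, gcmd_terms, nbii_terms):
    """Keep keywords found in both external lists or used at more than one site.

    site_keywords: dict mapping a site name to an iterable of its keywords.
    gcmd_terms, nbii_terms: iterables of terms from the external vocabularies.
    """
    site_counts = Counter()
    for keywords in site_keywords.values():
        # Count each keyword once per site, ignoring case.
        for kw in {k.lower() for k in keywords}:
            site_counts[kw] += 1

    # Assumption: "shared with GCMD and NBII" = present in both lists.
    external = {t.lower() for t in gcmd_terms} & {t.lower() for t in nbii_terms}

    return sorted(
        kw for kw, n_sites in site_counts.items()
        if kw in external or n_sites > 1
    )
```

The result is only a candidate list; in the process described on the slide it was then reviewed and edited by the Information Managers.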

• Goal: Improve Searching & Browsing
  • Reliability (of all the suitable target documents, what percentage did you find)
  • Efficiency (of the documents your search returned, what percentage were suitable)
• A list alone is not sufficient to support browsing and sophisticated searching of data – more structure is needed
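The two measures defined above correspond to what information retrieval usually calls recall (reliability) and precision (efficiency). A small sketch, with hypothetical function names and made-up dataset identifiers:

```python
def reliability(found: set, suitable: set) -> float:
    """Share of all suitable documents that the search found (recall)."""
    if not suitable:
        return 0.0
    return len(found & suitable) / len(suitable)


def efficiency(found: set, suitable: set) -> float:
    """Share of the returned documents that were suitable (precision)."""
    if not found:
        return 0.0
    return len(found & suitable) / len(found)


# Made-up example: four suitable datasets, the search returns three documents.
suitable = {"d1", "d2", "d3", "d4"}
found = {"d1", "d2", "d9"}
print(reliability(found, suitable))  # 2 of the 4 suitable datasets were found
print(efficiency(found, suitable))   # 2 of the 3 returned documents were suitable
```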

[Figure: List, Synonym Ring, Taxonomy, Thesaurus, Ontology, arranged in order of increasing complexity]
Multiple taxonomies are a polytaxonomy

• Relationships should be independent of context
  • Must pass “Some-not-all test”
• Each taxonomy should include only one type of entity (listed in Z39.19 section 6.3.2)
  • Things and their physical parts (birds, trees, leaves)
  • Materials (wood, nitrogen, sand)
  • Activities or processes (acidification, production)
  • Events or occurrences (germination, death)
  • Properties or states of persons, things, materials or actions (age, speed, nitrogen content)
  • Disciplines or subject fields (ecology, ornithology)
  • Units of measurement (m, km, miles)
  • Unique entities (LTER, HJ Andrews Forest)
• You can get into trouble if you start “mixing and matching” things within a single taxonomy!

Good: Forests, Boreal Forest, Hardwood Forest, Grassland, Tallgrass Prairie, Tundra
  OK – these are all the same type of entity – all are THINGS
Bad: Forests, Fire, Ecology
  Mixing THINGS and PROCESSES and DISCIPLINES
Good: Rodents: Mice, Rats
  OK – Is not dependent on context. Mice and rats are ALWAYS rodents
Bad: Desert Plants: Cacti, Grasses
  Problem: Context dependent, not all cacti or grasses are desert plants. Some occur in other systems. Fails “Some-not-all” test.

• The VOCAB Working Group has created a draft set of 10 taxonomies containing 713 terms
  • Includes additional “broader” terms needed for grouping
  • Includes synonyms (non-preferred terms)
  • Some terms originally in the list have been removed because they were perceived to be too ambiguous or context-sensitive to be useful for the purposes of searching or browsing
    • E.g., “Aboveground”
• Some “related” terms have also been identified

• In 2010 a request for information was forwarded to the LTER Executive Board:
  • “The Information Management Committee has studied how keywords are used at LTER sites, how LTER keywords relate to external lexicographical resources, and compiled a draft keyword [list]. We request guidance from the LTER Executive Board on how a controlled vocabulary might be implemented within the context of LTER to improve the reliability of data searches.”
• The EB generally endorsed the idea of an LTER Controlled Vocabulary, and agreed to help have scientists participate in vetting the list and deciding on next steps (THIS WORKSHOP)

• Permit use of a browse interface
• Make searches more sophisticated
  • See “Use case” for searching
  • Search includes synonyms plus narrower terms and/or related terms
• Develop tools to help in adding keywords to LTER metadata documents
  • Prototype versions of a couple are already available
  • See Keywording “Use Case”
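A search that "includes synonyms plus narrower terms" can be sketched as query expansion. The tables below are tiny made-up examples, the names `expand_query` and `search` are hypothetical, and the sketch assumes the query is already a preferred term.

```python
# Hypothetical vocabulary fragments, keyed by preferred term (lower case).
NARROWER = {"landscape change": ["desertification"]}
SYNONYMS_OF = {"carbon dioxide": ["co2", "carbon-dioxide"]}


def expand_query(term: str) -> set:
    """Return the term, all of its narrower terms, and the synonyms of each."""
    seen = set()
    stack = [term.strip().lower()]
    while stack:  # walk down the hierarchy, visiting each term once
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(NARROWER.get(current, []))

    expanded = set(seen)
    for t in seen:
        expanded.update(SYNONYMS_OF.get(t, []))
    return expanded


def search(term: str, datasets: dict) -> list:
    """Return ids of datasets with at least one keyword in the expanded query.

    datasets: dict mapping a dataset id to a list of its keywords.
    """
    wanted = expand_query(term)
    return sorted(
        ds_id for ds_id, keywords in datasets.items()
        if wanted & {k.lower() for k in keywords}
    )
```

Related terms could be added the same way as synonyms, using a third lookup table.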

• What are your experiences with finding LTER data?
• What would be most helpful in finding data in the future?
• Review of “Use Cases”

• Evaluate the utility of the draft polytaxonomy
  • Is it better than the existing LTER Metacat interfaces?
• Are there large changes that need to be made?
  • Elimination of specific taxonomies?
  • Creation of new taxonomies?
  • Addition of related terms to make a thesaurus?
• Are there small changes needed?
  • Removal or replacement of terms

• Improvement of existing documents
  • Review existing keywords and change to preferred forms
  • Note: even without doing this the synonym ring will help improve searching and browsing
• Use preferred terms for new documents
  • Ideally at least one term from each of the relevant taxonomies
• Note: addition of new terms to the list should require review of all existing documents to see if they should be added – so term additions should be rare
• Changes in taxonomies and term relationships do not require re-keywording of existing documents

• Todd & Margaret
• Focus on INTERFACE
  • Ways to present the data
  • Allow “query within result set”
  • Intersect query sets
  • Group options – by site, by time
  • Side-by-side comparisons
• Be able to find where different types of data intersect
  • Can be very difficult due to missing data etc.
  • Problem extends beyond query interface
• Interface needs to be a higher priority – sooner rather than later
  • Recommendation to IMC/NISAC/EB

• Rodger and Kristin
• Highest level of hierarchy
  • Found some things to change or add: “root production”, “belowground productivity”
  • Were generally happy with overall organization
  • Need system for adding new keywords – this is just a start
• Intrigued by theory and where we go from here
  • How does it matter what is in one place or another?
  • Want to make sure things are well-organized….
  • Data vs research question
  • Does not matter where it is when adding to keyword list
• Need to have “best practices” for adding keywords
  • How will that affect sites?
  • How many data sets have no preferred terms?

• At least one word from list
• At least one from at least 5 of the 10 taxonomies
• Signature datasets should be flagged with “signature dataset” tag
• Should include Core area(s)
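The first two keywording rules above could be checked automatically. A minimal sketch, assuming each taxonomy is available as a set of preferred terms; the function name `check_keywords` and the taxonomy names in the example are hypothetical.

```python
def check_keywords(keywords, taxonomies, min_taxonomies=5):
    """Report how well a dataset's keywords follow the proposed rules.

    keywords: iterable of the dataset's keywords.
    taxonomies: dict mapping a taxonomy name to a set of preferred terms.
    min_taxonomies: number of taxonomies that must be represented
        (5 of the 10 in the proposal on this slide).
    """
    kws = {k.strip().lower() for k in keywords}
    covered = sorted(
        name for name, terms in taxonomies.items()
        if kws & {t.lower() for t in terms}
    )
    return {
        "has_preferred_term": bool(covered),   # at least one word from list
        "taxonomies_covered": covered,         # which taxonomies are represented
        "meets_rule": len(covered) >= min_taxonomies,
    }


# Made-up example with three small taxonomies and a lowered threshold.
example = {
    "substances": {"carbon dioxide", "nitrogen"},
    "processes": {"primary production"},
    "ecosystems": {"forests"},
}
report = check_keywords(["Nitrogen", "Forests", "my plot"], example, min_taxonomies=2)
```

A report like this could also answer the earlier question of how many data sets have no preferred terms.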

• Core area - Problems with definitions
  • Some datasets are either none, or all core areas
    • Weather data
• Change entities to core areas?
  • People will want to look for this
  • Would not have hierarchy?
    • That would be OK – can have related terms
  • Could link to signature datasets
    • Need “signature dataset” keyword – used to weight
    • Or prioritize signature datasets for adding preferred terms
• Treat as unique:
  • Primary Production (core area)
• Data can be applied to MANY core areas - won’t map
  • e.g. Climate
• Try adding core area taxonomy and then add core areas and related terms?
• May not be needed or appropriate – we are asking the data catalog to do too much – need catalog of research topics

• Want to search for signature datasets at top level of the hierarchy
  • Needs to be one click away

• Julia and Don
• Would be interesting to tally the number of hits for each keyword for each site
  • Tally of number of datasets for each site
• GIS should be preferred term
  • Can mean Geographical Information Science
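The tallies suggested above are simple to compute once keywords are extracted from the metadata. A sketch with a hypothetical record layout of (site, dataset id, keywords); the function name `tally` is also an assumption.

```python
from collections import Counter


def tally(records):
    """Count keyword hits per site and datasets per site.

    records: iterable of (site, dataset_id, keywords) tuples.
    Returns (hits, datasets_per_site), where hits is keyed by (site, keyword).
    """
    hits = Counter()
    datasets_per_site = Counter()
    for site, _dataset_id, keywords in records:
        datasets_per_site[site] += 1
        # Count each keyword once per dataset, ignoring case.
        for kw in {k.lower() for k in keywords}:
            hits[(site, kw)] += 1
    return hits, datasets_per_site
```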

• Atmospheric processes cross-listed under hydrologic properties
• Evapotranspiration should be above transpiration and evaporation
• Snow not under precipitation
• Geographical Properties -> Spatial Properties
  • Move imagery under that with satellite and photos under that – deprecate Landsat
• Methods – field, spatial, lab, analytical subcategories
  • Also cores, dendrometers etc. tools could go under this
• Entities
  • For detailed ones, tried to find other homes
  • Diseases to disease and move under bio processes
• Levels of organization for communities, populations, species
  • Are these useful terms? How often used?
• Biomes instead of Ecosystems

• Core areas
  • Do we need a special taxonomy for core areas?
• Are related-terms needed, or is a polytaxonomy (hierarchy) sufficient?
• Management of the vocabulary – role of researchers?
• Preferred terms – are all really preferred?
  • E.g., Permanent forest plots

• How do we engage larger LTER community?
• How much, and what sort of engagement is needed?
• Requests we should make to the EB or IMC?
• Managing the controlled vocabulary
• What technology development is needed, and who should pursue it?

• Anyone can propose adding, editing, deleting or moving terms within the hierarchy, with justification.
• Proposals would be evaluated by the Controlled Vocabulary Working Group according to the following criteria:
  • The proposed terms should provide clear utility for searching and browsing, and not introduce ambiguity
  • The proposed terms should be suitable for inclusion (e.g., not locations or specific taxonomic identifiers)
  • Proposed terms should not be redundant with existing term(s) already in the vocabulary
  • Terms and their proposed places in taxonomies or thesauri should conform in form with NISO Z and successor documents (e.g., sections 6.5.1, 8.3)

• Best Practices for adding keywords
  • Preferred terms (and preferred preferred terms)
• Presentation to PIs
  • Statistics on numbers of hits
• Add workshop participants to VOCAB
• Put in supplement proposal for development of search interface
  • Write it up now – Shovel Ready!
  • Like MALS – need to have all sites sign up with letters of endorsement