Using the NASA Thesaurus to Support the Indexing of Streaming Media
Gail Hodge, Janet Ormes & Patrick Healey
NASA Goddard Space Flight Center Library
Historic Context
  The Library has collected and circulated the Center’s colloquia on audio or video since 1967
  A catalog of these holdings has been posted on the Library’s web site since 2001
  Patrons were required to come to the Library, which limited the accessibility of recorded colloquia
  The Streaming Media Center Project began in 2001 as part of the Library’s response to Knowledge Management initiatives
Introducing the GSFC Media Center
Streaming Media
  Streaming media
    – Video that is encoded for delivery across the internet/intranet
  Encoding
    – Computer processing of video to a format for web casting
  Web casting
    – The act of delivering audio and video content across the internet/intranet
    – Can be delivered live or on-demand
The Goddard Library Streaming Media Center
  The Streaming Media Center is now available from the Library website
  Can be included in personalized portals
  The Library has collected >350 hours of video
    – >100 hours indexed
  Currently broadcasting 2 hours daily for the Earth Observing Systems Knowledge Management Pilot
Access Issues
  Current Needs
    – Need to know the overall topic of the video
    – More likely to remember the topic, presenter, date or series
  Permanent Access
    – Less likely that users will remember the video’s metadata
    – More likely that users will want specific information
    – Terminology may change over time
Indexing Video Content
  Video indexing is similar to a back-of-the-book index for specific information
  Entering a keyword leads you to the specific location in the video where the subject is discussed
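The back-of-the-book analogy can be illustrated with a small inverted index that maps each spoken word to the times at which it occurs. This is only a minimal sketch under stated assumptions: the transcript segments, the `build_index` function, and the whitespace tokenization are hypothetical and do not describe the selected software.

```python
from collections import defaultdict


def build_index(transcript):
    """Map each lowercase word to the sorted start times (seconds) where it is spoken."""
    index = defaultdict(list)
    for start_seconds, text in transcript:
        # Use a set so a word repeated within one segment is recorded once.
        for word in set(text.lower().split()):
            index[word].append(start_seconds)
    return {word: sorted(times) for word, times in index.items()}


# Hypothetical transcript segments: (start time in seconds, recognized speech).
transcript = [
    (0, "welcome to the colloquium"),
    (95, "the rings of saturn"),
    (410, "titan is the largest moon of saturn"),
]

index = build_index(transcript)
print(index["saturn"])  # [95, 410]
```

Looking up a keyword returns playback offsets, so a player could jump straight to the matching part of the recording instead of starting at the beginning.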
Features of Selected Software
  Compares recognized speech with stored default terminology
  Uses speaker inflection to identify meaningful intervals
  Indexing and Search components included
Incorporation of NASA Thesaurus
  Added specific scientific terminology
  Used terms and their NTs (narrower terms)
  Used text of the Astrophysics Data System to provide terms in grammatical structures
  Provides query expansion and improves relevancy
Query Expansion
  A search for "Saturn moons" finds not only "Saturn" and "moons" but also Titan, Iapetus, etc. (by name), whether or not the specific words "Saturn" and/or "moons" are mentioned
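Query expansion through narrower terms can be sketched as a dictionary lookup. The code below is an illustration only: the `NARROWER_TERMS` fragment and the `expand_query` function are hypothetical and do not reproduce the actual NASA Thesaurus entries or the search component's behavior.

```python
# Hypothetical thesaurus fragment: term -> narrower terms (NTs).
NARROWER_TERMS = {
    "saturn moons": ["titan", "iapetus", "enceladus"],
}


def expand_query(query, narrower_terms):
    """Return the query words followed by any narrower terms found for them."""
    phrase = query.lower().strip()
    words = phrase.split()
    expanded = list(words)
    # Look up the whole phrase first, then each individual word.
    for candidate in [phrase] + words:
        for narrower in narrower_terms.get(candidate, []):
            if narrower not in expanded:
                expanded.append(narrower)
    return expanded


print(expand_query("Saturn moons", NARROWER_TERMS))
# ['saturn', 'moons', 'titan', 'iapetus', 'enceladus']
```

With the expanded list, a segment that mentions only "Titan" can still match a search for "Saturn moons".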
Relevance Interval Creation
  Relevance Interval Creation links related concepts within media files, and these links determine the Relevance Intervals
  External knowledge from the thesaurus improves the accuracy of the Creation process because the explicit knowledge in the text alone is incomplete
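One simple way to picture a relevance interval is to merge nearby term hits into a single playable span. The sketch below uses plain gap-based merging as a stand-in; it is not the algorithm used by the selected software, and the `relevance_intervals` function and its `max_gap` and `padding` parameters are assumptions made for illustration.

```python
def relevance_intervals(hit_times, max_gap=30, padding=5):
    """Merge hit times (seconds) into (start, end) intervals.

    Each hit is padded on both sides; padded hits separated by at most
    max_gap seconds are treated as one continuous discussion.
    """
    intervals = []
    for t in sorted(hit_times):
        start, end = max(0, t - padding), t + padding
        if intervals and start - intervals[-1][1] <= max_gap:
            # Close enough to the previous interval: extend it.
            intervals[-1][1] = max(intervals[-1][1], end)
        else:
            intervals.append([start, end])
    return [tuple(interval) for interval in intervals]


# Hypothetical hit times for a query and its related thesaurus terms.
print(relevance_intervals([95, 110, 410]))  # [(90, 115), (405, 415)]
```

Hits contributed by related thesaurus terms would enter the same list of times, which is how external knowledge could widen or join intervals that the literal query words alone would miss.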
Benefits
  Identify relevant pieces of content within a longer video
  Stream more relevant, specific information intervals to users
  Minimize manual processing
  Ultimately improve reuse of information and increase opportunities for knowledge sharing