Published by Annabelle Black. Modified over 9 years ago.
1
Resource Discovery (metadata and searching) Working Group Report
2
Issues discussed What kinds of resources should EMELD provide search services for? What should the design be for an EMELD search interface? How can EMELD get good metadata into its search database? What level of metadata should be exposed?
3
What resources? Anything that might be of value to a linguist working on endangered languages. –Language data –Tools –Advice (including reviews) –People –"Gateway" websites
4
What resources? But there's no reason to rely on this working group for the "what". A questionnaire could be distributed via Linguist.
5
What resources? Two kinds of best practice resources Resources with best practice metadata –These resources can be discovered –Non-digital resources encouraged –Digital resources discouraged, but allowed
6
What resources? Best practice digital resources All digital resources encouraged to be of this type Benefits –Enhanced search features (due to document interoperability) –Special "BP globe of approval" √
7
What resources? Side Note –Best Practice "approval" system should be tied into a larger system through which digital resources could be listed as "publications" –A topic for another working group? (Perhaps OLAC?)
8
What resources? Issues which need to be addressed Metadata for resources interesting to linguists but which are not linguistic data Needed: Best practice metadata standards for –Tools –Advice –People –... Test: EMELD could see how it would classify everything in BPU.
9
How to search? Assumption: Metadata and data are distributed Query Language –Metadata: OLAC standard –Data from interoperable documents: A new standard
10
How to search? Resource Query Language Ideal –A generalized query protocol used across the linguistics community –A series of "methods" (to be defined) that can be called on these resources to retrieve structured linguistic data matching query parameters
11
How to search? Problems implementing ideal –No clear sense as to what "methods" are needed. –One solution: Examine results from questionnaire
12
How to search? Problems implementing ideal –Very few repositories allow their data to be accessed in a generalized way –First step: Encourage documentation of repository data access systems and develop a metadata standard for this
13
How to search? Long term implementation issues –An OLAC Query Language Protocol A well-defined linguistic query language A system for "packaging" queries –Linguistic data search registry Linguistic sites register they are data access sites They also register implemented search methods –EMELD will archive best-practice documents for data access for data creators not capable of implementing the query protocol
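The registry idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `SearchRegistry`, the method names such as `find_by_language`, and the example.org URLs are hypothetical and do not come from the working group.

```python
from dataclasses import dataclass, field


@dataclass
class SearchRegistry:
    """Toy registry: sites announce themselves and the search methods they implement."""

    # Maps a site's URL to the set of search method names it says it supports.
    sites: dict = field(default_factory=dict)

    def register(self, site_url, methods):
        """Record a data access site and the (hypothetical) methods it implements."""
        self.sites.setdefault(site_url, set()).update(methods)

    def sites_supporting(self, method):
        """Return the registered sites that implement the given method, sorted by URL."""
        return sorted(url for url, names in self.sites.items() if method in names)


registry = SearchRegistry()
registry.register("https://archive-a.example.org", ["find_by_language", "find_by_gloss"])
registry.register("https://archive-b.example.org", ["find_by_language"])
print(registry.sites_supporting("find_by_gloss"))
```

A search service could consult such a registry to decide which sites to forward a "packaged" query to, and fall back to archived best-practice documents for creators who cannot implement the protocol themselves.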
14
How to search? Pilot project –Take some small subset of resources Data inputted via FIELD; Nijmegen? SIL? AIATSIS? AILLA? –Take FIELD search out of FIELD –Search over that small set of resources –Ideally, keep both resources in separate databases to begin to develop query interchange protocol
15
How to search? Another project: Grammatical thesaurus –Develop a grammatical thesaurus that gives common synonyms for a given grammatical term (e.g. oral stop, plosive) –This could then be used to allow a user's search to be expanded to include synonyms for a given term. –In all likelihood, there are other applications of this.
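Thesaurus-based query expansion could look roughly like the sketch below. The data structure (`SYNONYM_SETS`), the function names, and the record layout are assumptions made for illustration; only the "oral stop" / "plosive" pair comes from the slide.

```python
# Hypothetical thesaurus: each set groups terms that are treated as synonyms.
SYNONYM_SETS = [
    {"oral stop", "plosive"},  # the example pair given on the slide
]


def expand_term(term, synonym_sets=SYNONYM_SETS):
    """Return the normalized term plus any listed synonyms, sorted for stable output."""
    key = term.strip().lower()
    expanded = {key}
    for group in synonym_sets:
        if key in group:
            expanded |= group
    return sorted(expanded)


def search(records, term):
    """Return records whose description mentions the term or one of its synonyms."""
    variants = expand_term(term)
    return [
        record
        for record in records
        if any(variant in record["description"].lower() for variant in variants)
    ]


records = [
    {"id": 1, "description": "Notes on plosive inventories"},
    {"id": 2, "description": "Oral stop contrasts in word-initial position"},
    {"id": 3, "description": "Vowel harmony"},
]
print([record["id"] for record in search(records, "plosive")])
```

A user who searches for "plosive" would then also find resources described with "oral stop", without having to know both terms.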
16
How to search? Search interface –EMELD should implement a VISER-like service for access to its database –There are two distinct kinds of searches Resource location Resource data search
17
How to search? Search interface –The details of the search interface implemented by EMELD are hard to conceive of until more resources can be accessed through it –A questionnaire can help with this area too. EMELD could ask people to try the search and evaluate it Starting with the people in this room
18
Getting the data Sticks –EMELD Ambassadors –Assisted by Linguist Spider
19
Getting the data Carrots –Support harvesting metadata in document headers for submitted URLs. –Resources with best practice metadata can be cited using a standard EMELD URI –These resources could be posted and advertised on Linguist (but consult Baden first)
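Harvesting metadata from document headers might work along these lines. The slides do not say which header format would be used; this sketch assumes HTML pages that embed Dublin Core style `<meta name="DC.…">` elements in `<head>`, and it works on page text that has already been fetched. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeaderMetadataParser(HTMLParser):
    """Collect <meta name="DC.*" content="..."> elements found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            # Assumption: only names with a "DC." prefix count as metadata here.
            if name and content and name.lower().startswith("dc."):
                self.metadata.setdefault(name.lower(), []).append(content)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def harvest(html_text):
    """Return a dict mapping lowercased DC element names to lists of values."""
    parser = HeaderMetadataParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


page = (
    '<html><head><meta name="DC.title" content="Sample lexicon">'
    '<meta name="DC.language" content="x-sample"></head>'
    "<body><p>Entries</p></body></html>"
)
print(harvest(page))
```

Values are kept as lists because a header may repeat an element, for example several subject terms.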
20
Getting the data Juiciest Carrots (Best Practice resources only) –"Preferred" EMELD URIs –Marked as such in a search –Could undergo "advanced" search techniques –Be peer-reviewed and vetted by LDRA (Linguistic Digital Resource Association)* *This organization does not exist, as far as I know.
21
Granularity Right now there are no recommendations for the granularity of exposed metadata records –Large archives, for example, have hierarchical structure, one level of which must be isolated (the IMDI session, for example) –Cutting-edge archives don't work well with the resource=object model. Their resources are "created" based on the user's needs
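To make the granularity question concrete, the sketch below isolates one level of a hierarchical archive and exposes one record per node at that level. The nested-dictionary layout and the level names ("corpus", "session", "file") are assumptions for illustration, not IMDI's or OLAC's actual data model.

```python
def records_at_level(node, level, path=()):
    """Collect every node at the chosen level, together with its ancestor path."""
    here = path + (node["name"],)
    if node["level"] == level:
        # Stop descending: this node is the unit exposed as a metadata record.
        return [{"name": node["name"], "path": "/".join(here)}]
    found = []
    for child in node.get("children", []):
        found.extend(records_at_level(child, level, here))
    return found


archive = {
    "name": "archive",
    "level": "corpus",
    "children": [
        {
            "name": "s1",
            "level": "session",
            "children": [{"name": "a.wav", "level": "file"}],
        },
        {"name": "s2", "level": "session"},
    ],
}
print(records_at_level(archive, "session"))
print(records_at_level(archive, "file"))
```

Choosing "session" yields two records while choosing "file" yields one, which shows how the exposed level changes what a search can find.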
22
Granularity The lack of recommendations on this issue inhibits metadata creation Granularity makes a big difference as to what content is searchable Two different audiences in need of advice –"Real" archives (a.k.a. trusted repositories) –Individuals
23
Granularity Recommendation: EMELD should encourage IMDI and OLAC to devise best-practice recommendations for granularity
24
The questionnaire Two broad kinds of questions: –What kinds of things would you like? –What kinds of things would you hate? (Dafydd's Corollary)
25
The questionnaire Part One: Search capabilities –How do you want to conduct your search (Google-style, directory-style, pull-down menus...)? –What kinds of searches are you doing already on other sites? –Search within results? (We wanted this.) –Thesaurus-based search
26
The questionnaire Part Two: Search content –Free entry (like Google) –Feature-based entry –Statistical questions –Phonetic characters –Geographical search –Time search –...
27
The questionnaire Part Three: Results –Google-like results –Journal abstract search-like results –Restricted results (only return web sites, .pdf documents,...) –...
28
The questionnaire Format –Online submission –Combination multiple choice (for the uncreative) and free form (for the creative) –Encourage people to envision the search of the year 2503