Published by August Rice. Modified over 9 years ago.
1
Sharing and Browsing Linguistic Data EMELD Arizona: Terry Langendoen Scott Farrar
2
Since Santa Barbara Focus on morpho-syntax Decided to build ontology (to be discussed later in this talk) Decided to build supporting tools –smart search engine (Hedwig) –editor Some work on XML markup
3
The Problem Currently there is no general way for researchers in the endangered languages community to electronically share information. The Web is the most likely tool that could provide a solution. The current WWW is not adequate. An Example from the WWW:
8
Further Complications What about other data formats? –lexicons –grammatical descriptions –(comparative) word lists –paradigms –etc.
9
Warumungu Description 'Grammatical case suffixes' are those which express grammatical relations (subject, object, indirect object), like /karriny-ji/ in (4). A noun without a case suffix is interpreted as having Absolutive case - /nanttu/ in (4) and /wangarri/ in (5) - or as being the main predicator, or as agreeing with some argument with Absolutive case - /kumppu/ and /pulyurrulyurru/ in (5). (from J. Simpson 1998)
10
(4) Karriny-ji +ajjul nyirri-njina nanttu, ngapa-kajji.
people-ERG +3pl.S put-PAST.CONT humpy, water-LEST
'The people were erecting humpies for fear of the rain.' [JS:PND:RS]
(5) Nyirri-nyi +ama wangarri kumppu pulyurrulyurru.
place-PAST.PUN +he rock.ABS big.ABS red.ABS
'He placed a big red hill.' [JS:PND:RS]
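Interlinear examples like (4) are hard to search as flat text. A minimal sketch of one way to turn a source line and its gloss line into aligned morpheme records follows. The `Morpheme` class and `align` function are hypothetical names introduced here for illustration and are not part of the EMELD tools; the sketch assumes words are separated by spaces and morphemes by hyphens.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Morpheme:
    form: str   # the morpheme as written in the source line
    gloss: str  # the label aligned with it in the gloss line


def align(source: str, gloss: str) -> List[List[Morpheme]]:
    """Pair each morpheme of each word with its gloss label."""
    def tokens(line: str) -> List[str]:
        # Split on whitespace and drop punctuation at the word edges.
        return [t.strip(",.") for t in line.split()]

    words, labels = tokens(source), tokens(gloss)
    if len(words) != len(labels):
        raise ValueError("source and gloss lines have different word counts")

    aligned = []
    for word, label in zip(words, labels):
        forms, glosses = word.split("-"), label.split("-")
        if len(forms) != len(glosses):
            raise ValueError(f"morpheme mismatch in {word!r} / {label!r}")
        aligned.append([Morpheme(f, g) for f, g in zip(forms, glosses)])
    return aligned


# Example (4) from the Warumungu slide.
rows = align(
    "Karriny-ji +ajjul nyirri-njina nanttu, ngapa-kajji.",
    "people-ERG +3pl.S put-PAST.CONT humpy, water-LEST",
)
for word in rows:
    print([(m.form, m.gloss) for m in word])
```

Once the data is in this shape, a question such as "which forms are glossed ERG?" becomes a simple filter over records instead of a text search.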
11
Chichewa Description Other elements that appear as verbal prefixes include modals – for instance, -ngo- 'just, merely' – as well as directional elements -ka- 'go' and -dza- 'come'. These are placed in the immediate pre-OM position, after the tense. This is shown by the following: (from Mchombo 1998)
12
(8a) Mkângo s-ú-ná-ngo-wá-phwány-a maûngu...
3-lion NEG-3SM-past-just-6OM-smash-fv 6-pumpkins...
'The lion did not just smash them, the pumpkins...'
(8b) Mkângo u-ku-ká-phwány-á máûngu.
3-lion 3SM-pres.-go-smash-fv 6-pumpkins
'The lion is going to smash some pumpkins.'
13
A Solution Take advantage of new Web technology Build a community of practice on the Semantic Web What is the Semantic Web?
14
The Semantic Web New markup New tools: smart search engines, ontologies, new editors Meaning is encoded explicitly. Pages are interpreted by a reasoner.
15
An Example from the Semantic Web New markup adds functionality to existing documents. Example: Tennessee Navajo
16
Aardvark nocturnal burrowing mammal of the grasslands of Africa that feeds on termites; sole extant representative of the order Tubulidentata WordNet for 'aardvark' Nouns: 1. nocturnal burrowing mammal of the grasslands of Africa that feeds on termites; sole extant representative of the order Tubulidentata Synonyms: aardvark,ant_bear,anteater,Orycteropus_afer Verbs: Adjectives: Adverbs:
17
<rdf:RDF … nocturnal burrowing mammal of the grasslands of Africa that feeds on termites; sole extant representative of the order Tubulidentata WordNet for 'aardvark' Nouns: 1. nocturnal burrowing mammal of the grasslands of Africa that feeds on termites; sole extant representative of the order Tubulidentata Synonyms: aardvark,ant_bear,anteater,Orycteropus_afer Verbs: Adjectives: Adverbs:
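The RDF version of the aardvark page is elided on the slide, so the authors' actual markup is not recoverable here. As a rough illustration of the idea only, the sketch below wraps the same WordNet fields in RDF/XML using Python's standard library. The `http://example.org/wordnet#` namespace and the `gloss` and `synonym` property names are assumptions made for this example, not the vocabulary used in the talk.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # standard RDF namespace
EX_NS = "http://example.org/wordnet#"  # hypothetical namespace for this sketch


def entry_to_rdf(word, gloss, synonyms):
    """Serialize one dictionary entry as a small RDF/XML document."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # One rdf:Description per entry, identified by a URI built from the word.
    desc = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": EX_NS + word},
    )
    ET.SubElement(desc, f"{{{EX_NS}}}gloss").text = gloss
    for synonym in synonyms:
        ET.SubElement(desc, f"{{{EX_NS}}}synonym").text = synonym
    return ET.tostring(root, encoding="unicode")


print(entry_to_rdf(
    "aardvark",
    "nocturnal burrowing mammal of the grasslands of Africa that feeds on termites",
    ["ant_bear", "anteater", "Orycteropus_afer"],
))
```

The visible text is unchanged, but each field now carries an explicit label that a program can query.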
18
The Ontology Crucial component of the Semantic Web A resource that explicitly defines what entities can exist in a domain, i.e., the endangered languages community A resource that defines what relations hold between entities demo
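To make the two halves of that definition concrete, here is a toy sketch of an ontology as a data structure: a set of entities arranged by subclass, plus relations constrained by a domain and a range. The `Ontology` class and every concept name in it (`GrammaticalCase`, `CaseSuffix`, `expresses`, and so on) are invented for illustration and should not be read as the contents of the EMELD ontology.

```python
class Ontology:
    """A toy ontology: a concept hierarchy plus typed relations."""

    def __init__(self):
        self.parents = {}    # concept -> parent concept (None for a root)
        self.relations = {}  # relation name -> (domain concept, range concept)

    def add_concept(self, name, parent=None):
        if parent is not None and parent not in self.parents:
            raise KeyError(f"unknown parent concept: {parent}")
        self.parents[name] = parent

    def add_relation(self, name, domain, range_):
        self.relations[name] = (domain, range_)

    def is_a(self, concept, ancestor):
        # Walk up the hierarchy until we hit the ancestor or run out of parents.
        while concept is not None:
            if concept == ancestor:
                return True
            concept = self.parents[concept]
        return False

    def allows(self, relation, subject, obj):
        # A relation holds only between entities of the declared kinds.
        domain, range_ = self.relations[relation]
        return self.is_a(subject, domain) and self.is_a(obj, range_)


onto = Ontology()
onto.add_concept("GrammaticalCase")
onto.add_concept("ErgativeCase", parent="GrammaticalCase")
onto.add_concept("AbsolutiveCase", parent="GrammaticalCase")
onto.add_concept("Morpheme")
onto.add_concept("CaseSuffix", parent="Morpheme")
onto.add_relation("expresses", domain="Morpheme", range_="GrammaticalCase")

print(onto.is_a("ErgativeCase", "GrammaticalCase"))
print(onto.allows("expresses", "CaseSuffix", "ErgativeCase"))
```

A reasoner works over declarations like these: it can conclude that an ergative suffix is a kind of morpheme expressing a grammatical case without anyone stating that fact directly.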
19
OWL Web Ontology Language Plays a role analogous to that of HTML on the WWW The most current “standard” Semantic Web language Under development at the W3C: www.w3c.org
20
Facilitating Tools Search tools for the Semantic Web Editors for composing Semantic Web pages Reasoning engines An extensible data model
21
A Search Engine EMELD Arizona’s prototype (Hedwig) http://emeld.douglass.arizona.edu:8080/searchindex.html (temporarily out of service) demo on Sunday
22
An Editor EMELD Arizona’s prototype (name?) demo on Sunday
23
A Good Data Model for Creating a Community of Practice Language data should be searchable and comparable—broad access (centralized). Authors or communities want control over their data (local/distributed). Local control should be balanced with data interoperability (Semantic Web).
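One way to picture that balance is sketched below: each language's data keeps its own gloss labels, and a separate per-language table links those labels to shared concepts, so a single query can cross collections. The concept names, the `LOCAL_TAGS` and `RECORDS` tables, and the `search` function are all hypothetical. Mapping both PAST.CONT and PAST.PUN to one `PastTense` concept is a simplification for the example, not a linguistic claim.

```python
# Each community maintains its own label-to-concept table (local control).
LOCAL_TAGS = {
    "Warumungu": {
        "ERG": "ErgativeCase",
        "PAST.CONT": "PastTense",
        "PAST.PUN": "PastTense",
    },
    "Chichewa": {
        "past": "PastTense",
        "pres.": "PresentTense",
    },
}

# Word forms with the labels their own authors used, taken from the slides.
RECORDS = [
    ("Warumungu", "Karriny-ji", ["people", "ERG"]),
    ("Warumungu", "nyirri-njina", ["put", "PAST.CONT"]),
    ("Chichewa", "s-ú-ná-ngo-wá-phwány-a",
     ["NEG", "3SM", "past", "just", "6OM", "smash", "fv"]),
    ("Chichewa", "u-ku-ká-phwány-á", ["3SM", "pres.", "go", "smash", "fv"]),
]


def search(concept):
    """Return (language, form) pairs whose local labels map to the concept."""
    hits = []
    for language, form, tags in RECORDS:
        mapping = LOCAL_TAGS[language]
        # Labels without a mapping are simply ignored by the shared search.
        if any(mapping.get(tag) == concept for tag in tags):
            hits.append((language, form))
    return hits


print(search("PastTense"))
```

The records themselves are never rewritten. Only the mapping tables connect them to the shared vocabulary, which is what lets authors keep their own terminology while the data stays comparable.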
24
Centralized Model (diagram labels: Warumungu, Wari, Mocovi, Biao Min, Archi, Hopi, Community)
25
Local Control with Broad Access (diagram labels: Semantic Web ontology, Wari, Hopi, Archi, Community tools)
26
Community Requirements No need to standardize your terminology or abandon tradition. No need to learn the underlying markup (though it doesn’t hurt!). Use EMELD tools to put your data on the Semantic Web. Maintain your data.
27
Contact Info Terry Langendoen (langendt@u.arizona.edu), Scott Farrar (farrar@u.arizona.edu) See our website: http://emeld.douglass.arizona.edu:8080