SIL FieldWorks Language Explorer: The lexicon component Gary Simons SIL International Lexicon Tools and Lexicon Standards Nijmegen, 4–5 August 2010.

1 SIL FieldWorks Language Explorer: The lexicon component
Gary Simons, SIL International
Lexicon Tools and Lexicon Standards, Nijmegen, 4–5 August 2010

2 SIL FieldWorks
- FieldWorks is:
  - a suite of integrated software tools to help field workers manage language and cultural data, with support for complex scripts.
  - http://fieldworks.sil.org/
- The Language Explorer tool is designed to:
  - manage a lexical database
  - produce dictionaries
  - interlinearize texts
  - analyze morphology

3 Quick Tour
- A short quick-tour screen movie demonstrates the look and feel
- It is the first of 55 narrated screen movies available at:
  - http://downloads.sil.org/FieldWorks/Movies /brief demo menu.html

4 Integration among areas
- The Lexicon, Texts, and Grammar areas all operate over the same database.
- In the Lexicon area, users enter lexical entries directly.
- In the Texts area, as new morphemes are glossed in text, new lexical entries are created behind the scenes.
- In the Grammar area, users describe the categories and features used in lexical description, plus the inflectional templates that guide automatic parsing in Texts.

5 Conceptual-modeling approach
- Lexicon, texts, and grammar are all stored in a single, normalized relational database.
- We began by working with domain experts to build a conceptual model of the areas and how they integrate.
- That model was expressed in UML and transformed into an SQL relational database schema.
- See the full model with over 100 classes at: http://fieldworks.sil.org/ModelDoc/ModelDocumentation.chm
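To make the idea of a single normalized store concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table and column names (entry, sense, text_morpheme) and the placeholder data are hypothetical and far simpler than the actual FieldWorks model; the only point is that text data references lexical data by key instead of copying it, so a change made in the lexicon is visible from the text side.

```python
import sqlite3

# Hypothetical, heavily simplified schema; NOT the actual FieldWorks model.
SCHEMA = """
CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    lexeme_form TEXT NOT NULL
);
CREATE TABLE sense (
    id INTEGER PRIMARY KEY,
    entry_id INTEGER NOT NULL REFERENCES entry(id),
    gloss TEXT NOT NULL
);
CREATE TABLE text_morpheme (
    id INTEGER PRIMARY KEY,
    text_name TEXT NOT NULL,
    position INTEGER NOT NULL,
    sense_id INTEGER NOT NULL REFERENCES sense(id)
);
"""


def build_demo_db():
    """Create an in-memory database with one placeholder entry used in one text."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO entry (id, lexeme_form) VALUES (1, 'form-a')")
    conn.execute("INSERT INTO sense (id, entry_id, gloss) VALUES (1, 1, 'gloss-a')")
    conn.execute(
        "INSERT INTO text_morpheme (text_name, position, sense_id) "
        "VALUES ('text-1', 0, 1)"
    )
    conn.commit()
    return conn


def glosses_in_text(conn, text_name):
    """Return (lexeme form, gloss) pairs for a text, looked up through the lexicon."""
    return conn.execute(
        """
        SELECT e.lexeme_form, s.gloss
        FROM text_morpheme AS m
        JOIN sense AS s ON s.id = m.sense_id
        JOIN entry AS e ON e.id = s.entry_id
        WHERE m.text_name = ?
        ORDER BY m.position
        """,
        (text_name,),
    ).fetchall()
```

Because the gloss is stored once in the sense table, updating it there changes what glosses_in_text returns without touching the text rows.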

6 Some key features
- Use automatic parsing to empirically verify morphological description within lexicon
- Build the word net via lexical relations
- Build richness into the lexicon by eliciting through semantic domains
- Use “bulk edit” for global clean up
- Repurpose content by developing multiple presentation views
- Clean separation between stored data and presentation (see example in next 2 slides)

7 Root-based dictionary (Cherokee)
- Stem entries just cross-refer to root
- Root entries list stems as subentries
- Subentries give full description

8 Stem-based dictionary (Cherokee)
- Stem entries give full description
- Root entries cross-refer to stems
- No subentries
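The contrast between the two dictionary layouts can be sketched in Python as two rendering functions over the same stored data. Everything here is an assumption made for illustration: the data structures, the function names, and the placeholder forms (which are not real Cherokee data). It is not how FLEx implements configured views.

```python
# Hypothetical stored data: one root with two stems (placeholder forms only).
ROOTS = {"root-1": {"gloss": "root gloss"}}
STEMS = [
    {"form": "stem-a", "root": "root-1", "description": "full description of stem-a"},
    {"form": "stem-b", "root": "root-1", "description": "full description of stem-b"},
]


def _flatten(entries):
    """Sort (headword, lines) pairs by headword and return the lines in order."""
    return [line for _, lines in sorted(entries) for line in lines]


def root_based_view(roots, stems):
    """Roots list their stems as full subentries; stem entries only cross-refer."""
    entries = []
    for root in roots:
        lines = [root]
        for stem in stems:
            if stem["root"] == root:
                lines.append(f"  {stem['form']}: {stem['description']}")
        entries.append((root, lines))
    for stem in stems:
        entries.append((stem["form"], [f"{stem['form']}: see {stem['root']}"]))
    return _flatten(entries)


def stem_based_view(roots, stems):
    """Stems carry the full description; root entries only cross-refer to stems."""
    entries = []
    for root in roots:
        members = ", ".join(s["form"] for s in stems if s["root"] == root)
        entries.append((root, [f"{root}: see {members}"]))
    for stem in stems:
        entries.append((stem["form"], [f"{stem['form']}: {stem['description']}"]))
    return _flatten(entries)
```

Neither function modifies ROOTS or STEMS, which is the property the slides illustrate: the choice between root-based and stem-based layout lives entirely in the presentation step.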

9 Pathways to publishing
- First create a “configured view” to display the lexical entries as desired
- Then use the Pathway plug-in to take this stream of configured content and lay it out onto pages for a publishable dictionary
  - http://code.google.com/p/pathway/
- Publishing tools supported so far:
  - Prince XML (to PDF)
  - Open Office (to ODF)
  - Adobe InDesign

10 Lexical interchange
- Supports two import formats:
  - From Shoebox / Toolbox via SFM
    - “Standard Format Markers” = backslash codes
    - User configures the mapping of markers to conceptual equivalents in FLEx database
    - The default mapping is for MDF SFM
  - From WeSay / Lexique Pro via LIFT
    - Lexicon Interchange FormaT: an XML application for interchange of lexicons
    - http://code.google.com/p/lift-standard/
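As an illustration of what mapping backslash codes to conceptual fields involves, here is a minimal Python sketch of an SFM reader driven by a user-configurable mapping. The markers used (\lx, \ps, \ge, \de) are common MDF markers, but the target field names, the function name, and the sample records are hypothetical; the actual FLEx import is far more elaborate.

```python
# Assumed mapping from MDF-style markers to hypothetical field names.
DEFAULT_MAPPING = {
    "lx": "lexeme_form",
    "ps": "part_of_speech",
    "ge": "gloss",
    "de": "definition",
}

# Placeholder sample data; \xx stands for a marker the mapping does not cover.
SAMPLE = r"""
\lx form-a
\ps n
\ge gloss-a
\xx ignored

\lx form-b
\ps v
\ge gloss-b1
\ge gloss-b2
"""


def parse_sfm(text, mapping=DEFAULT_MAPPING, record_marker="lx"):
    """Split SFM text into records, renaming markers according to the mapping."""
    records = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line.startswith("\\"):
            continue  # this sketch ignores blank and continuation lines
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = {}  # the record marker starts a new entry
            records.append(current)
        if current is None or marker not in mapping:
            continue  # unmapped markers are skipped rather than guessed at
        # Fields may repeat (e.g. several glosses), so values are kept in lists.
        current.setdefault(mapping[marker], []).append(value.strip())
    return records
```

Passing a different mapping dictionary is the sketch's stand-in for the user-configured mapping mentioned on the slide; a real importer would also need to handle multi-line fields and nested structure such as senses and subentries.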

11 Lexicon export
- The entire database for a language project can be dumped to FieldWorks XML
  - http://fieldworks.sil.org/supportdocs/FieldWorks XML model.doc
- The complete lexical database (a subset of the whole project) can be exported to:
  - LIFT XML
  - MDF-based SFM (either root- or stem-based)
  - http://fieldworks.sil.org/supportdocs/Export options in Flex.doc
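To give a rough sense of what an XML lexicon export looks like, the following Python sketch serializes a list of entries into LIFT-style XML with the standard xml.etree.ElementTree module. The element names (lift, entry, lexical-unit, form, text, sense, gloss) follow the general shape of LIFT as I understand it, while the version string, the input dictionary layout, and the function name are assumptions; check the LIFT specification before relying on any detail.

```python
import xml.etree.ElementTree as ET


def to_lift_like_xml(entries, vernacular_lang="und", analysis_lang="en"):
    """Serialize entries to LIFT-style XML (a sketch, not a validated LIFT file)."""
    # The version value is an assumption for illustration.
    root = ET.Element("lift", {"version": "0.13"})
    for entry in entries:
        entry_el = ET.SubElement(root, "entry", {"id": entry["id"]})
        unit = ET.SubElement(entry_el, "lexical-unit")
        form = ET.SubElement(unit, "form", {"lang": vernacular_lang})
        ET.SubElement(form, "text").text = entry["lexeme_form"]
        # One sense element per gloss keeps the sketch simple.
        for gloss in entry.get("glosses", []):
            sense = ET.SubElement(entry_el, "sense")
            gloss_el = ET.SubElement(sense, "gloss", {"lang": analysis_lang})
            ET.SubElement(gloss_el, "text").text = gloss
    return ET.tostring(root, encoding="unicode")
```

A call such as to_lift_like_xml([{"id": "e1", "lexeme_form": "form-a", "glosses": ["gloss-a"]}]) returns a single XML string; the language codes default to "und" (undetermined) and "en" and would normally come from the project's writing systems.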

12 More lexicon export
- Any configured view can be exported to:
  - A streamlined version of FieldWorks XML
  - MDF-based SFM
  - XHTML + CSS for presentation
- Furthermore, one can create a FieldWorks XML Template (FXT) to define a custom export format (XML, SFM, plain text)
  - http://fieldworks.sil.org/supportdocs/FXT export options.doc

13 Interoperation with GOLD
- FLEx is preloaded with a grammatical categories catalog that is based on an early version of GOLD
  - http://www.sil.org/computing/fieldworks/flex/categories.html
- Similarly, a Morphosyntactic Gloss Assistant is preloaded with morphosyntactic properties from an early version of GOLD; see p. 10 of:
  - http://www.sil.org/~simonsg/preprint/FLExParser Preprint.pdf
- Thus morphosyntactic information in lexicon and texts is implicitly aligned with GOLD
- The remaining step is for us to map to GOLD ids when they are standardized; then we can easily export GOLD ids in LIFT and other XML

14 Uptake
- October 2009: FLEx 3.0 released in FieldWorks 6.0. Free download from:
  - http://www.sil.org/computing/fieldworks/FW_downloads.htm
- 323 members of a reasonably active Google Group (~3,000 messages)
  - http://groups.google.com/group/flex-list
- 185 language projects have registered as users
- Over 30 did a 4-day FLEx workshop led by Beth Bryson at InField 2010. Beth will also do a one-day FLEx workshop at ICLDC, Feb 2011.

