CLARIN? ISOCAT! Ineke Schuurman, ISOcat content coordinator, CLARIN-NL, Amsterdam, 30-08-2012

1 CLARIN? ISOCAT! Ineke Schuurman, ISOcat content coordinator, CLARIN-NL, Amsterdam, 30-08-2012

2 Overview
ISOcat
– general
– use in CLARIN
An example
Your task with respect to ISOcat

3 ISOcat
ISOcat: a Data Category Registry defining widely accepted data categories (DCs)
http://www.isocat.org
A registry that stores DCs for language resources and their metadata, together with properties of the DCs (definition, administration, examples, etc.)

4 Use in CLARIN
What is meant by DC X in resource A?
– There may be several (valid) definitions!
Does X have the same meaning in resources A and B?
In CLARIN, ISOcat is needed first and foremost for tools (so that they 'know' what the elements in resources mean)
– Especially important for search in data and metadata
– But also for other tools that apply to data (cf. the last talk on TTNWW)
Human use is only secondary, but humans must after all fill the ISOcat registry and make the right mappings

5 An example with 'ev'
Have a look at these two tags:
– WW(pv,tgw,ev)
– N(soort,ev,dim,onz,stan)
All parts of such tags, like ev, are to be included in ISOcat. The full tags are to be included as well.
ev, enkelvoud, sg, sing, singular, singulier, …
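Since every part of a tag has to be registered separately, a first practical step is to split the full tags into their parts. The sketch below does this for the two tags on the slide. The function name `split_tag` and the assumption that all tags follow the pattern HEAD(feature,feature,...) are illustrative only and are not part of ISOcat or any CLARIN tool.

```python
import re

# Assumed tag shape: HEAD(feat1,feat2,...), as in the two example tags.
TAG_PATTERN = re.compile(r"^(?P<head>[^()]+)\((?P<features>[^()]*)\)$")


def split_tag(tag: str) -> tuple[str, list[str]]:
    """Split a tag such as 'WW(pv,tgw,ev)' into its head and its feature list."""
    match = TAG_PATTERN.match(tag.strip())
    if match is None:
        raise ValueError(f"not a HEAD(features) tag: {tag!r}")
    # Drop empty pieces so that 'X()' yields an empty feature list.
    features = [f.strip() for f in match.group("features").split(",") if f.strip()]
    return match.group("head"), features


for tag in ("WW(pv,tgw,ev)", "N(soort,ev,dim,onz,stan)"):
    head, features = split_tag(tag)
    print(tag, "->", head, features)
```

Both the full tag and each returned part would then be candidates for a data category.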

6 singular
All these representations can be mapped onto one DC: singular (DC-4918), "word form indicating that one entity is involved"
In full: http://www.isocat.org/datcat/DC-4918

7 Other cats
ISOcat: defining DCs (ongoing)
RELcat: relating DCs (started)
SCHEMAcat: a registry of schemas, a schema being a description of the structure of your data format (just started)

8 Call 4 projects
Each Call 4 project must:
check, for each DC used in its resource or metadata, whether a corresponding DC exists in ISOcat
– If not, extend ISOcat with such a DC, with all its properties (definitions, examples, etc.)
create a schema with a mapping that maps each DC used in the resources and metadata to an ISOcat DC
All this will be explained in tutorials
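The checking step can be pictured as a coverage test: list every label used in the resource and report those that have no ISOcat DC in the project's mapping yet. The following is a minimal sketch under that reading. The function `find_unmapped` is hypothetical, and the mapping is deliberately incomplete for illustration: it says nothing about which DCs actually exist in ISOcat.

```python
def find_unmapped(used_labels, mapping):
    """Return the labels that have no DC in the mapping, in first-seen order."""
    missing = []
    for label in used_labels:
        if mapping.get(label) is None and label not in missing:
            missing.append(label)
    return missing


# Illustrative, incomplete mapping: only 'ev' has been looked up so far.
mapping = {"ev": "http://www.isocat.org/datcat/DC-4918"}

# Labels taken from the example tag WW(pv,tgw,ev).
used = ["WW", "pv", "tgw", "ev"]

unmapped = find_unmapped(used, mapping)
print("Still to be looked up or added:", unmapped)
```

Each reported label then needs either a lookup of an existing DC or a new DC with all its properties.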

9 Call 4 projects
Do NOT underestimate this ISOcat task!
Good news: DCs used in some common formats are already included in ISOcat
– CGN / D-Coi tagset
– TEI header elements
– Many DCs concerning metadata
Contact a CLARIN centre as soon as possible to help you with this, or contact the helpdesk (helpdesk@clarin.nl)

10 CLARIN-NL
Thank you for your attention. Any questions?

12 XML-format
CGN (CGN-format): <pw ref="fn000248.20.4" w="is" pos="WW(pv,tgw,ev)" lem="zijn" … pq="man" />
VU-DNC (FoLiA-format): is … is
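To show where such a tag sits in the data, the sketch below reads the pos attribute from a CGN-style pw element with Python's standard xml.etree.ElementTree module. The sample string is an assumption: it contains only the attributes visible on the slide, leaves out the elided ones, and uses straight quotes so that it is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Sample element with only the attributes shown on the slide;
# the attributes elided there ("…") are simply left out here.
SAMPLE = '<pw ref="fn000248.20.4" w="is" pos="WW(pv,tgw,ev)" lem="zijn" pq="man" />'

element = ET.fromstring(SAMPLE)

word = element.get("w")       # the word form
lemma = element.get("lem")    # the lemma
pos_tag = element.get("pos")  # the full tag, to be mapped to ISOcat DCs

print(word, lemma, pos_tag)
```

The value of pos is the full tag whose parts, such as ev, are to be mapped to data categories.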

