CLARIN-NL ISOcat workshop 2011, part 2
Ineke Schuurman, Menzo Windhouwer

Part A: Issues brought up by participants
– When (not) to adopt an existing data category (DC)
– What about (CLARIN) standards?
– What to do with ‘flagged’ DCs
– Relation between data category selection (DCS) and profile
– What should be included in ISOcat (level of detail, abbreviations, …)
– What about TEI, metadata, webservices?
– How to deal with larger amounts of data

Part B: ISOcat and CLARIN: do’s and don’ts (version 0.1)
– Introduction and discussion

Part A: Issues brought up by participants

When (not) to adopt an existing DC
– It should ‘match’ the way you use a specific notion in your annotation scheme, application, …
– It should come with the same profile
– It should handle the same phenomenon: SpeakerID ≠ SingerID

Speaker vs Singer
String → Name → Person → Singer → Opera → Opera singer → Tenor → Tenor in La Bohème
– The first is too generic, the last too specific
– The others are candidates
– Note that SingerID and SpeakerID are siblings, whereas SingerID is a subclass of both Singer and ID (RELcat!)
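The relations on this slide can be sketched in a few lines of Python. This is only an illustration: the `SUBCLASS_OF` table and the helper names are hypothetical, the parents of SpeakerID are assumed by analogy with SingerID, and nothing here reflects the actual RELcat data format.

```python
# Hypothetical sketch: the relation table below is an assumption that mirrors
# the slide; it is not actual RELcat or ISOcat data.
SUBCLASS_OF = {
    "SingerID": {"Singer", "ID"},    # stated on the slide
    "SpeakerID": {"Speaker", "ID"},  # assumed by analogy with SingerID
}

# Generality chain as printed on the slide, from most generic to most specific.
CHAIN = ["String", "Name", "Person", "Singer", "Opera",
         "Opera singer", "Tenor", "Tenor in La Bohème"]


def are_siblings(a, b):
    """Two distinct concepts are siblings if they share a direct parent."""
    shared = SUBCLASS_OF.get(a, set()) & SUBCLASS_OF.get(b, set())
    return a != b and bool(shared)


def candidates(chain):
    """Drop the most generic and the most specific end of the chain."""
    return chain[1:-1]


print(are_siblings("SingerID", "SpeakerID"))
print(candidates(CHAIN))
```

SingerID and SpeakerID come out as siblings because they share the parent ID, while only the inner elements of the chain remain as candidates for adoption.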

Standards
– Within ISOcat there are currently few or no standards
– Therefore CLARIN-NL and VL will set up their own set of ‘standardized DCs’
– Ineke will be in charge (she will consult with others)

Flagged DCs
– Never link to ‘deprecated’ DCs! (in case of doubt, consult Ineke or Menzo)
– In other cases the flags show whether the DC specification is correct from a technical point of view
– Note that only DCs with a green marking qualify for standardization

DC/DCS and profile
– Profiles are not added automatically; a DCS may contain elements with various profiles
– If the profile you need is not yet available, contact Menzo and Ineke

What to include?
– Cf. the slide on SingerID/SpeakerID
– In general: all linguistically meaningful notions mentioned in your schema, manual, definition (cf. Part B)
– Abbreviations (PST for /past tense/) are to be entered as Data Element Name

TEI, metadata, webservice
– TEI: likely to be taken care of at a ‘higher level’; if not, YOU are to insert the TEI definitions you use
– Metadata: new in CMDI? In that case a definition in ISOcat is to be provided as well
– Webservice: to be taken care of in CMDI

Larger amounts?
– In such a case: contact Menzo Windhouwer

Part B: do’s & don’ts
Do’s:
– Create a DCS for your scheme (name: project, annotation scheme, …)
– Provide a clear definition (short, to the point) for your scheme, application, …
– Take care not to leave concepts used in your definition undefined or vague
– Use appropriate vocabulary (per profile)
– Check ‘adopted’ DCs regularly until standardization!

Do’s (continued)
When creating a DC, fill out:
– Justification: used in XYZ, part of tagset N
– Language section
  – Always an English language section
  – Strong recommendation: sections for the object language(s) and for the working language of the manual
  – Sections in the various languages should match (be more or less translations of each other)

Do’s (continued)
When creating a DC, fill out:
– Example section
  – Note that *negative* examples may be very helpful! (jongens, mannen; not: gelovigen (is a form of ADJ))
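The fill-out advice on the last two slides can be read as a small checklist. The sketch below is one possible way to express it in Python. The class names, field names, language codes and sample values are all hypothetical and do not correspond to the ISOcat data model or any ISOcat API.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageSection:
    """One language section of a draft DC (hypothetical structure)."""
    definition: str
    examples: list = field(default_factory=list)


@dataclass
class DataCategoryDraft:
    """A draft DC as kept locally before entering it in the registry."""
    name: str
    justification: str = ""
    language_sections: dict = field(default_factory=dict)  # keyed by language code


def check_draft(draft, recommended_langs=()):
    """Return a list of warnings derived from the do's on the slides."""
    problems = []
    if not draft.justification.strip():
        problems.append("missing justification")
    if "en" not in draft.language_sections:
        problems.append("missing English language section")
    for lang in recommended_langs:
        if lang not in draft.language_sections:
            problems.append(f"recommended language section missing: {lang}")
    for lang, section in draft.language_sections.items():
        if not section.examples:
            problems.append(f"no examples in language section: {lang}")
    return problems


# Illustrative values only; the definition text is made up for this sketch.
draft = DataCategoryDraft(
    name="pastTense",
    justification="used in XYZ, part of tagset N",
    language_sections={"en": LanguageSection("Illustrative definition.", ["walked"])},
)
print(check_draft(draft, recommended_langs=["nl"]))
```

Here `recommended_langs` stands in for the object language(s) and the working language of the manual. A missing English section is reported in the same way as a missing recommended one, so it is up to the caller to treat the former as an error and the latter as a warning.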

Example sections
Suppose you want to illustrate a German phenomenon:
– Example section in the EN language section: German example with translation in English
– Example section in the NL language section: German example with translation in Dutch
– Example section in the EN linguistic section: EN example
– Example section in the NL linguistic section: NL example with translation in English

Don’ts
– Confuse the Language and Linguistic sections
  – The latter contains language-specific values for closed domains
– Be (too) language-specific in the definition
– Mention the scheme in the definition
– Use several definitions in one DC
– Circular definitions
– Rely on authority
– Rely on standardized status
  – The definition should fit YOUR scheme, etc.
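The warning against circular definitions can be checked mechanically if you keep a record of which other DCs each definition mentions. The following is a minimal sketch using a depth-first search. The `mentions` mapping and the DC names in it are hypothetical; ISOcat itself is not known to provide such a check.

```python
def find_cycle(mentions):
    """Return one definition cycle as a list of DC names, or None.

    `mentions` maps a DC name to the DC names used in its definition.
    """
    in_progress, done = set(), set()
    path = []

    def visit(node):
        in_progress.add(node)
        path.append(node)
        for nxt in mentions.get(node, ()):
            if nxt in in_progress:
                # nxt is already on the current path: the definitions loop.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in mentions:
        if node not in done:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical example: each definition refers back to the other.
circular = {"past tense": ["tense"], "tense": ["past tense"]}
print(find_cycle(circular))
```

For the `circular` mapping the function returns the loop from "past tense" through "tense" and back; for a mapping without such loops it returns `None`.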
