CLARIN-NL ISOcat workshop 2012 part 2 (10-10-2012) Ineke Schuurman Menzo Windhouwer.

Presentation transcript:

CLARIN-NL ISOcat workshop 2012 part 2 (10-10-2012) Ineke Schuurman Menzo Windhouwer

Issues brought up by participants
– Which elements are to be included in ISOcat
– (CLARIN) standards, TEI, etc.
– Type of DC
– When to create a new DC/adapt an existing one
– When to create several DCSs
– Name of DC, several DCs with same name
– How to deal with larger amounts of data

What to include?
ALL concepts dealing with linguistics/metadata
– Van Dale EN-NE: include (overgankelijk werkwoord) 1) omvatten 2) (mede) opnemen
==> 'overgankelijk werkwoord' / 'transitive verb' is to be included, same for 'overg.ww', 'trns.v.'
One and the same DC!

What to include? ‘transitive verb’
Several entries in ISOcat
– DC-1405: A verb which takes a direct object; that is, a verb that expresses an action which directly affects another person or thing.
– DC-3532: A transitive verb is a verb that takes a direct object, and describes a relation between two participants [Crystal 1997: 397; Payne 1997: 171]
– And several more, so... which one to select?

When (not) to adopt an existing DC
– It should ‘match’ the way you use a specific notion in your annotation scheme, application, …
– It should come with the same profile and type
That being said
– Reuse a CLARIN NL/VL DC when possible (contact Ineke when such a definition is incorrect)
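The profile/type part of this check is mechanical and can be sketched in a few lines of Python. This is only an illustration: the `DataCategory` record, its field names, the `myDC-…` keys and the profile and type values are assumptions made for the example, not ISOcat's data model or API. Whether a definition really matches your use of the notion remains a human judgement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCategory:
    # Hypothetical record; field names are illustrative only.
    key: str
    name: str
    dc_type: str
    profile: str
    definition: str
    clarin_recommended: bool = False


def reuse_candidates(candidates, profile, dc_type):
    """Keep DCs with the required profile and type, CLARIN NL/VL ones first."""
    matching = [
        dc for dc in candidates
        if dc.profile == profile and dc.dc_type == dc_type
    ]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(matching, key=lambda dc: not dc.clarin_recommended)


# Made-up entries: keys, profiles and flags are placeholders.
candidates = [
    DataCategory("myDC-1", "transitive verb", "simple", "Morphosyntax",
                 "A verb which takes a direct object."),
    DataCategory("myDC-2", "transitive verb", "simple", "Morphosyntax",
                 "A verb that takes a direct object.", clarin_recommended=True),
    DataCategory("myDC-3", "transitive verb", "simple", "Terminology",
                 "A verb which takes a direct object."),
]

shortlist = reuse_candidates(candidates, "Morphosyntax", "simple")
for dc in shortlist:
    print(dc.key, dc.definition)
```

The function only produces a shortlist; the definitions on it still have to be read against your own annotation scheme before one is adopted.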

Same name
Not really a problem when they are good DCs, not even when they come with the same profile
PositivePolarity
– In general, positive polarity refers to an assertion that contains no marker of negation [Crystal 1980: 299]. (DC-3405)
– The property of a word or concept to express positive sentiment (myDC-xx)
Whether you can reuse DC-3405 depends on your use of the concept!

Same name
Do not avoid reusing a name when it is the name commonly used!
Another type of duplicate name, where one concept entails the other:
– meewerkend voorwerp
– meewerkend en belanghebbend voorwerp
– event (also called 'eventuality', and including 'state')
– event (sister of 'state')

What defines a good DC? Reusable definition
NOT
– conversation (DC-2661): Communication event with more than two participants
– mother tongue (DC-2955): […] a speaker’s mother tongue

What defines a good DC? Correct definition
NOT (?)
– Actor (DC-4146): a participant in an action or process
Question: is an addressee to be considered an actor? (used in DC-4158, no proper definition yet)

What defines a good DC? Meaningful definition
NOT
– annotation format (DC-2562): Specifies the annotation format that is used …
– source language (DC-2494): Indicates if a language is a source language
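Definitions like these, which merely repeat the name of the DC, can be spotted with a crude check. The sketch below is an assumption-laden illustration (the function name `is_tautological` is made up, and ISOcat offers no such check as far as this transcript shows): it only catches literal repetition of the name, not inflected forms or circularity across several DCs.

```python
import re


def _normalize(text: str) -> str:
    # Lowercase and collapse everything except letters and digits to spaces.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def is_tautological(name: str, definition: str) -> bool:
    """True if the definition literally contains the DC name as whole words."""
    return f" {_normalize(name)} " in f" {_normalize(definition)} "


examples = [
    ("annotation format", "Specifies the annotation format that is used"),
    ("source language", "Indicates if a language is a source language"),
    ("conversation", "Communication event with more than two participants"),
]
for name, definition in examples:
    print(name, "->", is_tautological(name, definition))
```

A hit is a reason to reread the definition, not proof that it is bad: a definition may mention its own name and still add information.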

Not-so-good examples
– Mother tongue (DC-2955): Specifies whether the language is a speaker’s mother tongue
– Mother’s language (DC-4516): […] NOT necessarily the mother tongue […]
- There is no definition of the concept ‘mother tongue’ (relation with /home language/, /primary language/, /heritage language/?)
- And why ‘speaker’?

Rule
Make your definition as general as possible, as specific as necessary

Standards
Within ISOcat there are currently few or no standards.
Therefore CLARIN NL and VL will set up their own set of ‘standardized DCs’.
Ineke will be in charge of the selection; new flag: “recommended by CLARIN NL/VL”

Standards
Another issue with respect to standards 'included' in ISOcat
- Athens Core DCs (recommended by metadata/CMDI): we are currently adapting them in order to avoid tautologies and/or correct smaller ‘errors’
  Target language: indicates if the language is the target language
  Conversation: […] three or more participants
The same may be necessary for TEI Headers etc.

DC/DCS and profile
Profiles are not added automatically; a DCS may contain elements with various profiles (although you may decide to create several DCSs) (do select proper names!)
If the profile you need is not yet available, contact Menzo and Ineke

Part B: do’s & don’ts
Do’s:
– Create a DCS for your scheme (name project, ann.scheme, …)
– Provide a clear definition (short, to the point) for your scheme, application, …
– Take care not to leave concepts used in your definition undefined or vague
– Use appropriate vocabulary (per profile)
– Check ‘adopted’ DCs regularly until standardization!

Do’s (continued)
When creating a DC, fill out
Justification: used in XYZ, part of tagset N
Language section
– Always an English language section
– Strong recommendation: sections for the object language(s) and for the working language of the manual
– Sections in the various languages should match (+/- be translations of each other)
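As a rough checklist, these requirements could be tested on a draft DC before it is entered. The sketch below assumes a made-up `DataCategoryDraft` record with language sections keyed by language code; neither the record nor `check_draft` is part of ISOcat. Whether the sections are translations of each other cannot be checked this way and is left to the author.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class DataCategoryDraft:
    # Hypothetical draft record; field names are illustrative only.
    name: str
    justification: str = ""
    # language code -> definition text in that language
    language_sections: Dict[str, str] = field(default_factory=dict)


def check_draft(
    draft: DataCategoryDraft,
    object_languages: Sequence[str] = (),
    working_language: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if not draft.justification.strip():
        problems.append("missing justification")
    if "en" not in draft.language_sections:
        problems.append("missing English language section")
    recommended = list(object_languages)
    if working_language:
        recommended.append(working_language)
    for lang in recommended:
        if lang not in draft.language_sections:
            problems.append(f"recommended language section missing: {lang}")
    return problems


draft = DataCategoryDraft(
    name="transitive verb",
    justification="used in XYZ, part of tagset N",
    language_sections={"en": "A verb which takes a direct object."},
)
print(check_draft(draft, object_languages=["nl"]))
```

The English section is treated as required and the others as recommended, mirroring the "always" versus "strong recommendation" wording above.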

Do’s (continued)
When creating a DC, fill out
Example section
– Note that *negative* examples may be very helpful! (jongens, mannen, niet: gelovigen (is form of ADJ))

Example sections
Suppose you want to illustrate a German phenomenon:
Ex.sec. in EN language section
– German example with translation in English
Ex.sec. in NL language section
– German example with translation in Dutch
Ex.sec. in EN linguistic section
– EN example
Ex.sec. in NL linguistic section
– NL example with translation in English

Don’ts
Confuse Language and Linguistic section
– The latter contains language-specific values for closed domains
Be (too) language specific in definition
Mention scheme in definition
Use several definitions in one DC
Circular definitions
Rely on authority
Rely on standardized status
– Definition should fit YOUR scheme, etc.

Procedure - 1

Procedure - 2
