The 2010 Secretary’s Annual Report on ISO/TC37/SC4 “Language resource management”
Contents
Membership
Working Groups
Thematic Domain Groups
Task Forces
Project Items
On-going Ballots
Meetings
Issues and Proposals
Membership
Membership
Organization
P-members (23 -> 24)
O-members (9 -> 8)
Liaisons (12 -> 11)
Key contact persons
Organization
Chairperson
Body: AFNOR (France)
Name: Romary, Laurent
Secretary
Body: KATS (Korea)
Name: Choi, Key-Sun
P-members (24)
1. AENOR (Spain)
2. AFNOR (France)
3. ANSI (USA)
4. ASI (Austria)
5. BSI (United Kingdom)
6. CSK (Korea, DPR)
7. DIN (Germany)
8. DS (Denmark)
9. GOST R (Russian Federation)
10. JISC (Japan)
11. KATS (Korea, Rep.)
12. MSA (Malta)
13. NEN (Netherlands)
14. NSAI (Ireland)
15. PKN (Poland)
16. SABS (South Africa)
17. SAC (China)
18. SCC (Canada)
19. SIS (Sweden)
20. SN (Norway)
21. TISI (Thailand)
22. TSE (Turkey)
23. UNI (Italy)
24. UNMZ (Czech Rep.)
O-members (8)
1. ASRO (Romania)
2. BDS (Bulgaria)
3. DSSU (Ukraine)
4. ISS (Serbia)
5. NBN (Belgium)
6. SFS (Finland)
7. STAMEQ (Vietnam)
8. SUTN (Slovakia)
Liaisons (11)
1. ISO/IEC JTC 001/SC
2. ISO/TC 46/SC
3. ISO/TC 184/SC
4. ELRA
5. Infoterm
6. LISA
7. OMG
8. TEI
9. TERMNET
10. UIC
11. UNESCO
Key persons
(Miss) Hyojin Won: International Standards Support, KSA
Jenny Pellaux: ISOCS-TPM
Working Groups
WG 01 “Basic descriptors and mechanisms for language resources” Convenor: Nancy Ide
WG 02 “Representation schemes” Convenor: Kiyong Lee
WG 03 “Multilingual information representation” Convenor: Nasredine Semmar
WG 04 “Lexical resources” Convenor: Nicoletta Calzolari
WG 05 “Workflow of Language resource management” Convenor:
Thematic Domain Groups
Status: ad hoc
Established in May 2004, Lisbon
Triple function:
(1) Liaison to ISOCat
(2) Incubator for new work item proposals
(3) Working with international groups, e.g. ISA with IWCS, LREC, FLaReNet
TDG 01 Metadata: Peter Wittenburg
TDG 02 Morphosyntactic data categories: Gil Francopoulo
TDG 03 Semantic content representation: Harry Bunt
◦Activity 01 Discourse relations: Koiti Hasida
◦Activity 02 Dialogue acts: Harry Bunt
◦Activity 03 Referential structures and links: Laurent Romary
◦Activity 04 Logico-semantic relations: Scott Farrar
◦Activity 05 Temporal entities and relations: Kiyong Lee
◦Activity 06 Semantic roles and argument structures: Thierry Declerck
TDG 04 Syntactic data categories: Thierry Declerck
TDG 05 Machine readable dictionary: Monte George
TDG 06 Multilingual Ontology: Koiti Hasida
TDG 07 Lexical semantics: Monica Monachini
Task Forces
Task Force for the Harmonization of Principles (TFH)
Convenor: Nancy Ide
Task Force for Terminology Coordination (TFTC)
Convenor and liaison to TC 37/TCG: Alex C. Fang
Project Items
14 active project items: WG 01 (4), WG 02 (9), WG 03 (1)
3 unregistered project items
2 ISO published standards
◦ISO : 2006 “Language resource management - Feature Structures - Part 1: Feature Structure Representation (FSR)”
◦ISO 24613: 2008 “Language resource management - Lexical Markup Framework (LMF)”
WG 01: BASIC DESCRIPTORS AND MECHANISMS FOR LANGUAGE RESOURCES Convenor: Nancy Ide 4 Projects
WG 01-01: WD “Language resource management - Feature structures – Part 1: Feature structure representation (FSR)”
Project leaders: Kiyong Lee, Gerald Penn
Revision of ISO :2006 Feature Structures Part 1: Feature structure representation (FSR:2006)
Joint work with TEI: Lou Burnard
WG 01-02: FDIS “Language resource management - Feature Structures - Part 2: Feature Systems Declaration (FSD)”
Project leaders: Kiyong Lee, Gerald Penn
WG 01-03: DIS “Language resource management - Linguistic Annotation Framework (LAF)”
Project leader: Nancy Ide
WG 01-04: DIS “Language resource management - Persistent identification and access in language technology applications (PID)”
Project leader: Daan Broeder
WG 02: REPRESENTATION SCHEMES Convenor: Kiyong Lee 9 Projects
WG 02-01: DIS “Language resource management - Morphosyntactic annotation framework (MAF)”
Project leader: Eric de la Clergerie
WG 02-02: DIS “Language resource management - Word segmentation of Text – Part 1: Basic concepts and general principles (WordSeg-1)”
Project leader: SUN Maosong
WG 02-03: WD “Language resource management - Word Segmentation of Text – Part 2: Word Segmentation for Chinese, Japanese and Korean (WordSeg-2)”
Project leaders: SUN Maosong, Key-Sun Choi, Hitoshi Isahara
WG 02-04: FDIS “Language resource management - Syntactic annotation framework (SynAF)”
Project leader: Thierry Declerck
WG 02-05: DIS “Language resource management - Semantic Annotation Framework – Part 1: Time and events (SemAF/Time, ISO-TimeML)”
Project leader: Kiyong Lee
Editors: James Pustejovsky (chair), Branimir Boguraev, Harry Bunt, Nancy Ide, Kiyong Lee
(Cancellation date: )
WG 02-06: DIS “Language resource management - Semantic Annotation Framework – Part 2: Dialogue acts (SemAF/Dacts)”
Project leader: Harry Bunt
Editors: Harry Bunt (chair), Jan Alexandersson, Jean Carletta, Jae-woong Choe, Volha Petukhova, Alex C. Fang, Koiti Hasida, Andrei Popescu-Belis, Claudia Soria, David Traum
WG 02-06: NP “Language resource management - Semantic Annotation Framework – Part 3: Named entities (SemAF/NE)”
Project leader: Gil Francopoulo
WG 02-07: NP “Language resource management - Semantic Annotation Framework – Part 4: Semantic roles (SemAF/SRL)”
Project leader: Martha Palmer
WG 02-08: NP “Language resource management - Semantic Annotation Framework – Part 5: Discourse Structures entities (SemAF/DS)”
Project leader: Gil Francopoulo
WG 02-09: PWI “Language resource management - Semantic Annotation Framework – Part 6: Space (SemAF/ISO-Space)”
Project leader: James Pustejovsky
WG 3: MULTILINGUAL INFORMATION Convenor: Nasredine Semmar 1 Project
WG 03-01: DIS “Language resource management - Multilingual information framework (MLIF)”
◦Project leader: Samuel Cruz-Lara
◦Limit date:
WG 4: Lexical Resources Convenor: Nicoletta Calzolari 1 Project
WG 4-1: ISO Lexical Markup Framework (LMF)
◦Project leaders: Monte George, Gil Francopoulo
◦Status: ISO International Standard
Unregistered PWI ISO NP (OMG) “Language resource management – Simplified natural languages – Part 1: Basic concepts and general principles (simpL-1)”
Project leaders: Thierry Declerck, Sung-Kwon Choi
Editor: Doug Lawrence
ISO NP 2462x “Language resource management – Segmentation rules eXchange (SRX)”
Proposed project leader: Arle Lommel
ISO PWI 2462x (OMG) “Language resource management – Temporal Vocabulary”
Proposed project leader: Mark Linehan
On-going Ballots
Ballots (end date):
NP 2462x SRX
FDIS WordSeg
NP SemAF-SRL
FDIS SynAF
NP SemAF-DS
DIS WordSeg
DIS SemAF-Dacts
Meetings
Meetings
/16: Tilburg, The Netherlands. WG 2: MAF, SynAF, SemAF-Dacts
/26: Fragrant Hill Hotel, Beijing, China. WG 2 WordSeg-1/2 editorial meeting
/05: Brandeis, Waltham, MA, USA. WG 1-2, FLaReNet, SILT
Meetings
/20: City University of Hong Kong. WG 1, WG 2, WG 3, WG 4, ISOCat, ISA-5, ICGL
/22: Beijing Xijiao Hotel, Beijing, China. WG 2 WordSeg-2 editorial meeting
/21: Valletta, Malta. Tutorial + LRT workshop + WG 2 + TDG 3, LREC
/20: Dublin, Ireland. TC 37 and SCs Annual Meetings
/15: DIN, Berlin, Germany. TDG 1 + WG 2 + WG 4
Meetings
/11: Oxford, United Kingdom. WG 2 + ISA-6, IWCS 2011 ( /14)
: to be discussed
/19: TC 37 + SCs meetings, Seoul Palace Hotel, Seoul, South Korea
: to be discussed
Issues and Proposals
Cross-institutional collaborations
ISO/TC 37/SC 4
◦generic models for LR management
◦target expert groups with wide international coverage
◦stability - consensus
TEI – Text Encoding Initiative
◦reference XML vocabularies
◦specification infrastructure (ODD)
◦back office format for ISO documents
◦reactivity - larger community
W3C
◦dedicated application profile for web-based applications
◦articulation with other web-based standards (e.g. web service)
◦industry based requirements
◦bridge to various industries, e.g. localization
Consequences for SC 4
Work on a wide coverage of language resource levels
◦Ex.: Systematicity of SemAF components (Time, Space, Dialogue acts, Named entities, Discourse structures, Semantic roles)
Articulate SC 4 standards with industry standards
◦Ex.: MLIF with XLIFF, TMX, SMIL
Avoid maintaining XML formats as ISO standards
◦Ex.: SynAF. Tiger or TEI can be good serialisations
Proposal for wiki
Website of TC37/SC4
– Purpose: to give information to the experts, to communicate with users of the standards, and to show feasible solutions based on standards
– Maintenance: convenors and project leaders will post the information
– Idea collection stage
Organization of wiki
– Please access: id: WikiSysop (case-sensitive) pw: isowiki$&14
Practical problems
PWI -> NP stage
(1) Working draft
(2) Editorial or consulting group
Management: co-PLs
Editorial:
DIS -> FDIS stage
(1) Producing documents in MS Word format (ODD)
(2) Figures
Volume control on each document
Thank you.