Terminology standards – enhancing language ISO/TC 37 Semantic Interoperability ISO TC 37 Secretariat c/o Infoterm Christian Galinski Bamako (Mali) /07
ISO/TC 37 – Bamako /07
Overview
- UNESCO’s IFAP Area 4
- IFAP
- UNESCO and multilinguality
- Advocating open access solutions
- Language in industry
- eContent development
- Global semantic interoperability
- Standards for...
- Terminology standardization
- Terminology? Content entities
- Terminology eContent
- Terminology in ISO/TC 37
- + Language resources & LR management
- + Content resources
- Standardization of terminological principles and methods
- ISO/TC 37
- ISO/TC 37/SC 1 ~ 4
- ISO/TC 37 Outlook
- Semantic interoperability – HOW?

What is terminology?
- The description of the specialized vocabulary of an application domain
- Cf. Eugen Wüster: conceptual view; knowledge representation at concept level
- Monolingual or multilingual
- Mainly nouns (incl. multi-word nominal units), some verbs, adjectives and adverbs
- A strong yet practical simplification of lexical description
- Increasing occurrence of non-verbal knowledge representations

IFAP areas of intervention
What are IFAP’s areas of intervention?
- Area 1: Development of international, regional and national information policies
- Area 2: Development of human resources and capabilities for the information age
- Area 3: Strengthening institutions as gateways for information access
- Area 4: Development of information processing and management tools and systems
(Multilingualism) standards
ISO/TC 37 methodology standards:
- terminology
- language resources (at the level of concepts)
- other content entities (at the level of concepts)

UNESCO and multilinguality
- Promoting a wider, more equitable access to information (« Recommendation on the promotion of multilingualism and universal access to Cyberspace » / Initiative)
- Raising awareness of issues of equitable access and multilingualism
- Encouraging Member States to:
  - develop strong policies which promote and facilitate language diversity on the Internet (Guidelines for Terminology Policies)
  - create widely available online tools and applications (such as terminologies, automatic translators, dictionaries) for content in local languages
- Sharing of best practices and information
ISO/TC 37

Advocating open access solutions
“Member States and international organizations should encourage open access solutions including the formulation of technical and methodological standards for information exchange, portability and interoperability, as well as online accessibility of public domain information on global information networks.” (UNESCO Recommendation on Multilingualism and Access to Cyberspace)
“Governments should promote the development and use of open, interoperable, non-discriminatory and demand-driven standards.” (WSIS Action Plan)
Open source software? + Open content?

Language in industry
Exchange of content entities, e.g. an entry in a product catalogue:
- Name of company (® enterprise)
- Name of product (model) (™ enterprise)
- Generic name of product (e.g. © Harmonized System)
- Class (name under which the product falls) (e.g. ©
- Verbal/textual description (© enterprise)
- Picture (© rights owner)
- Technical data
- (unified) branch properties (e.g. © OAGi)
- Standardized characteristics (e.g. © DIN)
- Enterprise product specific data (e.g. for collaborative business)
- Enterprise internal data (maybe confidential/secret)
225/55/16 V

eContent development
- Workflow management for content development: net-based, distributed, cooperative creation of structured content
- Co-operation, interoperability, standardization
- Re-use in applications (based on the “single-source” principle): eLearning, eGovernment, eHealth, eBusiness, other e...s
- Multilingual, multimodal, multimedia; multi-channel output; complying with accessibility requirements

Global semantic interoperability
The challenge (user point of view):
- throughout the enterprise/organization: requested e.g. in e-government
- between enterprises/organizations: requested by the market
- within industry consortia: requested by industry branches
- between industry consortia: ??? (urgently needs harmonization and especially open standards)
- between different e…s: requested by the user
- between different language communities: requested by the end user
- within the standardization world

Standards for:
hw, sw, methodology standards
- Technology: ITU, ISO, IEC, industry
- Business models: UN/ECE, ISO, industry
- “Language”: ISO/TC 37, research consortia
- Transfers/transactions: ITU, UN/ECE, industry
- Content: ? Methodology!!! (semantic interoperability)
- Legal issues: ?
Standards* MoU/MG – why?
*Standards should be examined as to whether they support, allow or hinder multilinguality and cultural diversity (very important for SMEs) and semantic interoperability at large.

Terminology standardization
Standardization of terminologies
- Terminological data
  - Linguistic and non-linguistic representations
  - Designations: term, abbreviation, graphic symbol, formula, acoustic symbol, etc.
  - Descriptions: definition, explanation, non-linguistic [descriptive] representation, etc.
  - Source-related data
  - Data management related data (field, record, holding)
  - Classification (multiple)
- Terminology-related data: names, phraseology,...
Standardization of terminological principles and methods
- generic for many types of content entities

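The kinds of terminological data listed on this slide can be pictured as one concept-oriented record. The following is only a minimal sketch: the class and field names (`Designation`, `Description`, `TerminologicalEntry`, `concept_id`) are hypothetical illustrations, not taken from any ISO/TC 37 standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Designation:
    """A representation that names the concept (hypothetical structure)."""
    text: str
    kind: str = "term"      # e.g. "term", "abbreviation", "graphic symbol", "formula"
    language: str = ""      # language code; empty for non-linguistic designations


@dataclass
class Description:
    """A representation that describes the concept (hypothetical structure)."""
    text: str
    kind: str = "definition"   # e.g. "definition", "explanation"
    language: str = ""


@dataclass
class TerminologicalEntry:
    """One entry per concept, grouping all data that belongs to it."""
    concept_id: str
    designations: List[Designation] = field(default_factory=list)
    descriptions: List[Description] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)          # source-related data
    classifications: List[str] = field(default_factory=list)  # multiple classification

    def designations_in(self, language: str) -> List[str]:
        """Return the designation texts recorded for one language."""
        return [d.text for d in self.designations if d.language == language]


# Illustrative content only
entry = TerminologicalEntry(concept_id="C001")
entry.designations.append(Designation("semantic interoperability", "term", "en"))
entry.designations.append(Designation("interopérabilité sémantique", "term", "fr"))
entry.classifications.extend(["standardization", "eContent"])
print(entry.designations_in("fr"))
```

Keeping designations in several languages under one concept identifier is what makes such a record multilingual from the outset, as the slides argue.
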
Terminology? Content entities
Terminology? (knowledge representations)
- Nomenclature, taxonomy, typology, partonomy,...
- Glossary, vocabulary,...
- Terminological phraseology
- Graphical symbols and other non-linguistic representations?
- Properties, characteristics, attributes,...
- Ontology
- Names? (to be further studied)
+ closely related:
- Thesauri, classification schemes, keywords
- Encyclopedic (knowledge) entries
- Knowledge-enriched terminology entries
- Names, proper names,...
- Ontologies, topic maps,...
ONE methodology

Terminology eContent
- embedded terminology (or combination of terminology + …)
  - Texts: translation, localization, internationalization…
  - Speech: communication…
  - Image: CAD/CAM…
  - Multimedia: video, presentations…
- knowledge-rich terminology
  - Encyclopedic knowledge: Wikipedia…
  - “Knowledge” management: incl. true “content management”, document management, communication management, information management
- “popularized” terminology
“Terminology and other language and content resources”
ONE methodology

Terminology today
Given its pervasive occurrence in all (written or spoken) domain communication, terminology today has to be considered an economic factor, especially in:
- product data description and management (incl. eCatalogues and product classification)
- quality management
- inter-cultural aspects of management and marketing
- translation and localization
- information, documentation, software development
- knowledge transfer, teaching and training, …
Multilinguality and cultural diversity
- terminology science as a field of fundamental research as well as applied R&D
- impact on standardization

Terminology in ISO/TC 37
Multifunctional nature of terminology:
- Terminology as knowledge representation
- Terminologies as means of domain communication
- Terminologies as means of access to other kinds of information (objects)
- Terminologies as means of knowledge ordering at micro-level

+ Language resource management
Language resources:
- Text corpora: tagging (on the basis of grammar models)
- Lexicographical data: words, collocations, morphology
- Terminology
- Speech data
LR management:
- Input / import
- Metadata (incl. bundling/bindings etc.)
- Data modelling & metamodel(s)
- Exchange / interoperability
- etc.

+ other kinds of content entities
Textual & non-linguistic types of content:
- Audio information (e.g. read-out written content)
- AV information (e.g. sign language)
- Multimedia information
- Haptic information (e.g. in “intelligent cars”)
- …
Increasingly, different (technical) types of content co-occur, are embedded in each other or are combined with each other, e.g. in traffic telematics.

ISO/TC 37 – Standardization of terminological principles and methods
- Fundamental principles
- Vocabulary of terminology
- Terminography
- Language resource management
- Terminology work (especially systematic ~~)
- Applications based on terminology methods
- Content management?
- eContent, mContent
- Multilingual, multimodal, multimedia, universal accessibility, multi-channel
- Re-usability
- interoperability/ies
- Resource-sharing
- peer2peer

ISO/TC 37
Old title: Terminology and other language resources
Old scope: Standardization of principles, methods and applications relating to terminology and other language resources
New title: Terminology and language and content resources
New scope: Standardization of principles, methods and applications relating to terminology and other language and content resources in the contexts of multilingual communication and cultural diversity
As is the case with terminologies, language resources in general have to be considered as multilingual, multimedia and multimodal from the outset.
Generic fundamental standards for all activities involving language

ISO/TC 37/SC 1 (1)
Title: Principles and methods
Old scope: Standardization of basic principles and methods for developing scientific and technical terminologies and other language resources
New scope: ??? still under discussion
ISO/TC 37/SC 1 prepares the meta-standards for the documents prepared by ISO/TC 37/SCs 2, 3 and 4, which cannot be consistent and coherent without these standards. The same applies to the documentation of content management in organizations.

ISO/TC 37/SC 1 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 1:
- ISO 704:2000 Terminology work – Principles and methods
- ISO 860:1996 Terminology work – Harmonization of concepts and terms
- ISO :2000 Terminology work – Vocabulary – Part 1: Theory and application
The following standards are under preparation:
- ISO/CD 704 Terminology work – Principles and methods
- ISO/CD 860 Terminology work – Harmonization of concepts and terms
- ISO/PWI Terminology work – Vocabulary – Part 1: Theory and application
- ISO/WD 22134 Practical guide for socioterminology

ISO/TC 37/SC 2 (1)
Title: Terminography and lexicography
New scope: Standardization of terminological and lexicographical working methods, procedures, coding systems, workflows, and cultural diversity management, as well as related certification schemes
Tens of thousands of terminology commissions, committees and other terminological entities (especially terminology standardizing SCs and WGs within the standardization framework) are using ISO/TC 37/SC 2 standards. This indirectly improves the overall degree of re-usability and interoperability of the resulting data and documents.

ISO/TC 37/SC 2 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 2:
- ISO 639-1:2002 Codes for the representation of names of languages – Part 1: Alpha-2 code
- ISO 639-2:1998 Codes for the representation of names of languages – Part 2: Alpha-3 code
- ISO 1951:1997 Lexicographical symbols and typographical conventions for use in terminography
- ISO 10241:1992 International terminology standards – Preparation and layout
- ISO 12199:2000 Alphabetical ordering of multilingual terminological and lexicographical data represented in the Latin alphabet
- ISO 12616:2002 Translation-oriented terminography
- ISO 15188:2001 Project management guidelines for terminology standardization

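To illustrate how the two-letter codes of ISO 639-1 relate to the three-letter codes of ISO 639-2, here is a small sketch. The table is a hand-picked excerpt of four languages (using the terminological variant of the alpha-3 codes), and the function name `alpha2_to_alpha3` is a hypothetical helper; a real application would load the complete code tables from their official maintainers.

```python
# Hand-picked excerpt: ISO 639-1 alpha-2 code -> (ISO 639-2 alpha-3 code, English name).
# Not a complete table; for illustration only.
ISO_639_EXCERPT = {
    "en": ("eng", "English"),
    "fr": ("fra", "French"),
    "de": ("deu", "German"),
    "bm": ("bam", "Bambara"),
}


def alpha2_to_alpha3(code: str) -> str:
    """Map an alpha-2 code to its alpha-3 code (hypothetical helper).

    The input is trimmed and lower-cased first; a KeyError is raised for
    codes that are not part of the excerpt above.
    """
    normalized = code.strip().lower()
    alpha3, _name = ISO_639_EXCERPT[normalized]
    return alpha3


print(alpha2_to_alpha3("FR"))
print(alpha2_to_alpha3("bm"))
```

Tagging every term and definition with such a code is what lets multilingual terminological data be exchanged without ambiguity about the language of each item.
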
ISO/TC 37/SC 2 (3)
The following standards are under preparation:
- ISO/CD 639-3 Codes for the representation of names of languages – Part 3: Alpha-3 code for comprehensive coverage of languages
- ISO/WD 639-4 Codes for the representation of names of languages – Part 4: Implementation guidelines and general principles for language coding
- ISO/WD 639-5 Codes for the representation of names of languages – Part 5: Alpha-3 code for language families and groups
- ISO/CD 639-6 Codes for the representation of names of languages – Part 6: Extension coding for language variation
- ISO/DIS 1951 Presentation/representation of entries in dictionaries
- ISO/CD Terminological entries in standards – Part 1: General requirements
- ISO/AWI Terminological entries in standards
- ISO 12615 Bibliographic references and source identifiers for terminology
- ISO/PWI TR 22128 Quality assurance guidelines for terminology products
- ISO/PWI 22130 Additional language coding
- ISO/NP 23185 Assessment and benchmarking of terminological holdings

ISO/TC 37/SC 3 (1)
Old title: Computer applications for terminology
New title: Terminology management systems and content interoperability
New scope: Standardization of principles and requirements for semantic interoperability, terminology and content management systems, and knowledge ordering tools
Software developers use the documents of ISO/TC 37/SC 3 to design terminology management systems (TMS) or terminology management modules to be integrated into content management as well as information and knowledge management systems. In this way the terminological principles and methods (provided by ISO/TC 37/SC 1) are directly integrated as ‘defaults’ into concrete system design for handling all kinds of information.

ISO/TC 37/SC 3 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 3:
- ISO :2000 Terminology work – Vocabulary – Part 2: Computer applications
- ISO 6156:1987 Magnetic tape exchange format for terminological/lexicographical records (MATER) (withdrawn)
- ISO 12200:1999 Computer applications in terminology – Machine-readable terminology interchange format (MARTIF) – Negotiated interchange
- ISO 12620:1999 Computer applications in terminology – Data categories
- ISO 16642:2003 Computer applications in terminology – Terminological markup framework

ISO/TC 37/SC 3 (3)
The following standards are under preparation:
- ISO/PWI TR 12618 Computational aids in terminology – Design, implementation and use of terminology management systems
- ISO/CD Computer applications in terminology – Data categories – Part 1: Model for description and procedures for maintenance of data category registries for language resources
- ISO/CD Computer applications in terminology – Data categories – Part 2: Terminological data categories

ISO/TC 37/SC 4 (1)
Title: Language resource management
Scope: Standardization of specifications for computer-assisted language resource management
Given the fact that:
- linguistic infrastructures are being established or reinforced as part of the rapidly evolving information and communication society;
- professional activities involving language resource sharing and standardization are increasing in diverse areas: governmental or non-governmental organizations, public or private institutions, educational institutions, commercial enterprises, etc.;
- both globalization and localization necessitate multilingual communication;
there is an increasing need for new standardization as well as urgent recognition of existing de facto standards and their transformation into International Standards.

ISO/TC 37/SC 4 (2)
The following standards are under preparation:
- ISO/AWI 21829 Terminology for language resources
- ISO/CD Language resource management – Feature structures – Part 1: Feature structure representation
- ISO/WD 24611 Language resource management – Morphosyntactic annotation framework
- ISO/WD 24612 Language resource management – Linguistic annotation framework
- ISO/WD 24613 Language resource management – Lexical markup framework
- ISO/AWI Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 1: General principles and methods
- ISO/AWI Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 2: Word segmentation for Chinese, Japanese and Korean
- ISO/NP Word segmentation of written texts for mono-lingual and multi-lingual information processing – Part 3: Word segmentation for other languages

State-of-the-art: methodology and applications (diagram)
- Metamodels: ISO 16642* (family of) metamodels*
- Data models: ISO 12200**; data models** for eBusiness and other e...s**
- Data categories: ISO 12620***; domain data dictionaries*** (DDDs)
- Foundation: basic principles and requirements concerning multilingual e/m-content development, data categories/metadata, data modelling, rules for repositories (maintained in MAs/RAs/Reg’s)
* ISO TMF; ISO EXPRESS; ISO SDAI; …
** ISO MARTIF; ISO PLIB ~ IEC
*** ISO Data categories; ISO Fastener dictionary; IEC Core dictionary; …

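The layering in this diagram (shared data categories at the bottom, data models built from them, records checked against a data model) can be sketched in a few lines. Everything named here is an assumption made for illustration: the category names are not quoted from ISO 12620, and `make_data_model` and `validate` are hypothetical helpers rather than part of any standard or library.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical shared registry of data categories (illustrative names only).
DATA_CATEGORIES: Set[str] = {"term", "definition", "language", "source", "subjectField"}


def make_data_model(name: str, required: Iterable[str],
                    optional: Iterable[str] = ()) -> Dict[str, object]:
    """Build a data model that may only use categories from the registry."""
    required_set, optional_set = set(required), set(optional)
    unknown = (required_set | optional_set) - DATA_CATEGORIES
    if unknown:
        raise ValueError(f"unregistered data categories: {sorted(unknown)}")
    return {"name": name, "required": required_set, "optional": optional_set}


def validate(record: Dict[str, str], model: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the record fits the model."""
    problems = []
    for category in sorted(model["required"] - set(record)):
        problems.append(f"missing: {category}")
    allowed = model["required"] | model["optional"]
    for category in sorted(set(record) - allowed):
        problems.append(f"not in model: {category}")
    return problems


glossary_model = make_data_model("glossary", ["term", "language"], ["definition"])
print(validate({"term": "interoperability", "language": "en"}, glossary_model))
print(validate({"term": "interoperability", "note": "draft"}, glossary_model))
```

Because every data model draws on the same registry, two applications that use different models can still recognize which fields they have in common, which is the interoperability argument the diagram makes.
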
Semantic interoperability standards
- Content-related requirements
- Workflow methodology
- Metadata
- Metadata repositories
- Data modelling principles and requirements
- Micro data models
- Metamodels
- Content repositories
- Federation of repositories
- …

Conferences
- Terminology Summer School – Cologne (Germany) /23
- TAMA 2005 “Terminology in Advanced Management Applications” – Wiesbaden (Germany)
- TKE 2005 “Terminology and Knowledge Engineering” – Copenhagen (Denmark) /19
- OFMR 2006 “Open Forum on Metadata Registries” – Japan /22

Thank you for your attention
ISO/TC 37 Secretariat:
- Secretary: Christian Galinski
- Chairman: Håvard Hjulstad (SN)
Address:
ISO/TC 37 c/o Infoterm – International Information Centre for Terminology
Aichholzgasse 6/12
A-1120 Vienna – Austria