Standardization supporting cultural diversity 1: Character repertoires, ordering and assignments to the 12-key telephone keypad for European languages.


Standardization supporting cultural diversity 1: Character repertoires, ordering and assignments to the 12-key telephone keypad for European languages and languages used in Europe. Martin Böcker, Karl Ivar Larsson and Bruno von Niman (ETSI STF 300). 20th International Symposium on Human Factors in Telecommunication

Agenda Background ETSI STF 300 Overview of the task Methodology Character repertoires, ordering and keypad assignments Summary

Background to STF 300 The problem: How to enable people to use ICT in their own language? Before ES 202 130, there was only a standard on assigning ‘A’ to ‘Z’ to the 12-key keypad. The assignment of other Latin, Greek and Cyrillic characters was not standardised.

Background to STF 300 The problem: ES 202 130 defined the assignment of the characters of major European languages to the 12-key keypad and defined sorting orders. Languages covered were those of the European Union countries (as of 2006), candidate countries (Romania, Bulgaria, and Turkey), and the countries of the European Free Trade Association, EFTA (Norway, Iceland, Switzerland, and Liechtenstein), as well as Russia.

Character Repertoires and Ordering

Overview of the task Languages not covered so far but to be covered now: Official languages of the remaining countries (e.g. Croatian and Ukrainian); Official minority languages of European countries (e.g. Welsh and Sorbian); Important, but not officially recognised European minority languages (e.g. Basque and Breton); Important immigrants’ languages spoken in Europe (e.g. Arabic and Urdu); Other languages of interest to manufacturers (e.g. Hebrew and Pinyin).

Overview of the task Co-operate with key industry players and recognized experts; update ETSI Standard 202 130 “Character repertoires, ordering rules and assignment to the 12-key telephone keypad (European languages)”

Overview of the task Devices with telecommunication functionality are the largest consumer product segment in the world. Cultural and linguistic diversity is one of the key strengths of Europe. Easy, correct and efficient text input, search and retrieval via the telephone keypad is a basic user requirement. The task takes into account work previously performed in ETSI, ITU-T, CEN/TC304 and ISO/IEC JTC1.

Method Identify list of languages to be covered; initial proposal based on studies; industry consensus meeting; initial international round of comments; voting according to ETSI procedures

Character Repertoires and Ordering Letter repertoires and ordering Language-independent repertoires and ordering (e.g. Latin, Greek, and Cyrillic) Language-specific repertoires and ordering Keypad assignment of digits and letters Language-independent keypad assignment (e.g. Latin, Greek, and Cyrillic) of digits and letters Language-specific keypad assignment of digits and letters

Letter Repertoires and Ordering Principles 1 Combine repertoire and ordering information in one table Provide language-independent tables per script (e.g. Latin, Cyrillic, Greek)

Letter Repertoires and Ordering Principles 2 Describe letters in terms of standardized identifiers: Letter (representation of the letter); GSM 03.38 7-bit coding; ISO/IEC 6937 coding; ISO/IEC 10646 (Unicode) identifier; ISO/IEC 10646 (Unicode) name. Order characters according to established standards, e.g. the Latin and Cyrillic language-independent repertoires are ordered according to ENV 13710.

Letter Repertoires and Ordering Principles 3 Language-independent repertoires: Latin: covers all Latin-based letters covered by the scope of the document. Cyrillic: repertoire according to ISO/IEC 8859-5:1998 (applies to Bulgarian, Belarusian, Macedonian, Russian, Serbian and Ukrainian). The Greek-script repertoire is identical with the Greek language-specific repertoire. Develop repertoires for the other scripts included in the scope (e.g. Arabic and Georgian). Provide a minimum Latin subset (“A – Z”) to be used with the non-Latin-based repertoires.

Letter Repertoires and Ordering

Letter Repertoires and Ordering Principles 4 Language-specific repertoires: List the essential alphabet of a particular language and the letters typically used in that language (from various recognised sources). Usage type: a classification of each letter according to the following principles: A: letters essential to the language; B: letters commonly used in writing the language, but not essential for it. Notes: indication of special character ordering conditions for the language (explained in table notes at the bottom of the table).

Letter Repertoires and Ordering

Letter Repertoires and Ordering Example Ordering in Czech ábeti amoniak anton ápoteka äbeti bertil …

Letter Repertoires and Ordering Principles 5 Repertoire of digits and special characters Only one (European) language-independent table of digits and special characters is provided The need for language-specific tables is to be discussed The digits and special characters are ordered (at present) according to ISO/IEC 14651 resp. CEN ENV 13710

Letter Repertoires and Ordering

Keypad Assignment Tables Principles 6 The keypad assignment tables contain the following information: Key: the key of the 12-key keypad the respective letters are assigned to Letter: Representation of the letter ISO/IEC 10646 (Unicode) identifier ISO/IEC 10646 (Unicode) name
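A row of such a table can be derived mechanically from the letter itself. The following is a minimal sketch, assuming Python's standard `unicodedata` module as the source of the ISO/IEC 10646 (Unicode) names; `table_row` is a hypothetical helper and not part of the standard.

```python
import unicodedata

def table_row(key, letter):
    """Build one (key, letter, identifier, name) row (hypothetical helper)."""
    # The identifier is the code point in the usual U+XXXX notation.
    identifier = f"U+{ord(letter):04X}"
    # unicodedata.name() returns the official character name.
    return (key, letter, identifier, unicodedata.name(letter))

print(table_row("2", "ä"))
# ('2', 'ä', 'U+00E4', 'LATIN SMALL LETTER A WITH DIAERESIS')
```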

Keypad Assignment Tables Principles 7 If a character is assigned to a key of the 12-key keypad, it shall be assigned to the key specified in the respective table. Letters with diacritical marks are assigned to the same key of the 12-key keypad as their respective basic letters (if existent), i.e. "ä" is assigned to key "2" because "a" is assigned to "2" according to ITU-T E.161. A character may be additionally assigned to other keys. Complete language-independent and language-specific tables may be implemented in any combination.
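The rule for letters with diacritical marks can be illustrated with a short sketch. It assumes that Unicode canonical decomposition (NFD) is an acceptable stand-in for "basic letter"; the standard's own tables remain the authority, and `key_for` is a hypothetical helper.

```python
import unicodedata

# ITU-T E.161 assignment of the basic Latin letters to keys 2-9.
E161 = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in E161.items() for ch in letters}

def key_for(letter):
    """Return the E.161 key of a letter's basic letter, or None."""
    # NFD splits "ä" into "a" plus a combining diaeresis, so the first
    # code point of the decomposition is the basic letter.
    base = unicodedata.normalize("NFD", letter.lower())[0]
    return LETTER_TO_KEY.get(base)

print(key_for("ä"))  # 2
print(key_for("Š"))  # 7
print(key_for("ø"))  # None
```

Letters such as "ø" have no canonical decomposition, so a lookup like this returns nothing for them; in practice they would need explicit table entries.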

Keypad Assignment Tables Principles 8 Non-Latin-based repertoires (e.g. the Greek-language repertoire and the Cyrillic-script repertoire) are assigned together with the minimum Latin-script repertoire. Additional characters not covered by the present document may be assigned to a key. Only tables for the assignment of small letters are specified; capital letters shall be assigned in the same way as the respective small letters.

Keypad Assignment Tables Principles 9 Latin-script letters are assigned in the following order: Letters assigned to that particular key according to ITU-T E.161 (e.g. "abc" to key "2") The digit for the respective key according to ITU-T E.161 Type A letters according to the tables in Section 6 (e.g. "ä" on key "2" for German) Type B letters according to the tables in Section 6 (e.g. "à" on key "2" for German) (e.g. the resulting assignment for key "2" for German is "abc2äà")
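The ordering rule for Latin-script keys can be sketched as a simple concatenation. This is an illustration only: `latin_key_sequence` is a hypothetical helper, and the Type A and Type B letters passed in are just the German key "2" examples given on the slide, not the full tables of the standard.

```python
# ITU-T E.161 assignment of the basic Latin letters to keys 2-9.
E161 = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def latin_key_sequence(key, type_a="", type_b=""):
    """Order: E.161 letters, the digit, Type A letters, Type B letters."""
    return E161.get(key, "") + key + type_a + type_b

# German, key "2": "ä" is Type A and "à" is Type B in the slide's example.
print(latin_key_sequence("2", type_a="ä", type_b="à"))  # abc2äà
```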

Keypad Assignment Tables

Keypad Assignment Tables Principles 10 The language-independent Latin-script assignment: Letters are assigned according to the above-mentioned principles and ordered according to ISO/IEC 14651 resp. CEN ENV 13710

Keypad Assignment Tables

Keypad Assignment Tables Principles 11 Non-Latin-based letters (e.g. Greek-script and Cyrillic-script letters) are assigned in the following order: Letters assigned to that particular key in alphabetic order (e.g. "абвг" to key "2"); alternative assignments have been discussed and discarded. The digit for the respective key according to ITU-T E.161. Latin letters assigned to that particular key according to ITU-T E.161 (e.g. "abc" to key "2"). For example, the resulting assignment for key "2" for Russian is "абвг2abc".
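For non-Latin scripts the same idea applies with a different order. The sketch below is a hedged illustration: `non_latin_key_sequence` is a hypothetical helper, and the native letters for a key are supplied by the caller (here the Russian key "2" example from the slide) rather than taken from the standard's tables.

```python
# ITU-T E.161 assignment of the basic Latin letters to keys 2-9.
E161 = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def non_latin_key_sequence(key, native_letters):
    """Order: native-script letters, the digit, then the E.161 Latin letters."""
    return native_letters + key + E161.get(key, "")

# Russian, key "2", using the letters given in the slide's example.
print(non_latin_key_sequence("2", "абвг"))  # абвг2abc
```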

Keypad Assignment Tables

Keypad Assignment Tables Principles 12 The characters of the Non-Latin-based scripts (e.g. the Greek-language and the Cyrillic-script tables) are ordered according to ISO/IEC 14651 resp. CEN ENV 13710

Keypad Assignment Tables

Keypad Assignment Tables for Digits and Special Characters Principles 13 Digits and special characters are addressed in the language-independent table. Language-specific tables are to be discussed. The numbers of the respective keys are assigned according to ITU-T E.161. Currencies are assigned to the ‘*’-key. Mathematical symbols are assigned to the ‘#’-key. The plus-sign (including the functionality as the international access code ’00’) is assigned to the ‘0’-key. All other special characters are assigned to the ‘1’-key.
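These routing rules can be approximated in a few lines. The sketch assumes that the Unicode general categories "Sc" (currency symbol) and "Sm" (math symbol) are a reasonable proxy for the standard's notion of currencies and mathematical symbols; that is an assumption of this example, and the standard's own table decides the actual assignment. `special_char_key` is a hypothetical helper.

```python
import unicodedata

def special_char_key(ch):
    """Return the key a digit or special character would be routed to."""
    if ch in "0123456789":
        return ch                      # digits sit on their own key (E.161)
    if ch == "+":
        return "0"                     # plus-sign goes to the "0" key
    category = unicodedata.category(ch)
    if category == "Sc":
        return "*"                     # currency symbols
    if category == "Sm":
        return "#"                     # mathematical symbols
    return "1"                         # all other special characters

for ch in "€=+?7":
    print(ch, special_char_key(ch))
```

Note that the plus-sign check has to come before the category test, because "+" is itself classified as a math symbol in Unicode. Characters such as the hyphen-minus are not in category "Sm" and land on key "1" in this sketch.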

Special characters The full set of special characters specified in the present document (in table 61) must be supported. In addition, other characters may also be supported. The order of appearance specified in table 61 is only a recommendation, valid for a language-independent implementation. Alternative orders of appearance of special characters are allowed. Language-specific orders of appearance are also allowed (e.g. having the inverted question mark and the inverted exclamation mark used in Spanish higher up in the list, for a Spanish-language implementation).

Special characters The full set of special characters must be accessible via one single entry point. It is recommended that this entry point is the "1" key. In addition, a device may use other keys to access different sets of special characters and/or digits. In this case, Rule 1 and Rule 6 must still be followed. This makes it possible to implement language-specific keypad assignments of special characters and digits.

Keypad Assignments

Summary ETSI Standard 202 130 offers character repertoires, sorting orders and keypad assignments for the major European languages. STF 300 will update the standard to also cover further European and non-European languages, in Latin-based and non-Latin scripts, including important minority languages. The standard is an excellent example of work funded by the Commission of the EU, hosted by ETSI and realised by industrial consensus.