Computer Science and Software Engineering University of Wisconsin - Platteville Note 9. Internationalization Yan Shi SE 3730 / CS 5730 Lecture Notes Part.


Computer Science and Software Engineering University of Wisconsin - Platteville Note 9. Internationalization Yan Shi SE 3730 / CS 5730 Lecture Notes Part of the contents are from Ibrahim Meru’s presentation slides

Terminology
• Internationalization (I18N)
  - The process of designing a software application so that it can be adapted to various languages and regions without engineering changes.
  - Making an application independent of any particular language or culture.
• Localization (L10N)
  - The process of adapting internationalized software for a specific region or language by adding locale-specific components and translating text.
• Globalization (G11N)
  - G11N = I18N + L10N + multilingual support
  - The application can handle users from multiple countries/regions and languages (simultaneously).

Scope of I18N Example

Special Attention for G11N
• Design and Implementation
  - DO NOT hard-code your text in the code.
  - Be aware of language and cultural differences.
• Testing
  - Must have testers who recognize language and cultural defects.
• Deployment and Sales
  - Must follow the business rules and regulations of the countries in which you sell.
  - Copyrights and anti-piracy practices.
• Installation
  - The installer must be multilingual so that it can direct users to their native language.
• Support and Maintenance
  - Must be able to communicate in the users' language and during their regular business hours.
  - All documentation must be kept synchronized with the product in every supported language.

Character Sets
• ASCII:
  - The most popular character standard.
  - Uses only 7 bits, so a maximum of 128 characters.
  - Adequate for English.
• Code Pages:
  - A table of values describing the character set for a particular language.
  - One code page per language or set of languages.
  - There are hundreds of code pages.
  - Different vendors may use different code page numbering.
• Unicode:
  - An effort to include all characters from previous code pages in a single character enumeration.
  - Originally designed as a 2-byte (16-bit) code. It now defines code points up to U+10FFFF, so a character takes 1 to 4 bytes depending on the encoding (UTF-8, UTF-16, or UTF-32).
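To make the difference between ASCII, code pages, and Unicode concrete, here is a minimal Python sketch. The sample characters and variable names are illustrative choices, not part of the original slides.

```python
# Encode one German character under several of the encodings named above.
ch = "ä"
cp437_bytes = ch.encode("cp437")     # OEM code page 437
cp1252_bytes = ch.encode("cp1252")   # Windows Western European code page
utf8_bytes = ch.encode("utf-8")      # Unicode, variable-length encoding
utf16_bytes = ch.encode("utf-16-le")  # Unicode, 2-byte code units


def is_ascii(text):
    """Return True if every character fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# The same byte means different things under different code pages:
# decoding with the wrong code page silently yields another character.
misread = cp1252_bytes.decode("cp437")

# A character outside the 16-bit range needs two UTF-16 code units (4 bytes).
emoji_utf16_len = len("😀".encode("utf-16-le"))

print(cp437_bytes, cp1252_bytes, utf8_bytes, utf16_bytes)
print(is_ascii("Hello"), is_ascii(ch), misread, emoji_utf16_len)
```

The one character is stored as a different byte value in each code page, which is why text files without a declared encoding are a common source of I18N defects.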

Code Page 437
• Standard in the U.S.
• Works for English and German
• 8-bit code points
  - 0-127: ASCII
  - 128-255: international text characters

Interesting to Know (Alt code): How to type German on a US keyboard?
PART 1 - For this German character, type...
These codes work with most fonts. Some fonts may vary. For the PC codes, always use the numeric (extended) keypad on the right of your keyboard and not the row of numbers at the top. (On a laptop you may have to use "num lock" and the special number keys.)

German letter/symbol   PC Code: Alt +   Mac Code: option +
ä                      0228             u, then a
Ä                      0196             u, then A
é                      0233             E
ö                      0246             u, then o
Ö                      0214             u, then O
ü                      0252             u, then u
Ü                      0220             u, then U
ß                      0223             S
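The PC codes in the table can be checked against a code page. The sketch below assumes that Alt codes typed with a leading zero are interpreted as Windows-1252 values, as on a Western European Windows setup. The names ALT_CODES and char_for_alt_code are hypothetical helpers for illustration.

```python
# PC Alt codes copied from the table above (leading zero dropped).
ALT_CODES = {
    "ä": 228, "Ä": 196, "é": 233, "ö": 246,
    "Ö": 214, "ü": 252, "Ü": 220, "ß": 223,
}


def char_for_alt_code(code):
    """Decode a single byte value using Windows-1252 (assumed code page)."""
    return bytes([code]).decode("cp1252")


# Collect any table entries that do not match the assumed code page.
mismatches = {
    ch: code for ch, code in ALT_CODES.items()
    if char_for_alt_code(code) != ch
}
print(mismatches)
```

An empty result means every table entry agrees with the assumed code page.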

Some Other Code Pages
• Microsoft Windows OEM code pages:
  - 437 (US)
  - 720 (Arabic)
  - 737 (Greek)
  - 775 (Baltic)
  - 850 (Multilingual Latin I): works for most Western European languages.
  - 852 (Latin II): works for Central and Eastern European languages.
  - 855 (Cyrillic)
  - 857 (Turkish)
  - 858 (Multilingual Latin I + Euro)
  - 862 (Hebrew)
  - 866 (Russian)
• 874 (Thai)
• 932 (Japanese Shift-JIS)
• 936 (Simplified Chinese GBK)
• 949 (Korean)
• 950 (Traditional Chinese Big5)
• 1258 (Vietnamese)

Size of Text Messages
• English requires fewer characters than most other Western languages. As a rule of thumb:
  - French is 15% longer.
  - German is 25% longer.
  - Eastern languages (traditional or simplified Chinese, Japanese, and Korean) require far fewer characters (2-3 character positions per word).
• UI design and functionality need special consideration to handle the different message lengths of the supported languages.
• Message length also greatly complicates the design of business forms and reports.
• E.g.: “Contact customer support for help”
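The rule of thumb above can be turned into a rough layout check. This is only a sketch: the EXPANSION table and the function reserved_width are hypothetical names, and the factors are the slide's approximations, not measured values.

```python
import math

# Approximate text growth relative to English, taken from the rule of thumb.
EXPANSION = {"en": 0.00, "fr": 0.15, "de": 0.25}


def reserved_width(english_text, lang):
    """Estimate how many character positions to reserve for a translation."""
    return math.ceil(len(english_text) * (1 + EXPANSION[lang]))


message = "Contact customer support for help"
widths = {lang: reserved_width(message, lang) for lang in EXPANSION}
print(widths)
```

A UI test could compare each label's reserved width against the space actually available in the layout, before any translation exists.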

Keyboard Test
• Languages and cultures have different characters and special characters.
• Keyboards differ from country to country to support their character sets and usage patterns.
• These keyboards generate interrupts that must match the loaded code page.

German Keyboard

Arabic Keyboard

Traditional Chinese Keyboard

Hot Key Test
• We may want hot keys and shortcuts to differ because the words on the menus differ.
  - “Copy” is Alt+C; what should it be for “kopieren”?
• Hot key conventions differ: sometimes applications just keep the English hot key or shortcut regardless of the letter the localized command starts with.

Text Filter and Special Character Test
• Sometimes software blocks codes other than ASCII. These codes may be needed to support non-English languages.
• Special characters in the middle of names may cause problems.
  - For example “O’Kelly”, ñ, ß, Ü.
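The kind of filter this test targets can be sketched as follows. Both validators are hypothetical examples, and the set of allowed punctuation is an assumption chosen for illustration.

```python
import re
import unicodedata

# A typical ASCII-only filter: rejects many legitimate names.
ASCII_ONLY = re.compile(r"[A-Za-z ]+")

# Punctuation assumed to be acceptable inside a name (illustrative choice).
ALLOWED_PUNCTUATION = {"'", "\u2019", "-", " "}


def is_valid_name_ascii(name):
    """Accept only unaccented Latin letters and spaces."""
    return ASCII_ONLY.fullmatch(name) is not None


def is_valid_name_unicode(name):
    """Accept any Unicode letter or combining mark, plus allowed punctuation."""
    if not name:
        return False
    return all(
        ch in ALLOWED_PUNCTUATION
        or unicodedata.category(ch)[0] in ("L", "M")
        for ch in name
    )


samples = ["O\u2019Kelly", "Muñoz", "Weiß", "Über", "Smith"]
results = {
    s: (is_valid_name_ascii(s), is_valid_name_unicode(s)) for s in samples
}
print(results)
```

Names like these make good test data: an application that accepts only "Smith" from this list has a text filter defect.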

Translation Test
• Typical English sentence structure is “S-V-O” (subject-verb-object).
• Sentence structure may differ from language to language. Therefore, the software must be language sensitive with respect to sentence structure.
  - Use variables (placeholders) in messages so that translators can place them in any order.
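One way to let the variables appear in any order is to use named placeholders in the message templates. The sketch below is an illustration only: the MESSAGES table, the render function, and the sample sentences are hypothetical.

```python
# One template per language; placeholders are named, so each translation
# is free to put them in whatever order its grammar requires.
MESSAGES = {
    "en": "{user} deleted {count} files.",
    "de": "{count} Dateien wurden von {user} gelöscht.",
}


def render(lang, **values):
    """Fill the template for the given language with named values."""
    return MESSAGES[lang].format(**values)


english = render("en", user="Anna", count=3)
german = render("de", user="Anna", count=3)
print(english)
print(german)
```

Building the sentence by concatenating fragments such as user + " deleted " + count would hard-code the English word order and could not be translated this way.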

Sorting Rules
• Where do the characters of a specific language need to fall in a collating sequence?
• This needs to be localized for people to use lists naturally.
• English sorts by normal ASCII value sequence.
• How to sort Chinese names?
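The collation problem is easy to reproduce. The sketch below compares plain code-point sorting with a simplified accent-folding key. The key function fold_key is a hypothetical helper and a deliberate simplification: it is not a real locale collation, and a production system would more likely rely on a locale-aware collation library.

```python
import unicodedata


def fold_key(word):
    """Rough sort key: strip combining accents, ignore case, then tie-break."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), word)


# Normalize to NFC so the comparison does not depend on how the
# source file stores accented letters.
names = [
    unicodedata.normalize("NFC", n)
    for n in ["Zebra", "Äpfel", "Apfel", "Öl", "Ofen"]
]

by_code_point = sorted(names)           # accented words land after "Zebra"
by_folded = sorted(names, key=fold_key)  # accented words sort near their base

print(by_code_point)
print(by_folded)
```

Even this folded order is only one possible convention. Different languages place the same accented letters differently, which is why the collating sequence itself must be localized.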

Other Peripherals
• Printer:
  - Some printers do not support certain languages.
  - Testers must be aware of these non-I18N printers and test for compatibility.
  - Paper sizes may also cause issues: A4 or Letter?
• Mouse with non-standard drivers
• Wireless support: GSM, CDMA, 3G, LTE
• Data storage: DVD, flash drive…

OS Localization Test
• There is not just one Windows 7: there are Windows 7 German, French, Chinese, etc. The product needs to be tested completely on every supported OS localization.

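Data Format
• Date formats: what does "01/02/03" mean?
• Time zones and daylight saving time?
• Number formats: the same digits can be written with different grouping and decimal separators (for example 240,125).
• Money symbols vary: $125,000 vs. £125,000?
• Address formats
• Phone number formats
• Calendar formats
• Measurement units! (The Mars Climate Orbiter was lost to a metric/imperial unit mix-up.)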
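The date ambiguity can be demonstrated directly. This sketch parses the same string under three plausible conventions. The READINGS table and its labels are illustrative assumptions.

```python
from datetime import datetime

RAW = "01/02/03"

# Three plausible interpretations of the same input string.
READINGS = {
    "month/day/year": "%m/%d/%y",
    "day/month/year": "%d/%m/%y",
    "year/month/day": "%y/%m/%d",
}

# Parse the string under each convention and report it unambiguously.
parsed = {
    label: datetime.strptime(RAW, fmt).date().isoformat()
    for label, fmt in READINGS.items()
}
for label, iso_date in parsed.items():
    print(f"{label}: {iso_date}")
```

All three parses succeed without error, so a test cannot rely on exceptions to catch a wrong date convention. It has to compare against the expected date for each locale.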

Colors  Colors are interpreted differently among regions.

Icon Design
• Avoid humor, puns, slang, special, mythological, and religious symbols in icons.
• Do not require the user to understand subtleties of the originating language or culture.
• Ensure your icons are not offensive.
  - Thumbs up: insulting in Turkey
  - “Ok” sign: insulting in Brazil and other countries

Summary
• I18N, L10N, G11N
• Design Considerations: