CIT3611 Software i18n Wk 4: Code sets, Online Help, Prototyping David Tuffley School of Computing & IT Griffith University.

1 CIT3611 Software i18n Wk 4: Code sets, Online Help, Prototyping David Tuffley School of Computing & IT Griffith University

2 Internationalisation - Basic Rules (CIT3611 Week 5: Code sets)
- Never hard-code translatable text
- Do not reuse the same string in different contexts
- 1 byte ≠ 1 character ≠ 1 glyph
- Watch for strings with several parameters
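The first and last rules can be illustrated with a small sketch. It assumes a hypothetical in-memory catalogue (`CATALOG`) and helper (`render`), neither of which comes from the slides; in a real project the strings would live in external resource files. Named placeholders let each translation reorder the parameters without any change to the calling code.

```python
# Hypothetical message catalogue: illustrative only. Real projects keep
# these strings in per-locale resource files, not in the source code.
CATALOG = {
    "en": "{user} copied {count} files to {folder}.",
    "de": "{count} Dateien wurden von {user} nach {folder} kopiert.",
}


def render(locale: str, **params: object) -> str:
    """Look up the template for a locale and fill in its named parameters."""
    return CATALOG[locale].format(**params)


english = render("en", user="Ana", count=3, folder="Backup")
german = render("de", user="Ana", count=3, folder="Backup")
print(english)
print(german)
```

With positional placeholders the German translator could not have moved the count to the front of the sentence.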

3 Internationalisation - Goals
Making sure that:
- Your application is able to process text from any locale
- The interface can be localised without changes in the source code
- The documents or data created by your application are easy to localise

4 Internationalisation - Code Sets
- A character set is like a "bag" of characters. Example: A, B, d, ñ
- A code set (also called a coded character set or code page) is a character set in which a specific value, the code (or code point), is assigned to each character. Example: A=65, B=66, d=100, ñ=241
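The example codes can be checked with a few lines of Python. This is only a sketch: the choice of the `latin-1` and `cp437` codecs is an assumption made for illustration, to show the same characters receiving different codes in two code sets.

```python
CHARS = "ABdñ"

# Codes assigned by ISO-8859-1 (Latin-1): these match the slide's example.
latin1_codes = {ch: ch.encode("latin-1")[0] for ch in CHARS}

# Codes assigned by the DOS code page 437: same characters, and the
# non-ASCII one (ñ) gets a different code.
cp437_codes = {ch: ch.encode("cp437")[0] for ch in CHARS}

print(latin1_codes)
print(cp437_codes)
```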

5 Code Sets - Get Your Facts Straight
- The vocabulary pertaining to code sets is often used incorrectly.
- The terms code set and code page are interchangeable.
- Microsoft documentation is confusing regarding code sets.
- Nadine Kano's book helps.

6 Windows "ANSI" is not the real ANSI
- The first version of Windows used ISO-8859-1 (Latin-1) as its code set. Microsoft then introduced 24 extra characters (codes from 0x80 to 0x9F) that are not part of Latin-1.
- This is still noticeable in some of the fonts shipped with Windows: MS Sans Serif has no glyph defined for these code points.
- The code set for US Windows should therefore be called Windows Latin-1, or code page 1252.
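The difference in the 0x80 to 0x9F range is easy to observe. The sketch below decodes the same bytes with Python's `cp1252` and `latin-1` codecs; the sample bytes are an assumption chosen for illustration.

```python
# The word "Hi" wrapped in bytes 0x93 and 0x94, as a Windows editor
# might save curly quotation marks.
raw = bytes([0x93, 0x48, 0x69, 0x94])

as_cp1252 = raw.decode("cp1252")    # Windows Latin-1: printable curly quotes
as_latin1 = raw.decode("latin-1")   # ISO-8859-1: C1 control characters

print(repr(as_cp1252))
print(repr(as_latin1))
```

Labelling such a file as ISO-8859-1 therefore turns the quotation marks into invisible control characters.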

7 "ANSI" is not "Windows code set"
- Some documents call the Windows code set "ANSI" even when, in a differently localised version of Windows, it is actually the Windows Cyrillic, Windows Greek or Windows Turkish code set.
- Just as such documents use "OEM" to refer to the DOS code page, they should use a generic term for the Windows code set rather than "ANSI".

8 Don’t use ‘character sets’ or ‘charsets’ when you mean code sets
- A code set is an implementation of a character set.
- Several code sets can implement the same character set. In that case the list of supported characters is the same, but the codes are different. E.g. UCS-2 and UTF-8 are two different code sets, but both implement the Unicode character set.
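As a minimal sketch of this point, the same character can be encoded two ways. Python has no codec named UCS-2, so `utf-16-be` is used here as a stand-in; that substitution is an assumption, valid only because the two agree for characters in the Basic Multilingual Plane such as ñ.

```python
text = "ñ"  # one character from the Unicode character set (U+00F1)

utf8_bytes = text.encode("utf-8")       # two bytes: 0xC3 0xB1
ucs2_bytes = text.encode("utf-16-be")   # two bytes: 0x00 0xF1

print(utf8_bytes.hex(), ucs2_bytes.hex())

# Different codes, same character once decoded with the matching code set.
same_character = utf8_bytes.decode("utf-8") == ucs2_bytes.decode("utf-16-be")
```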

9 Don’t mix up file format and file code set
- People mix up the container and the content: the format of the file and its code set. They say "I saved this file in ASCII" when they really mean "I saved this file as plain text". A plain text file could be in ASCII, but it can also contain extended characters.
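A short check makes the distinction concrete. The helper name `is_pure_ascii` and the sample strings are hypothetical; the sketch simply tests whether the bytes of a plain text file fall inside the 7-bit ASCII range.

```python
def is_pure_ascii(data: bytes) -> bool:
    """Return True only if every byte is a 7-bit ASCII code (0x00-0x7F)."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


plain_english = "plain text".encode("latin-1")
plain_french = "café".encode("latin-1")   # still plain text, but not ASCII

print(is_pure_ascii(plain_english), is_pure_ascii(plain_french))
```

Both byte strings are plain text; only the first is ASCII.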

10 Code Set - Families
- DOS
- ISO
- Macintosh
- Windows
- IBM mainframe

11 Code Sets - Unicode
- Unicode is an international character set
- It covers the principal scripts of the world
- The Unicode standard is the foundation for the internationalisation and localisation of software
- There are three levels of support for Unicode:
  1. Combining characters not allowed
  2. Avoid duplicate coded representations
  3. All combining characters are allowed

12 Han unification
- To fit the tens of thousands of Chinese, Japanese and Korean ideograms into a space of 64K (65,536) codes, Unicode uses Han unification: Japanese and Korean characters that derive from Chinese characters share a single code with them.
- In many cases the same symbol will mean the same thing.

13 Character Composition
- To support complex characters with diacritics, Unicode defines a generic way to encode them. Instead of being coded in whole (precomposed) form, any character with diacritics can be coded as a base character followed by non-spacing marks.
- Character composition is used, for example, to encode Vietnamese characters.
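Composition can be observed with Python's standard `unicodedata` module. The sketch below takes one Vietnamese letter as an assumed example and converts between its precomposed form and its base-plus-marks form using the NFD and NFC normalisation forms.

```python
import unicodedata

precomposed = "\u1ebf"  # Vietnamese letter: e with circumflex and acute

# NFD splits the letter into a base character plus non-spacing marks.
decomposed = unicodedata.normalize("NFD", precomposed)
code_points = [f"U+{ord(ch):04X}" for ch in decomposed]
print(code_points)  # ['U+0065', 'U+0302', 'U+0301']

# A non-zero combining class identifies the non-spacing marks.
marks = [ch for ch in decomposed if unicodedata.combining(ch)]

# NFC recomposes the sequence back into the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
```

Both forms display as the same glyph, which is one reason a character count cannot be read off as a glyph count.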

14 Surrogates
- Hopefully you will not have to deal with surrogates. They are the mechanism put in place in Unicode to access the additional planes of ISO 10646. You can think of them as "double-bytes", except that they are double wide characters.
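The surrogate mechanism is a small piece of arithmetic, sketched below. The function name `to_surrogates` and the sample code point U+1F600 are assumptions for illustration; the result is cross-checked against Python's own UTF-16 encoder.

```python
from typing import Tuple


def to_surrogates(code_point: int) -> Tuple[int, int]:
    """Split a code point beyond U+FFFF into a high and a low surrogate."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = code_point - 0x10000            # 20 significant bits remain
    high = 0xD800 + (offset >> 10)           # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)          # bottom 10 bits
    return high, low


high, low = to_surrogates(0x1F600)
print(f"{high:04X} {low:04X}")  # D83D DE00

# Cross-check: the built-in encoder emits the same two 16-bit units.
encoded = chr(0x1F600).encode("utf-16-be")
```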

15 Code Sets - Conversion
- Converting from one code set to another is easy when you are only dealing with single-byte code sets.
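For single-byte code sets the conversion reduces to a 256-entry lookup table. The sketch below builds one from DOS code page 437 to Windows code page 1252; the helper name `build_table` and the `?` fallback for characters missing from the target are assumptions, not something the slides prescribe.

```python
def build_table(src: str, dst: str, fallback: bytes = b"?") -> bytes:
    """Build a 256-byte translation table between two single-byte code sets."""
    table = bytearray(256)
    for code in range(256):
        try:
            table[code] = bytes([code]).decode(src).encode(dst)[0]
        except UnicodeError:
            # Undefined in the source, or no equivalent in the target.
            table[code] = fallback[0]
    return bytes(table)


CP437_TO_CP1252 = build_table("cp437", "cp1252")

dos_bytes = "café".encode("cp437")                 # é is 0x82 in cp437
win_bytes = dos_bytes.translate(CP437_TO_CP1252)   # é is 0xE9 in cp1252
print(dos_bytes.hex(), win_bytes.hex())
```

Characters that exist only in the source, such as the DOS box-drawing characters, are replaced by the fallback, so the conversion is not always reversible.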

16 Screen-based help
- plain text "Read Me" files
- tutorial files
- custom integrated help
- sample files
- stand-alone hypertext help

17 General Guidelines
- Text Expansion
- Jargon, Humor, Use of Gender- or Culture-Related Roles, Characteristics, or Issues
- Consistency with Software, Hardware, and Documentation
- Hypertext Links
- Text Styles and Formatting

18 General Guidelines cont.
- On-Screen Controls
- File Format

19 Windows Online Help
- "Title" Footnote Text
- "Keyword List" Footnote Text
- Definitions (Pop-up Topics)

20 Prototyping: the key to success
- "Effective prototyping may be the most valuable core competence an innovative organisation can hope to have" (Michael Schrage)
- ‘Spec Driven’ organisations put much effort into developing a specification before proceeding with production
- ‘Prototype Driven’ organisations begin with an early prototype, then proceed through many iterations

21 Prototyping is the essential medium of:
- Information transmission
- Interaction
- Integration
- Collaboration

22 Work as play, play as work
- You can ‘play your way’ to successful, innovative product development
- This is at odds with traditional management models that champion predictability and control

23 Supported by research
- Research by Tabrizi & Eisenhardt (Stanford) looked at 72 product development projects in 36 companies across Asia, North America and Europe
- The most effective projects were those that iterated constantly
- The least effective were the hyper-organised "plan, plan, plan" planners
- Strong prototyping cultures therefore produce strong products
