15.1.2003 Software Engineering 2003 Jyrki Nummenmaa: GLOBALISATION

Presentation transcript:

Software Engineering 2003 Jyrki Nummenmaa 1 GLOBALISATION
I bet it is quite natural to dream about writing software that is sold around the world… However, there may be some small obstacles on the way to selling your software worldwide. Today we study potential problems and solutions.
Terms:
- Localisation = adjusting software locally
- Globalisation, internationalisation = creating software in such a way that it is easy to localise it to different countries.

2 History
As a little warm-up, consider the situation about 20 years ago. At that time there was an increasing interest in creating software products that could be sold to different customers within Finland. For customising the software, it was important to put all user interface constants in a place where they are easy to change. Program code is not such a place -> parameter files or a database are a much better choice.

3 Simple example
x.html deals with internationalisation issues. We have a quick look at their example. In the example, multilingual texts are managed using
- a locale, identified by a (country, language) pair
- resource bundles, one per locale
- property files, where strings are identified by keys
Strings are identified by keywords within a locale.
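The locale, bundle and key idea can be sketched in a few lines of Java. This is only an illustration, not the example from the page referenced on the slide: the class name, the keys and the in-memory "property files" are assumptions made to keep the sketch self-contained. A real application would more likely keep files such as Messages.properties and Messages_fi.properties on the classpath and load them with ResourceBundle.getBundle("Messages", locale).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class GreetingBundles {
    // Hypothetical in-memory stand-ins for property files, keyed by language code.
    // The empty key plays the role of the default (fallback) bundle.
    private static final Map<String, String> FILES = new HashMap<>();
    static {
        FILES.put("", "greeting=Hello\nfarewell=Goodbye\n");
        FILES.put("fi", "greeting=Hei\nfarewell=N\u00e4kemiin\n");
    }

    // Picks the property text for the locale's language, falling back to the default.
    static ResourceBundle bundleFor(Locale locale) {
        String text = FILES.getOrDefault(locale.getLanguage(), FILES.get(""));
        try {
            return new PropertyResourceBundle(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Looks up one string by its key within the bundle chosen for the locale.
    static String message(Locale locale, String key) {
        return bundleFor(locale).getString(key);
    }

    public static void main(String[] args) {
        System.out.println(message(Locale.forLanguageTag("fi-FI"), "greeting"));
        System.out.println(message(Locale.US, "greeting"));
    }
}
```

The program code only ever mentions the keys (greeting, farewell); the user-visible strings live outside it, which is the point of the slide.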

4 What should be globalised?
Messages, labels on GUI components, online help, sounds, colors, graphics, icons, dates, times, numbers, currencies, measurements, phone numbers, honorifics and personal titles, postal addresses, page layouts.
- Our example in the previous slide only dealt with a simple message!
- Labels can also be managed in a fairly straightforward manner, if enough space is reserved for them.
- Now let’s have a look at the rest…

5 Locales
As we saw, globalisation in Java is based on the use of locales. A locale is identified by a language (compulsory), country (optional) and variant (optional). A class whose behaviour depends on a locale is called locale-sensitive. You can find the locales available to a locale-sensitive class by using its getAvailableLocales() method. There is also a default locale for a Java Virtual Machine, and it can be accessed with Locale.getDefault(). Different objects may use different locales.
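A small sketch of these calls, assuming a Java 7 or later runtime (Locale.Builder is newer than this lecture; in 2003 one would have written new Locale("fi", "FI")). The class and method names are made up for the illustration.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class LocaleBasics {
    // Builds a locale from a language code and an optional country code.
    static Locale finnish() {
        return new Locale.Builder().setLanguage("fi").setRegion("FI").build();
    }

    // Asks a locale-sensitive class (here NumberFormat) whether it supports a locale.
    static boolean numberFormatSupports(Locale wanted) {
        for (Locale available : NumberFormat.getAvailableLocales()) {
            if (available.equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Locale fi = finnish();
        System.out.println(fi.getLanguage() + " / " + fi.getCountry());
        System.out.println("JVM default: " + Locale.getDefault());
        System.out.println("NumberFormat supports fi_FI: " + numberFormatSupports(fi));
    }
}
```

The default locale and the list of supported locales depend on the machine and the Java version, so the printed values will vary.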

6 Identify what needs to be managed through locales
As you think about locales, you will find out that you have
- data items such as messages and sounds, which change altogether with the locale,
- data items which remain the same but whose formatting changes, e.g. dates and numbers, and
- possibly data items not to be localised (internal use, interface to another application, …).
Design the globalisation - identify which is which. Arrange your data items into resource bundles (e.g. items for the same form in the same bundle, so that you will not need to load unnecessary objects).

7 Formats - numbers
Numbers are formatted differently in different countries, e.g.:
345 987,246 – France
345.987,246 – Germany
345,987.246 – US
Java includes a NumberFormat class that can be used to format numbers, currencies (no exchange rates, though :) and percentages. You can use the NumberFormat class both to create formatted strings and to parse strings. You can also provide your own patterns, if this is not enough for you…
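As a rough illustration of both directions (formatting and parsing), the following sketch uses NumberFormat with three locales. The class name and the sample value are chosen for the illustration only. The exact French grouping separator differs between Java versions (some use a no-break space, newer ones a narrow no-break space), so no output is quoted for it.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class NumberFormats {
    // Formats a number using the conventions of the given locale.
    static String format(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Parses a locale-formatted string back into a number.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        double value = 345987.246;
        for (Locale locale : new Locale[] {Locale.FRANCE, Locale.GERMANY, Locale.US}) {
            System.out.println(locale + ": " + format(value, locale));
        }
        // Currencies and percentages have their own factory methods.
        System.out.println(NumberFormat.getCurrencyInstance(Locale.US).format(value));
        System.out.println(NumberFormat.getPercentInstance(Locale.US).format(0.25));
        System.out.println(parse("345.987,246", Locale.GERMANY));
    }
}
```

For custom patterns, DecimalFormat accepts a pattern string such as "#,##0.00".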

8 Dates and Times
As with numbers, dates and times are represented differently in different countries. Similarly, there is a DateFormat class, which you can use to create standard date and time formats. Here again you may customise, and you may also define your own names for things such as weekdays.
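A minimal sketch of both ideas: a standard long date format per locale, and custom weekday names through DateFormatSymbols. The class name, the sample date and the two-letter weekday abbreviations are assumptions made for the illustration. The exact text of the standard formats depends on the Java version's locale data.

```java
import java.text.DateFormat;
import java.text.DateFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

public class DateFormats {
    // A fixed sample date (15 January 2003, a Wednesday) in the default time zone.
    static Date sampleDate() {
        return new GregorianCalendar(2003, Calendar.JANUARY, 15).getTime();
    }

    // Standard long date format for the given locale.
    static String longDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.LONG, locale).format(date);
    }

    // Weekday rendered with our own (hypothetical) two-letter names.
    static String customWeekday(Date date) {
        DateFormatSymbols symbols = new DateFormatSymbols(Locale.US);
        // Index 0 is unused; indices 1..7 correspond to Sunday..Saturday.
        symbols.setShortWeekdays(new String[] {"", "SU", "MO", "TU", "WE", "TH", "FR", "SA"});
        return new SimpleDateFormat("E", symbols).format(date);
    }

    public static void main(String[] args) {
        Date date = sampleDate();
        System.out.println(longDate(date, Locale.US));
        System.out.println(longDate(date, Locale.GERMANY));
        System.out.println(customWeekday(date));
    }
}
```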

9 Messages containing variable parts
Examples:
- 405,390 people have visited your website since January 1, (1)
- The number has been activated. (2)
Word order may change between languages, which may make it impossible to translate message (1) correctly if the translatable unit is assumed to be the text between the number and the date. In message (2) the word “activated” may require a different translation in some languages (e.g. French) depending on the gender of the word for the device name. Basic rule of thumb: if you can avoid messages containing these variable parts, then do so!

10 Class MessageFormat
With the MessageFormat class you can define a message template, which gives the message text and shows where and how to format the changing data. With ChoiceFormat, you can choose between strings based on a number you give as a parameter (this is particularly handy for managing plurals).
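The following sketch shows a template with a plural choice and a template whose placeholders appear in a different order, which is how a translator can change word order without touching the code. The class name, method names and the second pattern are invented for the illustration; the plural pattern follows the style used in the MessageFormat documentation.

```java
import java.text.MessageFormat;
import java.util.Locale;

public class MessageTemplates {
    // A choice sub-format picks the wording according to the number in argument 0.
    static final String FILES_PATTERN =
        "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.";

    static String fileCount(int count) {
        return new MessageFormat(FILES_PATTERN, Locale.US).format(new Object[] {count});
    }

    // The pattern decides where each argument goes, so word order is up to the translator.
    static String fill(String pattern, String who, String what) {
        return new MessageFormat(pattern, Locale.US).format(new Object[] {who, what});
    }

    public static void main(String[] args) {
        System.out.println(fileCount(0));
        System.out.println(fileCount(1));
        System.out.println(fileCount(1273));
        System.out.println(fill("{0} activated {1}.", "Anna", "the modem"));
        System.out.println(fill("{1} was activated by {0}.", "Anna", "The modem"));
    }
}
```

In practice the pattern strings themselves would come from a resource bundle, one per locale.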

11 Characters
- US ASCII – 7 bit.
- ISO 8859-X, where X is some number – an 8-bit system – if the 8th bit is 0, then the first 7 bits represent a US ASCII character.
- Windows 125x codepages – similar to ISO 8859-X, but not the same of course – typical Windows interoperability nightmare…
- Unicode – meant to represent all characters from all languages. Needs more bits (usually done with 16) but there are several encoding schemes. Some, for instance, use two bytes (16 bits) for some characters and one byte (8 bits) for others…
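To make the difference between encoding schemes concrete, this sketch counts how many bytes the same character takes in ISO 8859-1, UTF-8 and UTF-16. The class name and the sample characters are chosen for the illustration, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSizes {
    // Number of bytes needed to store the text in the given encoding.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // 'A', a-umlaut (U+00E4) and a Japanese character (U+65E5).
        String[] samples = {"A", "\u00e4", "\u65e5"};
        Charset[] charsets = {
            StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE
        };
        for (String sample : samples) {
            for (Charset charset : charsets) {
                System.out.println(sample + " in " + charset + ": "
                    + byteCount(sample, charset) + " byte(s)");
            }
        }
    }
}
```

Note that ISO 8859-1 cannot represent U+65E5 at all: getBytes silently substitutes a replacement byte, so the one byte reported there is not the original character.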

12 Chinese and Japanese
Thousands of symbols. Unicode can handle them, but you need more pixels on the screen as well. In Japanese there are several writing systems. Text input can be done as follows:
1. The user types in the word in some phonetic writing system based on Latin characters.
2. The system shows the characters (there may be many) matching the phonetic writing.
3. The user picks the right character.

13 Korean
In the Korean writing system (hangul), characters are composed from parts, based on which part follows which. There is a limited number of building blocks, i.e. character parts: 24 basic letters (14 consonants and 10 vowels).
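Unicode reflects this composition: a precomposed hangul syllable can be split into its parts by canonical decomposition. The sketch below uses java.text.Normalizer (Java 6 or later) for this; the class name and the sample syllable U+D55C are chosen for the illustration.

```java
import java.text.Normalizer;

public class HangulParts {
    // Splits precomposed syllables into their individual letter parts (NFD).
    static String decompose(String syllables) {
        return Normalizer.normalize(syllables, Normalizer.Form.NFD);
    }

    // Joins letter parts back into precomposed syllables (NFC).
    static String compose(String parts) {
        return Normalizer.normalize(parts, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String syllable = "\uD55C"; // the syllable "han"
        String parts = decompose(syllable);
        System.out.println("parts: " + parts.length());
        for (char part : parts.toCharArray()) {
            System.out.printf("U+%04X%n", (int) part);
        }
        System.out.println("round trip ok: " + compose(parts).equals(syllable));
    }
}
```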

14 Writing order
Latin – left to right. In Chinese and Japanese, the traditional writing order is top-down, with columns running right-to-left. Nowadays this is often adjusted to ordinary left-to-right rows. In Arabic and Hebrew, the text itself is written right-to-left, but all Latin names (like yours, probably) are written left-to-right in the middle of the right-to-left text.
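Java can tell you whether a piece of text mixes directions through the java.text.Bidi class. The following is a small sketch; the class name and the sample string (a Hebrew word followed by a Latin name) are chosen for the illustration.

```java
import java.text.Bidi;

public class MixedDirection {
    // Analyses the text, taking the base direction from its first strongly
    // directional character (left-to-right if there is none).
    static Bidi analyse(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
    }

    public static void main(String[] args) {
        // Hebrew "shalom" followed by a Latin name.
        Bidi bidi = analyse("\u05E9\u05DC\u05D5\u05DD Jyrki");
        System.out.println("base direction is left-to-right: " + bidi.baseIsLeftToRight());
        System.out.println("mixed directions: " + bidi.isMixed());
        for (int run = 0; run < bidi.getRunCount(); run++) {
            System.out.println("run " + run + ": chars " + bidi.getRunStart(run)
                + ".." + bidi.getRunLimit(run) + ", level " + bidi.getRunLevel(run));
        }
    }
}
```

Odd run levels mean right-to-left, even levels left-to-right. Actual rendering is normally left to the GUI toolkit.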

15 Character properties
Don’t do:
if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) // ch is a letter
In Java, char represents a Unicode character. You can use class Character to check for things such as white space, digits, upper and lower case. E.g.: Character.isDigit(ch), Character.isLetter(ch), Character.isLowerCase(ch). You can also use Character.getType() and predefined constants to check things like:
if (Character.getType('a') == Character.LOWERCASE_LETTER)
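To see why the range check is a problem, this sketch compares it with the Character methods on a few non-ASCII characters. The class name and the helper asciiOnlyIsLetter are invented for the illustration.

```java
public class LetterCheck {
    // The range check from the slide: only recognises unaccented A-Z and a-z.
    static boolean asciiOnlyIsLetter(char ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    }

    public static void main(String[] args) {
        // 'a', a-umlaut, O-umlaut, the digit 7 and an Arabic-Indic digit three.
        char[] samples = {'a', '\u00e4', '\u00d6', '7', '\u0663'};
        for (char ch : samples) {
            System.out.printf("U+%04X range check: %b, isLetter: %b, isDigit: %b, isLowerCase: %b%n",
                (int) ch, asciiOnlyIsLetter(ch), Character.isLetter(ch),
                Character.isDigit(ch), Character.isLowerCase(ch));
        }
    }
}
```

The range check rejects the Finnish letters, while Character.isLetter accepts them.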

16 Comparing characters and strings
You can use the Collator class, e.g.:
Collator myCollator = Collator.getInstance();
if (myCollator.compare("abc", "ABC") < 0)
    System.out.println("abc is less than ABC");
else
    System.out.println("abc is greater than or equal to ABC");
getInstance() also accepts a locale as a parameter. You can customise the rules used in the comparisons.
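A slightly fuller sketch, with an explicit locale and a comparison strength. The class and method names are invented for the illustration. It also shows how the result differs from plain String.compareTo, which compares character codes rather than following language rules.

```java
import java.text.Collator;
import java.util.Locale;

public class CollationDemo {
    // Compares two strings with the collation rules of a locale at a given strength.
    static int compare(String a, String b, Locale locale, int strength) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(strength);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        // TERTIARY strength takes case into account; PRIMARY ignores it.
        System.out.println(compare("abc", "ABC", Locale.US, Collator.TERTIARY));
        System.out.println(compare("abc", "ABC", Locale.US, Collator.PRIMARY));
        // Plain code-point comparison puts upper case first, so this is positive.
        System.out.println("abc".compareTo("ABC"));
    }
}
```

A Collator also implements Comparator, so it can be passed directly to sorting methods.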

17 Finding boundaries of words, sentences, etc.
The boundaries may, of course, be defined differently in different languages. Initialise a BreakIterator with one of these methods:
- getCharacterInstance
- getWordInstance
- getSentenceInstance
- getLineInstance
E.g.
BreakIterator sentenceIterator = BreakIterator.getSentenceInstance(currentLocale);
One BreakIterator only works with one type of break.
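Once created, the iterator is given a text and then walked from boundary to boundary. A minimal sketch of splitting a text into sentences follows; the class name, the method name and the sample text are chosen for the illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SentenceSplitter {
    // Returns the sentences of the text according to the locale's boundary rules.
    static List<String> sentences(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getSentenceInstance(locale);
        iterator.setText(text);
        List<String> result = new ArrayList<>();
        int start = iterator.first();
        // next() returns the following boundary, or DONE when the text is exhausted.
        for (int end = iterator.next(); end != BreakIterator.DONE;
                start = end, end = iterator.next()) {
            result.add(text.substring(start, end).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String sentence : sentences("Hello world. How are you? Fine.", Locale.US)) {
            System.out.println(sentence);
        }
    }
}
```

The same loop works for words, lines or characters if the iterator is created with the corresponding factory method.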

18 Colors, gestures, other symbols
E.g. in the Far East there is a lot of symbolism in colors, names, numbers, etc. (e.g. red is a good color, 4 is a bad number). Also, hand gestures, for instance, vary from one place to another – what is good here may be bad elsewhere. Even in Europe there is variance. Consider tick marks: x (good here, bad in UK), √ (not exactly like this; good in UK, bad here).

19 Higher cultural issues
- General customs
- How to do business
- How to be polite
- How to say no
- How to avoid ”losing face” in the Far East
- What to avoid in particular
These issues may have an impact on software as well.

20 Conclusions
The final conclusion is: ”This is all quite complicated, and if you have to get deeper into these things, find someone who really knows.” When you start writing your software, think a bit about the need for globalisation. If you know that English (or Finnish) is sufficient, then it makes life easier. If you know that globalisation is needed, you should start globalising when you start writing your software! Java offers lots of resources. If you want to reinvent the wheel, this may not be the best place.