Presentation is loading. Please wait.

Presentation is loading. Please wait.

Globalisation & Computer systems Week 4 writing systems and their implications for globalisation character representation ASCII extended ASCII code pages.

Similar presentations


Presentation on theme: "Globalisation & Computer systems Week 4 writing systems and their implications for globalisation character representation ASCII extended ASCII code pages."— Presentation transcript:

1 Globalisation & Computer systems Week 4 writing systems and their implications for globalisation character representation ASCII extended ASCII code pages Practical: code pages in VB

2 Week 6 Writing systems and their implication for globalisation Directionality (Arabic, Hebrew) Code space: Chinese Context sensitive characters: Arabic Compositionality (Amharic)

3 Representation bits and bytes characters code points glyphs fonts standardization

4 Representation What is a bit? ‘a binary digit’, i.e either 0 or 1 What is a byte? ‘the fixed no. of bits that can be treated as a unit by the computer hardware’ A byte can be used to express a character such as “A”

5 Representation ASCII: American standard code for information interchange A standard character encoding system The bytes were originally 7-bits Given this, how many bit patterns? Each pattern maps onto a decimal code point, and that maps onto a character

6 Representation Glyphs the pictures used to represent a given character; many to one: The character “A” -> A A AA A A A A A

7 Representation Glyphs the pictures used to represent a given pictures used to represent a given character; many to one: The character “A” -> A A AA A A A A A Fonts the collection, or ‘picture gallery’ of glyphs

8 Representation ASCII: The problem with 7-bit bytes… What about French la tête What about Greek κεφαλη Extend ASCII to 8-bit bytes ISO (International organization for standardization) Now 256 bit-patterns

9 Representation Extended ASCII: With 8-bit bytes you get 256 bit-patterns For consistency, the first 128 code-points remain the same from ISO-7 The next 128 used for a range of languages For each language, you need an interpretation of these 128 code points The encoding is handled by a code page

10 Representation Extended ASCII: For code point 154: CP_EASTEUROPE (code page 1250): š CP_RUSSIAN (code page 1251): љ What about code point 65 for these two code pages? Now represent your names with your own orthographies in mind, using the code pages

11 Representation Code pages in VB Public Enum ValidCharsets ANSI_CHARSET = 0 GREEK_CHARSET = 161 THAI_CHARSET = 222 End Enum Private Sub Form_Load() Dim X As New StdFont X.Charset = 161 X.Bold = True X.Size = 8 X.Name = "Times New Roman" Set frmTest.Font = X Set frmTest.Label1.Font = X Set frmTest.Text1.Font = X frmTest.Label1.Caption = Chr(181) + Chr(225) + Chr(226) frmTest.Text1.Text = Chr(181) + Chr(225) + Chr(226) End Sub

12 Representation and UNICODE What about Chinese? Thousands of characters – 256 bit-patterns clearly not enough

13 Representation and UNICODE What about Chinese? Thousands of characters – 256 bit-patterns clearly not enough Make the bytes bigger… Bytes have 16-bits, which gives 65536 bit- patterns UNICODE

14 UNICODE – design principles Reference: The Unicode Standard, Version 3. 2000. Online: http://www.unicode.org/unicode/uni2book/


Download ppt "Globalisation & Computer systems Week 4 writing systems and their implications for globalisation character representation ASCII extended ASCII code pages."

Similar presentations


Ads by Google