Supporting Complex Scripts (such as Arabic and Hebrew) in your Windows 2000™ Application F. Avery Bishop Senior Program Manager Microsoft Corporation.

1 Supporting Complex Scripts (such as Arabic and Hebrew) in your Windows 2000™ Application F. Avery Bishop Senior Program Manager Microsoft Corporation

2 Agenda:
- Overview of character encoding, Unicode
- Guidelines for supporting complex scripts
- Right-to-left layout of applications
- Multilingual User Interface

3 Overview of Character Encoding and Unicode

4 Why do character set differences matter?
- Historically, they fragmented code bases for both Windows and applications:
  - Single byte: European editions
  - Double byte: Far East editions
  - Bi-directional: Middle East editions
- Make it difficult to share data
- Make it difficult to develop multilingual applications

5 Example: Multiple Hebrew Character Encodings
- 8-bit Hebrew encodings still in use:
  - Windows codepage 1255
  - OEM (DOS) codepage 862
  - Visual Hebrew encodings (many exist)

6 Example: Multiple Arabic Character Encodings
- 8-bit Arabic encodings supported in Internet Explorer 4.0/CS:
  - ASMO-708
  - DOS 720
  - ISO 8859-6
  - Windows Codepage 1256
  - Other proprietary encodings

7 Logical vs Visual Encoding
- Logical:
  - Storage order is same as typing order
  - Allows natural text processing:
    - Search
    - Resizing (e.g., in web pages)
    - IPC: Select, cut & paste
- Visual:
  - Natural text processing difficult or impossible
  - Cannot always map back to logical order
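To make the search point concrete, here is a minimal sketch in C. It assumes a run that is entirely right-to-left (no digits or Latin text), for which visual order is simply the reverse of logical order; real bidirectional text requires the full Unicode BiDi algorithm. The names `to_visual`, `find_units`, `SHALOM_LOGICAL` and `QUERY` are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A UTF-16 code unit; every character used here is in plane 0. */
typedef unsigned short unit_t;

/* Produce the visual (left-to-right on screen) order of a run that is
   entirely right-to-left: the reverse of its logical (typing) order.
   Valid only for a pure RTL run; this is an illustration, not the
   Unicode BiDi algorithm. */
static void to_visual(const unit_t *logical, size_t n, unit_t *visual)
{
    assert(logical != NULL && visual != NULL);
    for (size_t i = 0; i < n; i++)
        visual[i] = logical[n - 1 - i];
}

/* Return the index of the first occurrence of needle in hay, or -1. */
static long find_units(const unit_t *hay, size_t hn,
                       const unit_t *needle, size_t nn)
{
    if (nn == 0 || nn > hn)
        return -1;
    for (size_t i = 0; i + nn <= hn; i++) {
        size_t j = 0;
        while (j < nn && hay[i + j] == needle[j])
            j++;
        if (j == nn)
            return (long)i;
    }
    return -1;
}

/* A Hebrew word as typed: shin, lamed, vav, final mem. */
static const unit_t SHALOM_LOGICAL[4] = { 0x05E9, 0x05DC, 0x05D5, 0x05DD };

/* The user searches for the typed sequence lamed, vav. */
static const unit_t QUERY[2] = { 0x05DC, 0x05D5 };
```

Searching `SHALOM_LOGICAL` for `QUERY` finds the match at index 1, because both are in typing order. The same search against the output of `to_visual` fails, since the stored characters run the other way; a visually encoded document would need the query reversed as well, and that stops working once runs of mixed direction are involved.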

8 What is Unicode?
- A 16-bit character encoding:
  - A mapping of characters to numbers
  - Syntax rules for display of complex scripts
  - Not a font or glyph encoding!
  - Not a sort algorithm!
- Includes all characters in common use in modern scripts (and others)
- Basis for the ISO 10646 character encoding standard
- Native text encoding for Windows NT

9 Unicode™ / ISO 10646
- 16-bit international character encoding
- Windows 2000 uses Unicode version 2.0
[Figure: map of the 16-bit code space from 0x0000 to 0xFFFF. Region labels: Punctuation, Future use, ASCII, Private use, Compatibility, Indian, Greek, Arabic, Hebrew, Latin, Ideographs (Hanzi, Kanji, Hanja), Symbols, Hangul, Kana, Thai. Example code values: 0041 (A), 9662, FF96, 4F85, 0000 (null).]

10 Relatives of Unicode
- ISO/IEC 10646:
  - 32-bit ISO standard of 64K x 64K “planes”
  - Unicode repertoire is plane 0
- UTF-7:
  - 7-bit transformation format
  - Not widely used
- UTF-8:
  - 8-bit transformation format
  - Used in web pages and some email
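As an illustration of what "8-bit transformation format" means, the following sketch encodes a single plane 0 code point as UTF-8. The function name `utf8_encode_bmp` is hypothetical; the bit layout is the standard UTF-8 one for code points up to 0xFFFF, and code points beyond plane 0 (which need four bytes) are deliberately out of scope.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one 16-bit Unicode code point (plane 0) as UTF-8.
   Writes 1 to 3 bytes into out and returns the number written.
   Surrogate code units (0xD800..0xDFFF) are not characters on their
   own, so the caller must not pass them. */
static size_t utf8_encode_bmp(unsigned int cp, unsigned char out[3])
{
    assert(cp <= 0xFFFF);
    assert(cp < 0xD800 || cp > 0xDFFF);

    if (cp < 0x80) {                 /* 0xxxxxxx: ASCII is unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}
```

With this layout, ASCII text is byte-for-byte identical in UTF-8, Hebrew and Arabic letters take two bytes each, and ideographs take three.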

11 Unicode in Win32: the W and A Entry Points
- Two kinds of window classes: Unicode, ANSI
- Win32 API has two versions of most functions:
  - “W” (wide) version handles Unicode
  - “A” (ANSI) version assumes the system default code page (character encoding)

12 Unicode in Win32 …
- Macros resolve to W or A entry point
- Example: Macro for RegisterClassEx
    #ifdef UNICODE
    #define RegisterClassEx RegisterClassExW
    #else
    #define RegisterClassEx RegisterClassExA
    #endif
- To create Unicode application:
  - Compile with -DUNICODE, or
  - Use W routines explicitly
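The same compile-time switch can be applied to an application's own helpers. The sketch below is portable C rather than Win32 code, and every name in it (`AppTextLengthA`, `AppTextLengthW`, `AppTextLength`, `APPCHAR`, `APP_TEXT`, `greeting_length`) is hypothetical; it only mirrors the A/W naming convention shown for RegisterClassEx.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical application helpers, one per character type,
   named after the Win32 A/W convention. */
static size_t AppTextLengthA(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

static size_t AppTextLengthW(const wchar_t *s)
{
    assert(s != NULL);
    return wcslen(s);
}

/* Pick the character type, literal prefix and entry point at compile time. */
#ifdef UNICODE
typedef wchar_t APPCHAR;
#define APP_TEXT(s)   L##s
#define AppTextLength AppTextLengthW
#else
typedef char APPCHAR;
#define APP_TEXT(s)   s
#define AppTextLength AppTextLengthA
#endif

/* Callers use only the generic names, so this compiles either way. */
static size_t greeting_length(void)
{
    const APPCHAR *greeting = APP_TEXT("shalom");
    return AppTextLength(greeting);
}
```

Building with -DUNICODE routes `greeting_length` through the wide version; building without it routes through the narrow version. The caller's source does not change.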

13 For Applications that Must Also Run on Windows 98…
- Use Unicode everywhere with single binary, two code paths:
  - On Windows NT use W entry points
  - On Windows 98, convert Unicode to ANSI, use A entry points
  - See sample GLOBALDV for example
- See April Microsoft Systems Journal for details and other options

14 Summary: Use Unicode if you can!
- Represent all text with one unambiguous encoding
- Support multilingual text easily
- Avoid special processing for variable-byte-length characters
- Use standard encoding recognized throughout the industry and the world
- Support new scripts that are only supported through Unicode

15 Guidelines for Supporting Complex Scripts in Applications

16 1. Displaying Complex Scripts in Plain-text
- In Win32 apps use standard edit control
- Use standard Win32 API display functions:
  - Win32 APIs: ExtTextOutW or DrawTextW
  - ScriptString API in Uniscribe

17 Pitfalls in Enabling for Complex Scripts
- When displaying typed text:
  - Do not output characters one by one!
  - Do save text in a buffer and display the whole string with Uniscribe or Win32 API
- To measure line lengths:
  - Do not sum cached character widths
  - Do use a GetTextExtent function or Uniscribe

18 2. Displaying Complex Scripts in Simple Formatted Text
- In Win32 applications use rich edit control
- In web pages for Internet Explorer 5.0, use Document Object Model

19 3. Displaying CS in Text with Advanced Formatting and Layout
- Use script APIs (“Uniscribe”)
- See MSJ article of November 1998

20 Overview of Uniscribe
- Background and Purpose of Uniscribe
- Low level APIs
- High level APIs
- For details see November 1998 MSJ article

21 The Uniscribe DLL: USP10.DLL
- Platforms:
  - Windows 2000
  - Windows NT 4
  - Windows 98
  - Windows 95 (excluding Far East)
- Single worldwide binary
- Installs with Windows 2000, IE5, Office 2000

22 Hides language details
- Syllable structure (Indian, Thai)
- Contextual shaping (Arabic, Indic)
- Caret placement (all)
- Wordbreak (Thai)
- National digits (Arabic, Indic, Thai)
- Bidirectional layout (Arabic, Hebrew)

23 Hides Unicode OS details
- APIs are Unicode on all platforms
- Hides glyph codes
- Hides font differences:
  - Shaping tables
  - Fixed repertoire fonts

24 Uniscribe Structure
[Diagram. Labels: Client (Measurer, Renderer; Display, Caret, Mouse); Uniscribe (Itemize; Shape, Place and TextOut; Layout; Justify; XtoCP & CPtoX; Unicode BiDi algorithm); shaping engines (Arabic, Hebrew, Hindi, Tamil, Thai, Vietnamese); GDI (ExtTextOut with ETO_GLYPH_INDEX, GetCharABCWidthsI, GetGlyphOutline); CMAP & width tables, OpenType library.]

25 Shaping engines
- Per script
- Understand language rules
- Understand font features:
  - OpenType provides full control
  - Many older fixed layout fonts

26 [Diagram. Labels: Application, USER, GDI, LPK.DLL, Uniscribe.]

27 Low level APIs Support
- Formatting text:
  - Style runs
  - Measurement
  - Paragraph filling
  - Rendering
- Information needed for font fallback

28 Summary
- Script…
  - Itemize
  - Shape, Place
  - Break, Layout
  - TextOut
  - CPtoX, XtoCP

29 High level APIs
- Purpose
- Analysis
- Display
- Font fallback

30 Purpose
- For Windows 2000:
  - ExtTextOut
  - DrawText
  - System edit control
- Cross-platform Unicode plaintext display
- Easier than low level APIs

31 Summary of ScriptString APIs:
- ScriptString…
  - Analyse
  - … query analysis...
  - Out
  - Free
- Provides simple font fallback

32 Implementing Right-to-left Layout in Applications

33 Background On RTL Layout (“Mirroring”) For BiDi Localization
- Localized Arabic and Hebrew Windows® is laid out from Right to Left
- In the past was done “ad hoc” or not at all
- Windows 2000 and BiDi Windows 98 include mechanisms to “automatically” mirror shell and applications
- Also helpful for multilingual user interface support

34 Mirroring in System Based on Coordinate Transformation
- Origin (0,0) in upper RIGHT corner of window
- X scale factor = -1, x values increase from right to left
[Figure: Default (LTR) window with its origin and direction of increasing x; Mirrored (RTL) window with its origin and direction of increasing x.]
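The coordinate transformation can be sketched in portable C. This is a simplified model rather than Windows code: it assumes a client area `width` pixels wide, with physical columns counted from the left edge, and all names (`physical_x`, `span_t`, `physical_span`) are hypothetical.

```c
#include <assert.h>

/* Map a logical x coordinate to a physical column counted from the
   left edge of a client area that is `width` pixels wide. In a
   mirrored window logical x = 0 is the rightmost column. */
static int physical_x(int logical_x, int width, int mirrored)
{
    assert(width > 0);
    return mirrored ? (width - 1 - logical_x) : logical_x;
}

/* A horizontal extent, half-open: covers columns left .. right - 1. */
typedef struct {
    int left;
    int right;
} span_t;

/* Map a horizontal span. In a mirrored window the two edges trade
   places, so the result is rebuilt to keep left <= right. */
static span_t physical_span(span_t logical, int width, int mirrored)
{
    span_t out = logical;
    assert(width > 0 && logical.left <= logical.right);
    if (mirrored) {
        out.left  = width - logical.right;
        out.right = width - logical.left;
    }
    return out;
}
```

In a 200-pixel-wide mirrored window, logical x = 0 lands on physical column 199, and the logical span [10, 50) lands on the physical span [150, 190) with its width of 40 unchanged. The edge swap in `physical_span` shows why a rectangle should be mapped as a whole: converting its two corners one point at a time would leave left greater than right.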

35 More Background on Mirroring…
- Developers use programming interfaces and Windows style bits
- Automatic inheritance of RTL property:
  - Child window of RTL window defaults to RTL
  - You can disable inheritance of RTL Property
- APIs provided to disable mirroring of bitmaps

36 Implementing Mirroring in Win32 Applications: Standard Windows
- Use SetProcessDefaultLayout:
  - Affects all windows created thereafter
  - SetProcessDefaultLayout(LAYOUT_RTL) ;
  - SetProcessDefaultLayout(0) ; // Reset to LTR
- Or call CreateWindowEx:
  - Use extended style WS_EX_LAYOUTRTL
  - To inhibit mirroring in child windows, also set WS_EX_NOINHERITLAYOUT

37 Changing Layout of Existing Window
    BOOL IsRTLLayout ; // TRUE iff window is to be mirrored
    //... Get new value of IsRTLLayout
    LONG lExStyles = GetWindowLongA(hWnd, GWL_EXSTYLE) ;
    // Check whether new layout is opposite current layout
    if(!!(IsRTLLayout) != !!(lExStyles & WS_EX_LAYOUTRTL)){
        lExStyles ^= WS_EX_LAYOUTRTL ; // Toggle layout
        // Set extended styles to new value
        SetWindowLongA(hWnd, GWL_EXSTYLE, lExStyles) ;
        // Update client area
        InvalidateRect(hWnd, NULL, TRUE) ;
    }

38 Controlling Mirroring of a Device Context
- SetLayout(HDC hDc, DWORD dwLayout)
    dwLayout = 0 ; // will lay out LTR
    dwLayout = LAYOUT_RTL ; // will lay out RTL
    dwLayout = LAYOUT_RTL | LAYOUT_BITMAPORIENTATIONPRESERVED ; // will lay out RTL, but not bitmaps
- DWORD GetLayout(HDC hDc)
    Returns the layout settings for an hDc

39 Mirroring in Win32 Applications: Dialogs
- Set WS_EX_LAYOUTRTL in dialog template
- Visual Studio 6 Dialog editor:
  - Has option for RTL layout
  - BUG in Visual Studio 6:
    - Writes WS_EX_LAYOUT_RTL to RC file!
    - Must correct RC file by hand to compile
    - Will be fixed in future version

40 Mirroring in Win32 Applications: Message Boxes
- Set MB_RTLLAYOUT option bit

41 Guidelines for using RTL Layout
- Using coordinates:
  - Use GetWindowRect with care
  - Use client, rather than screen coordinates
  - Do not mix screen coordinates and client coordinates
  - Use MapWindowPoints to map rectangles, instead of ClientToScreen and ScreenToClient
- Windows 95 does not support mirroring!

42 Implementing Multi-language User Interface in Applications

43 Guidelines for Multilanguage User Interface
- Initialize to current UI language:
  - Windows 2000: GetUserDefaultUILanguage()
  - Others: Use the language of the O/S
  - See function InitUiLang in Globaldev sample code

44 Guidelines for Multilanguage User Interface
- Allow user to select UI language:
  - Put language-dependent resources in resource DLLs
  - Use naming convention, e.g., res.dll
  - Find all resource DLLs, put up list box of choices
- See module UPDTLANG.CPP in Globaldev Sample

45 Summary
- Use Unicode to encode if you can
- Use controls to display text and accept user input
- Use Uniscribe for advanced formatting
- Use new RTL layout API for applications localized to RTL languages
- Consider multilingual user interface

48 Further Information and Resources
- http://www.microsoft.com/globaldev (Watch for updates!)
- MSJ articles, e.g.,
  - Uniscribe: http://www.microsoft.com/msj/1198/multilang/multilangtop.htm
  - Multilingual UI: http://www.microsoft.com/msj/0499/multilangUnicode/multilangUnicodetop.htm
- Send suggestions to nlshelp@microsoft.com

