
1 Localizing OpenClinica
Hiroaki Honshuku: SQA

2 What is Character Encoding?
- Morse Code (1840) → Latin Alphabet
- ASCII (1963)
  - The American Standard Code for Information Interchange
  - Characters, Numerals, Symbols, Control Characters
  - 7-bit: 0-127
  - 0x41 = letter ‘A’, 0x61 = letter ‘a’
- ISO-8859-n
  - 8-bit: 0-255
  - iso-8859-1: Latin-1, covers most Western European languages
  - iso-8859-5: Cyrillic alphabet
  - No CJK (Chinese, Japanese, Korean) support
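
The ranges on this slide can be checked directly. The following is a minimal sketch using Python's built-in codecs; the helper `can_encode` is a hypothetical name introduced only for illustration.

```python
# ASCII is a 7-bit code: every character maps to a value in 0..127.
assert ord("A") == 0x41 and ord("a") == 0x61

ascii_bytes = "OpenClinica".encode("ascii")
print(max(ascii_bytes) < 128)  # True when every byte fits in 7 bits

# ISO-8859-1 (Latin-1) uses all 8 bits, which adds accented letters.
latin1_bytes = "é".encode("iso-8859-1")
print(latin1_bytes.hex())


def can_encode(text: str, encoding: str) -> bool:
    """Hypothetical helper: report whether `text` fits in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Neither ASCII nor the ISO-8859 family has room for CJK characters.
print(can_encode("é", "ascii"))
print(can_encode("Ж", "iso-8859-5"))
print(can_encode("試", "iso-8859-1"))
print(can_encode("試", "iso-8859-5"))
```

Each ISO-8859 part covers only its own script, which is why a single 8-bit encoding cannot serve a multilingual study.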

3 What is Character Encoding (cont.)
- iso-8859-1 versus iso-8859-5

  iso-8859-1        iso-8859-5
  A  0x41 (65)      А  0xB0 (176)
  B  0x42 (66)      В  0xB2 (178)

4 What is Character Encoding (cont.)
- iso-8859-1 versus iso-8859-5

  iso-8859-1        iso-8859-5
  A  0x41 (65)      А  0xB0 (176)
  B  0x42 (66)      В  0xB2 (178)

- CJK Encoding Mess
  - Chinese: Big5 (Traditional), GB18030 (Simplified)
  - Japanese: iso-2022-JP, EUC-JP, Shift-JIS
  - Korean: EUC-KR, KS C 5861

5 What is Character Encoding (cont.)
- iso-8859-1 versus iso-8859-5

  iso-8859-1        iso-8859-5
  A  0x41 (65)      А  0xB0 (176)
  B  0x42 (66)      В  0xB2 (178)

- CJK Encoding Mess
  - Chinese: Big5 (Traditional), GB18030 (Simplified)
  - Japanese: iso-2022-JP, EUC-JP, Shift-JIS
  - Korean: EUC-KR, KS C 5861
- Windows proprietary Encoding
  - CP1252, CP932, etc.
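
To make the "mess" concrete, here is a small sketch showing that a byte has no meaning until an encoding is chosen. It uses byte values 176 and 178 (0xB0 and 0xB2), one Japanese character, and one Windows-1252 byte; the sample values are chosen for illustration and are not taken from OpenClinica.

```python
import unicodedata

# The same two bytes read under two ISO-8859 parts.
raw = bytes([0xB0, 0xB2])
as_latin1 = raw.decode("iso-8859-1")    # degree sign and superscript two
as_cyrillic = raw.decode("iso-8859-5")  # Cyrillic capital A and Ve
print(as_latin1, as_cyrillic)

# One Japanese character, three incompatible byte sequences.
kanji = "試"
japanese = {
    name: kanji.encode(name)
    for name in ("iso2022_jp", "euc_jp", "shift_jis")
}
for name, data in japanese.items():
    print(name, data.hex())

# Windows-1252 assigns printable characters to 0x80-0x9F,
# where ISO-8859-1 only has control codes.
byte_93 = bytes([0x93])
cp1252_char = byte_93.decode("cp1252")      # left curly double quote
latin1_char = byte_93.decode("iso-8859-1")  # a C1 control character
print(unicodedata.name(cp1252_char))
print(unicodedata.category(latin1_char))
```

Text that round-trips through the wrong codec in any of these pairs comes back as different characters, with no error raised.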

6 Unicode
- 1987: Apple + Xerox
- 1991: Unicode Consortium

7 Unicode
- 1987: Apple + Xerox
- 1991: Unicode Consortium
- UTF-8: 1,112,064 Code Points
  - Standard
  - ASCII Compatible
  - Unix, Linux, Mac OS
  - Fixed byte order (most significant bits first), no BOM required

8 Unicode
- 1987: Apple + Xerox
- 1991: Unicode Consortium
- UTF-8: 1,112,064 Code Points
  - Standard
  - ASCII Compatible
  - Unix, Linux, Mac OS
  - Fixed byte order (most significant bits first), no BOM required
- UTF-16 (successor to UCS-2): 1,112,064 Code Points
  - Mainly Windows
  - Little Endian on Windows: needs a BOM (Byte Order Mark) to signal the byte order
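
A short sketch of how the two Unicode encodings differ on the wire, using the character U+8A66 that appears later in this deck. It relies only on Python's standard codecs.

```python
import codecs

text = "試"  # U+8A66

utf8 = text.encode("utf-8")
utf16_be = text.encode("utf-16-be")
utf16_le = text.encode("utf-16-le")
print(utf8.hex())      # three bytes, same on every platform
print(utf16_be.hex())  # big-endian code unit
print(utf16_le.hex())  # little-endian code unit: the two bytes are swapped

# UTF-8 is ASCII compatible: ASCII text encodes to the same bytes.
assert "A".encode("utf-8") == "A".encode("ascii")

# A BOM at the start of a UTF-16 stream tells the reader which order was used.
with_bom = codecs.BOM_UTF16_LE + utf16_le
decoded = with_bom.decode("utf-16")  # the generic codec consumes the BOM

# 1,112,064 = all code points up to U+10FFFF minus the 2,048 surrogates.
scalar_values = 0x110000 - 0x800
print(scalar_values)
```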

9 OpenClinica and i18n
- i18n Support since 3.1.3
- OpenClinica i18n Work in Progress
  - Data Mart
  - Response Option Text
  - CRF Name
  - Discrepancy Note data passing
  - Escaping Ctrl Chars and MS Proprietary Chars
    - Should detect at CRF upload
  - Hard-coded strings
  - Missing encoding declaration in some Export formats
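
As an illustration of the "should detect at CRF upload" item, the sketch below scans text for control characters and for the characters Windows-1252 adds in 0x80-0x9F (curly quotes, long dashes and similar). This is not OpenClinica code: `find_suspect_chars` is a hypothetical function, and treating "MS Proprietary Chars" as the Windows-1252 extras is an assumption.

```python
import unicodedata


def _cp1252_extras() -> set:
    """Characters Windows-1252 maps into 0x80-0x9F (assumed 'MS chars')."""
    extras = set()
    for value in range(0x80, 0xA0):
        try:
            extras.add(bytes([value]).decode("cp1252"))
        except UnicodeDecodeError:
            pass  # a few positions are unassigned in cp1252
    return extras


CP1252_EXTRAS = _cp1252_extras()


def find_suspect_chars(text: str) -> list:
    """Hypothetical check: return (index, char, reason) for flagged characters."""
    findings = []
    for index, ch in enumerate(text):
        if unicodedata.category(ch) == "Cc" and ch not in "\t\n\r":
            findings.append((index, ch, "control"))
        elif ch in CP1252_EXTRAS:
            findings.append((index, ch, "cp1252"))
    return findings


print(find_suspect_chars("Weight (kg)"))
print(find_suspect_chars("\u201cWeight\u201d\x07"))
```

Reporting positions rather than silently stripping characters lets the person uploading the CRF fix the source spreadsheet.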

10 Microsoft Specific Issues
- Display issues on Windows
  - Pre-Win7, GUI was not fully UTF-8 compatible
  - Displayed character corruption after saving data
- Viewing extracted data
  - Use UTF-8 compatible Text Editor
  - Never Copy/Paste from MS Office
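
The corruption seen when extracted data is opened in a non-UTF-8 editor can be reproduced in a few lines. This sketch assumes the editor falls back to Windows-1252, which is a common default but is not stated in the slides.

```python
original = "試"

# The extract file contains UTF-8 bytes...
file_bytes = original.encode("utf-8")

# ...but the viewer interprets each byte as a Windows-1252 character.
garbled = file_bytes.decode("cp1252")
print(garbled)

# The bytes on disk are still intact: reading them as UTF-8 recovers the text.
recovered = garbled.encode("cp1252").decode("utf-8")
print(recovered == original)
```

The data itself is not damaged at this point; the damage becomes permanent only if the garbled view is saved again.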

11 Demonstration
- Search Subjects and Tables
- CRF and Data Entry
- Discrepancy Notes
- Rules
- Data Import
- Data Extract

12 How to Localize
- Documentation
  - https://docs.openclinica.com/3.1/technical-documents/openclinica-and-internationalization
- UTF-8 Converter
  - i18n strings need to be hex values (\uXXXX escapes)
  - http://www.branah.com/unicode-converter
  - Calendar Widget can take UTF-8 strings
- Pseudo Translation
  - Insert one distinctive non-ASCII character
  - Duplicate English properties files first
  - Search " = " and replace with " = \u8a66"
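
Both the hex conversion and the pseudo translation can be scripted instead of done by hand. The sketch below is one possible approach; `to_unicode_escapes` and `pseudo_translate` are hypothetical helper names, and the sample key `submit` is made up for the example.

```python
def to_unicode_escapes(text: str) -> str:
    """Convert non-ASCII characters to \\uXXXX escapes for a properties file."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)
        elif code <= 0xFFFF:
            parts.append(f"\\u{code:04x}")
        else:
            # Characters above U+FFFF are written as a surrogate pair.
            code -= 0x10000
            high = 0xD800 + (code >> 10)
            low = 0xDC00 + (code & 0x3FF)
            parts.append(f"\\u{high:04x}\\u{low:04x}")
    return "".join(parts)


def pseudo_translate(line: str, marker: str = "\\u8a66") -> str:
    """Prefix the value of a 'key = value' line with a distinctive marker."""
    return line.replace(" = ", " = " + marker, 1)


print(to_unicode_escapes("試"))
print(pseudo_translate("submit = Submit"))
```

After a pseudo translation, any GUI string that still appears without the marker character is either hard-coded or read from a file that was missed.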

13 How to Localize (cont.)
1. Duplicate English properties files
   - Exclude licensing.properties

14 How to Localize (cont.)
1. Duplicate English properties files
   - Exclude licensing.properties
2. Rename duplicated files to your Locale
   NO

15 How to Localize (cont.)
1. Duplicate English properties files
   - Exclude licensing.properties
2. Rename duplicated files to your Locale
3. Date Format
   - Edit format.properties file

16 How to Localize (cont.)
1. Duplicate English properties files
   - Exclude licensing.properties
2. Rename duplicated files to your Locale
3. Date Format
   - Edit format.properties file
4. Translate per GUI page
   - Avoids possible legacy strings
   - Use Text Editor that supports global search
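
Steps 1 and 2 can be sketched as a small script. It assumes the English files are named `<base>.properties` and that the localized copies follow the Java resource bundle convention `<base>_<locale>.properties`; the function name, directory arguments and locale code are placeholders, so check the documentation for the actual layout in your OpenClinica version.

```python
import shutil
from pathlib import Path


def duplicate_for_locale(src_dir, dst_dir, locale,
                         exclude=("licensing.properties",)):
    """Copy each English properties file to a locale-suffixed name.

    Hypothetical helper: file naming and directories are assumptions.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for src in sorted(src_dir.glob("*.properties")):
        if src.name in exclude:
            continue  # step 1: licensing.properties is not translated
        dst = dst_dir / f"{src.stem}_{locale}.properties"  # step 2: rename
        shutil.copyfile(src, dst)
        created.append(dst.name)
    return created
```

Called as `duplicate_for_locale("english", "localized", "ja")`, this leaves the English files untouched and returns the names of the copies, which can then be edited page by page starting with the date formats.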

17 Thank You!

