Multilanguage -Internationalization -The language is not enough Michio Kimura M.D. Ph.D. Director and Professor of Medical Informatics Department Hamamatsu.

Presentation transcript:

Multilanguage - Internationalization - The language is not enough
Michio Kimura, M.D., Ph.D.
Director and Professor of Medical Informatics Department, Hamamatsu University School of Medicine
JIRA DICOM Committee Advisory

Three types of representation: We have 2 patient names in HIS
- Alphabetic
- Ideographic
- Phonetic
  - Ideographic names
    - have many ways to pronounce
    - are difficult to sort
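The three representations can be carried together, with the phonetic one used as the sort key. The following is a minimal sketch, assuming a hypothetical `PatientName` record and made-up sample names. The `=`-separated layout mirrors the DICOM person-name convention of alphabetic, ideographic, and phonetic component groups, but the class itself is only an illustration, not the data model of any particular HIS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientName:
    """Hypothetical record holding one name in three representations."""
    alphabetic: str   # romanized form
    ideographic: str  # kanji form
    phonetic: str     # kana reading

    def as_component_groups(self) -> str:
        # Join the three groups with "=", family and given name with "^".
        return "=".join(
            part.replace(" ", "^")
            for part in (self.alphabetic, self.ideographic, self.phonetic)
        )


# Made-up sample patients (assumption: not taken from the slides).
patients = [
    PatientName("Yamada Tarou", "山田 太郎", "やまだ たろう"),
    PatientName("Aoki Hanako", "青木 花子", "あおき はなこ"),
    PatientName("Higashi Ichirou", "東 一郎", "ひがし いちろう"),
]

# Sorting on the ideographic form orders by code point, not by reading.
by_ideograph = sorted(patients, key=lambda p: p.ideographic)
# Sorting on the phonetic form follows kana order, which matches the reading.
by_phonetic = sorted(patients, key=lambda p: p.phonetic)

print([p.alphabetic for p in by_ideograph])
print([p.alphabetic for p in by_phonetic])
print(patients[0].as_component_groups())
```

Keeping the phonetic form as a separate field is what makes a reading-based sort possible, since the reading cannot be derived reliably from the ideographs alone.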

Multi-Byte Character Codes in Use in Asia
- Korea: KS X 1001, and 1001 annex 3
  - Hangul (phonetic) and ideographs
- China (PR): GB
- Taiwan (ROC): CNS 11643, and Big-5
- Japan: JIS X
  - Katakana, Hiragana (phonetic) and ideographs
  - Junior school pupils must read/write 810 letters.
- Varieties: 6879 (JIS) to 48711 (CNS)

Byte-wise Representation of ISO 2022

RFC 1468: Japanese Character Encoding for Internet Messages
- ISO-2022-JP
- Within 7-bit, safe for most nodes
- Every line starts/ends with ASCII
  - No carryover shifting
- ISO-2022-KR is also used in Korea
- Same method is in DICOM (Supplement 9), and HL7 v.2.3.1
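The properties listed for ISO-2022-JP can be checked directly. The following is a small sketch using Python's built-in `iso2022_jp` codec; the sample line and the constant names are assumptions chosen for illustration.

```python
ESC = b"\x1b"
TO_JIS_X_0208 = ESC + b"$B"  # escape sequence: switch to two-byte JIS X 0208
TO_ASCII = ESC + b"(B"       # escape sequence: switch back to one-byte ASCII

# Sample line mixing ASCII and Japanese (assumption: any text would do).
line = "Name: 山田"
encoded = line.encode("iso2022_jp")

print(encoded)

# Every byte fits in 7 bits, so 7-bit-only nodes pass it unchanged.
is_seven_bit = all(byte < 0x80 for byte in encoded)

# The line begins in ASCII and returns to ASCII before it ends,
# so no shift state carries over to the next line.
starts_in_ascii = encoded.startswith(b"Name: ")
ends_in_ascii = encoded.endswith(TO_ASCII)

print(is_seven_bit, starts_in_ascii, ends_in_ascii)
```

Because the state is reset at the end of each line, a receiver can decode any single line without having seen the lines before it.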

UNICODE: ISO 10646
- “Allocating 2 bytes for every character, UNICODE can represent every character in the world without any status nor shifting technique.”
- 16 bits = 65,536
  - -> CJK unified ideographs
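The arithmetic behind the 16-bit claim, and the contrast with a shifting encoding, can be sketched as follows. The sample character and the variable names are assumptions for illustration; the repertoire figure is the one quoted in the slides.

```python
# Number of distinct values a fixed 16-bit code unit can hold.
CODE_SPACE = 2 ** 16
print(CODE_SPACE)

# Largest repertoire quoted in the slides (CNS) versus the 16-bit space.
CNS_REPERTOIRE = 48711
print(CNS_REPERTOIRE < CODE_SPACE)

text = "骨"  # sample ideograph ("bone"), chosen for illustration
utf16 = text.encode("utf-16-be")
iso2022 = text.encode("iso2022_jp")

# UTF-16 stores this character as one 2-byte unit with no escape sequence,
# while ISO-2022-JP surrounds its 2 bytes with character-set escapes.
print(len(utf16), b"\x1b" in utf16)
print(len(iso2022), b"\x1b" in iso2022)
```

The 2-bytes-per-character picture holds for characters inside the 16-bit range; UTF-16 represents characters beyond that range with two 16-bit units.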

CJK Unified Ideographs

Why do we not use UNICODE as a message format? (I know it is used internally, but we do not like it going outside as a message format.)
- If the Chinese “Bone” and our “Bone” are to be recognized as the same because of symmetry, how about using these?
- The UNICODE consortium says “Introduction of Language information”.
  - We cannot write a “Chinese language textbook written in Japanese”.
  - We cannot accommodate Koreans living in Japan, whose names should properly be written in Korean letters but whose addresses are, of course, in Japanese.
  - Original UNICODE dream is gone.
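The unification problem behind the “Bone” example can be made concrete. The following is a minimal sketch, assuming Python's built-in `euc_jp` and `gb2312` codecs as stand-ins for the Japanese and Chinese national character sets; the variable names are illustrative.

```python
# The ideograph for "bone", used here because the slide cites it.
BONE = "骨"

# Encode with a Japanese and a Chinese national codec.
japanese_bytes = BONE.encode("euc_jp")   # JIS X 0208 repertoire
chinese_bytes = BONE.encode("gb2312")    # GB 2312 repertoire

print(japanese_bytes.hex(), chinese_bytes.hex())

# Decoding either byte string yields the same single Unicode code point,
# so the string alone no longer says which language's glyph is intended.
from_japanese = japanese_bytes.decode("euc_jp")
from_chinese = chinese_bytes.decode("gb2312")

print(f"U+{ord(from_japanese):04X}", f"U+{ord(from_chinese):04X}")
print(from_japanese == from_chinese)
```

Once both sources collapse to one code point, the choice between the Chinese and Japanese glyph shapes has to come from information outside the character itself.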

HL7 Japan’s answer to HL7 v.3
- In XML, UNICODE will be default in
- Even in UNICODE v3.1, the “over-unification” problem is not solved.
- But with XML schema and XML namespace, font information can be set in each tag.
  - By this, a Korean name with a Japanese address can be described.
- Original UNICODE dream (all languages at the same time) is gone, but “many 1 byte languages + one 2 byte language” is not bad.
  - Pokémon
- Answer: “UNICODE can be default, provided that we can continue to use each local practice now being used.”
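Per-tag language information can be sketched with the standard library. This example assumes the per-tag information is a language tag carried in the standard `xml:lang` attribute, from which a renderer would choose a font; the slides do not say which attribute HL7 Japan had in mind. The element names and patient data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Qualified name of the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical patient element; tag names and values are made up.
patient = ET.Element("patient")

name = ET.SubElement(patient, "name", {XML_LANG: "ko"})
name.text = "김철수"

address = ET.SubElement(patient, "address", {XML_LANG: "ja"})
address.text = "静岡県浜松市"

document = ET.tostring(patient, encoding="unicode")
print(document)

# A receiver can read the language tag back and pick a suitable font.
parsed = ET.fromstring(document)
languages = {child.tag: child.get(XML_LANG) for child in parsed}
print(languages)
```

Tagging each element separately is what lets one record hold a Korean name next to a Japanese address without relying on the code points to convey the language.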

Language representation is not the only issue
- Language used in:
  - Conversation with patients
  - School education
    - Medical, Nurse, Technicians
  - Medical record
    - Signs and symptoms
    - Reports
- Structure of data types
  - Address
    - 250 Wu-Hsing street
    - Handa cho