Published by Edwina Hardy. Modified over 9 years ago.
1
Introduction of XML & XHTML
Webmaster - Fort Collins, CO
Copyright © XTR Systems, LLC
Overview of XML & XHTML
Instructor: Joseph DiVerdi, Ph.D., MBA
2
Character Sets
A Brief Digression...
3
Character Sets
Character
– A Unit of a Written Language System
  ay, bee, see, dee, eff, gee, aych, eye
Glyph
– An Actual Printed or Displayed Character
  = a b c 5, $ ó
4
Character Sets
A Character May Associate With Several Glyphs
– Close Quote: " or »
A Glyph May Correspond to Several Characters
– Comma: Pause in a Sentence, or Decimal Indicator in Certain Languages
5
Character Sets
Each Character Is Assigned
– A Specific Numeric Value
Number of Characters in a Character Set
– Limited by the Bit-depth of Its Encoding
  8-Bit Encoded Character Set - 256 characters
  16-Bit Encoded Character Set - 65,536 characters
HTML v2.0 & v3.2 are based on ISO 8859-1
– 8-Bit Character Set AKA Latin-1
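The two ideas on this slide (one numeric value per character, and a capacity fixed by bit depth) can be illustrated with a minimal Python sketch. The helper name `charset_capacity` is hypothetical and exists only for this example.

```python
# Each character is assigned one numeric value.
for ch in ["A", "0", "±"]:
    print(repr(ch), ord(ch))


def charset_capacity(bits: int) -> int:
    """Hypothetical helper: number of distinct values that fit in `bits` bits."""
    return 2 ** bits


print(charset_capacity(8))   # 256
print(charset_capacity(16))  # 65536
```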
6
Character Sets
ISO-8859-1 Character Set
– 8-Bit Depth
  First 128 Values From US-ASCII

Numeric Value  Glyph  Description
13             CR     carriage return
48             0      digit zero
65             A      uppercase aye
94             ^      caret
177            ±      plus-or-minus
191            ¿      inverted question mark
255            ÿ      lowercase wye w/umlaut
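The value-to-glyph mapping can be checked directly, since Python ships an `iso-8859-1` codec. This is only a sketch; `latin1_glyph` is a hypothetical helper name introduced for the example.

```python
def latin1_glyph(value: int) -> str:
    """Hypothetical helper: decode one byte value as ISO-8859-1."""
    return bytes([value]).decode("iso-8859-1")


# Printable entries only; value 13 (carriage return) is a control character.
for value in [48, 65, 94, 177, 191, 255]:
    print(value, latin1_glyph(value))

# The first 128 values decode the same way as US-ASCII.
ascii_range = bytes(range(128))
same_as_ascii = ascii_range.decode("iso-8859-1") == ascii_range.decode("ascii")
print(same_as_ascii)
```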
7
Character Sets (continued)
Common character sets (8-bit unless noted)
ISO 8859-1  Latin-1
ISO 8859-5  Cyrillic
ISO 8859-6  Arabic
ISO 8859-7  Greek
ISO 8859-8  Hebrew
SHIFT_JIS   Japanese (multi-byte)
EUC_JP      Japanese (multi-byte)
8
Uses of Character Sets
Language    Code  Character Sets
French      fr    iso-8859-1
Greek       el    iso-8859-7
Hebrew      iw    iso-8859-8
Hungarian   hu    iso-8859-2
Icelandic   is    iso-8859-1
Italian     it    iso-8859-1
Japanese    ja    shift_jis, iso-2022-jp, euc-jp
Romanian    ro    iso-8859-2
Russian     ru    koi8-r, iso-8859-5
Serbian     sr    iso-8859-5
Slovak      sk    iso-8859-2
Spanish     es    iso-8859-1
Turkish     tr    iso-8859-9
Ukrainian   uk    iso-8859-5
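Why the choice of character set matters per language: the same byte value stands for a different character in each set. A small sketch, assuming Python's built-in codecs for the ISO 8859 family; the byte 0xE1 is an arbitrary pick for illustration.

```python
raw = bytes([0xE1])  # one byte, value 225

# Decode the same byte under several of the character sets listed above.
decoded = {
    charset: raw.decode(charset)
    for charset in ["iso-8859-1", "iso-8859-5", "iso-8859-7", "iso-8859-8"]
}

for charset, glyph in decoded.items():
    print(charset, glyph)
```

Each line prints a different letter (Latin, Cyrillic, Greek, Hebrew) for the identical byte.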
9
Character Sets (continued)
256 Characters are Sufficient
– For Certain Languages
Insufficient for Others
– Japanese (kanji)
– Chinese
– Korean
– Vietnamese
Hence the Need For
– 16-Bit Encoded Character Sets
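The limit of a 256-character set shows up as soon as text outside it has to be encoded. A minimal sketch; `fits_in` is a hypothetical helper, and the sample strings are illustrative only.

```python
def fits_in(text: str, charset: str) -> bool:
    """Hypothetical helper: True if every character of `text` exists in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


print(fits_in("café", "iso-8859-1"))    # True
print(fits_in("日本語", "iso-8859-1"))   # False
print(fits_in("日本語", "shift_jis"))    # True
```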
10
Character Sets
16-Bit Encoded Character Sets
– Two Contiguous Bytes Represent One Character
  65,536 Possible Characters in One Set
– Unicode Began as a 16-bit Character Set
  Developed by the Unicode Consortium
  Since Version 2.0 It Extends Beyond 16 Bits (Code Points up to U+10FFFF)
– Practically Identical to ISO 10646-1
  First 256 Slots Allocated to ISO 8859-1
– Backwards Compatible (woo-hoo!)
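Both points (two bytes per character, and the first 256 slots matching ISO 8859-1) can be checked in a few lines. This sketch assumes UTF-16 (big-endian, no byte-order mark) as the concrete 16-bit encoding, which the slide does not name, and uses only characters that fit in a single 16-bit unit.

```python
text = "Aÿ"
encoded = text.encode("utf-16-be")  # assumption: UTF-16 as the 16-bit encoding

# Two contiguous bytes per character.
print(len(text), len(encoded))

# The first 256 Unicode code points coincide with ISO 8859-1 byte values.
latin1_compatible = all(
    ord(bytes([value]).decode("iso-8859-1")) == value for value in range(256)
)
print(latin1_compatible)
```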
11
Character Sets
A Brief Digression...
Bottom Line
– Specify Your Encoding As Required
– Important For
  International Applications
  Multi-Lingual Applications
There, now you know about it.
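In XML, specifying the encoding means putting it in the XML declaration. The sketch below uses Python's standard `xml.etree.ElementTree` to serialize a tiny document as ISO-8859-1 and read it back; the element name `p` and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET

para = ET.Element("p")          # illustrative element
para.text = "¿Qué tal?"

# Serializing to bytes with a non-UTF-8 encoding emits an XML declaration
# that names the encoding, so a parser knows how to read the bytes.
data = ET.tostring(para, encoding="iso-8859-1")
print(data)

# The parser honors the declared encoding and recovers the original text.
parsed = ET.fromstring(data)
print(parsed.text)
```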