Introduction of XML & XHTML Webmaster - Fort Collins, CO Copyright © XTR Systems, LLC Overview of XML & XHTML Instructor: Joseph DiVerdi, Ph.D., MBA
Character Sets
A Brief Digression...

Character Sets
Character
  – A Unit of a Written Language System
  – ay, bee, see, dee, eff, gee, aych, eye
Glyph
  – An Actual Printed or Displayed Character
  – = a b c 5, $ ó

Character Sets
A Character May Associate With Several Glyphs
  – Close Quote: " or »
A Glyph May Correspond to Several Characters
  – Comma: Pause in a Sentence, or Decimal Indicator in Certain Languages

Character Sets
Each Character Is Assigned
  – A Specific Numeric Value
Number of Characters in a Character Set
  – Limited by the Bit-depth of Its Encoding
    8-Bit Encoded Character Set - 256 characters
    16-Bit Encoded Character Set - 65,536 characters
HTML v2.0 & v3.2 are based on ISO 8859-1
  – 8-Bit Character Set AKA Latin-1

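The relationship between bit depth and character count is simply a power of two. A minimal sketch follows; the function name `charset_capacity` is made up for illustration and does not come from the slides.

```python
def charset_capacity(bit_depth: int) -> int:
    """Return how many distinct numeric values fit in the given bit depth."""
    return 2 ** bit_depth


# Compare 7-bit US-ASCII, an 8-bit set, and a 16-bit set.
for bits in (7, 8, 16):
    print(f"{bits}-bit: {charset_capacity(bits):,} characters")
```
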
Character Sets
ISO 8859-1 Character Set
  – 8-Bit Depth
  – First 128 Values From US-ASCII

Numeric Value   Glyph   Description
13              CR      carriage return
48              0       digit zero
65              A       uppercase aye
94              ^       caret
177             ±       plus-or-minus
191             ¿       inverted question mark
255             ÿ       lowercase wye w/umlaut

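One way to inspect such numeric values is to decode single bytes with Python's built-in "latin-1" codec, which implements ISO 8859-1. This is only an illustrative sketch; the helper name `latin1_char` is an assumption, not part of the course material.

```python
def latin1_char(value: int) -> str:
    """Decode one byte value (0-255) as an ISO 8859-1 (Latin-1) character."""
    return bytes([value]).decode("latin-1")


# Print a few printable values from both halves of the 8-bit range.
for value in (48, 65, 94, 177, 191, 255):
    print(value, latin1_char(value))

# The first 128 values decode identically under US-ASCII and Latin-1.
same_as_ascii = all(
    bytes([v]).decode("ascii") == latin1_char(v) for v in range(128)
)
print("first 128 values match US-ASCII:", same_as_ascii)
```
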
Character Sets (continued)
Common 8-bit character sets
ISO 8859-1   Latin-1
ISO 8859-5   Cyrillic
ISO 8859-6   Arabic
ISO 8859-7   Greek
ISO 8859-8   Hebrew
SHIFT_JIS    Japanese (multi-byte)
EUC_JP       Japanese (multi-byte)

Uses of Character Sets

Language     Code   Character Sets
French       fr     iso-8859-1
Greek        el     iso-8859-7
Hebrew       iw     iso-8859-8
Hungarian    hu     iso-8859-2
Icelandic    is     iso-8859-1
Italian      it     iso-8859-1
Japanese     ja     shift_jis, iso-2022-jp, euc-jp
Romanian     ro     iso-8859-2
Russian      ru     koi8-r, iso-8859-5
Serbian      sr     iso-8859-5
Slovak       sk     iso-8859-2
Spanish      es     iso-8859-1
Turkish      tr     iso-8859-9
Ukrainian    uk     iso-8859-5

Character Sets (continued)
256 Characters are Sufficient
  – For Certain Languages
Insufficient for Others
  – Japanese (kanji)
  – Chinese
  – Korean
  – Vietnamese
Hence the Need For
  – 16-Bit Encoded Character Sets

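The 256-character limit can be seen directly by trying to encode text into an 8-bit character set. The sketch below uses Python's "latin-1" codec as the 8-bit set; the function name `fits_in_latin1` and the sample strings are chosen here for illustration only.

```python
def fits_in_latin1(text: str) -> bool:
    """Return True if every character of text exists in ISO 8859-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        # At least one character has no slot among the 256 available values.
        return False


print(fits_in_latin1("café"))   # Western European text fits
print(fits_in_latin1("漢字"))    # kanji do not fit in an 8-bit set
```
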
Character Sets
16-Bit Encoded Character Sets
  – Two Contiguous Bytes Represent One Character
  – 65,536 Possible Characters in One Set
Unicode Was Designed as a 16-bit Character Set (Later Versions Extend Beyond 65,536 Characters)
  – Developed by the Unicode Consortium
  – Practically Identical to ISO 10646
  – First 256 Slots Allocated to ISO 8859-1
  – Backwards Compatible (woo-hoo!)

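Both points, two bytes per character and compatibility with the first 256 slots, can be checked with a short script. This sketch assumes UTF-16 (big-endian, no byte-order mark) as the two-byte encoding, which the slide does not name; the function names are hypothetical.

```python
def utf16_byte_length(text: str) -> int:
    """Number of bytes text occupies in UTF-16 (big-endian, no BOM)."""
    return len(text.encode("utf-16-be"))


def first_256_match_latin1() -> bool:
    """Check that Unicode values 0-255 map to the same byte values in ISO 8859-1."""
    return all(chr(v).encode("latin-1") == bytes([v]) for v in range(256))


# Each of these characters takes two contiguous bytes.
print(utf16_byte_length("A"))      # 2
print(utf16_byte_length("Aé漢"))    # 6
print(first_256_match_latin1())    # True
```

Characters outside the original 65,536 slots take four bytes in UTF-16, so the two-byte rule holds only for the characters that fit in the 16-bit range.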
Character Sets
A Brief Digression...
Bottom Line
  – Specify Your Encoding As Required
  – Important For
    International Applications
    Multi-Lingual Applications
There, now you know about it.
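For an XML document, specifying the encoding usually means naming it in the XML declaration. The following is a hedged sketch using Python's standard `xml.etree.ElementTree` parser; the sample document, the `greeting` element, and the function name `parse_with_declared_encoding` are invented for illustration.

```python
import xml.etree.ElementTree as ET


def parse_with_declared_encoding(raw: bytes) -> str:
    """Parse XML bytes, letting the parser honor the declared encoding."""
    root = ET.fromstring(raw)
    return root.text


# The declaration tells the parser the bytes are ISO 8859-1, not UTF-8.
document = '<?xml version="1.0" encoding="ISO-8859-1"?><greeting>¿Qué tal?</greeting>'
raw = document.encode("iso-8859-1")

print(parse_with_declared_encoding(raw))

# The same bytes are not valid UTF-8, so guessing the wrong encoding fails.
try:
    raw.decode("utf-8")
except UnicodeDecodeError:
    print("not valid UTF-8: the declared encoding matters")
```
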