How to use Unicode on your computer. A field linguist’s guide to making long-lasting texts and databases. Michael Appleby, Eastern Michigan University. LSA Meeting 2007.
Introduction. This talk will cover: Unicode fonts; Unicode data entry; tools that can help you. Using Unicode does not need to be difficult!
The challenge: Part 1. Your keyboard might look like this: But you want to enter this:
The challenge: Part 2. Ensuring your text appears correctly for whoever may view your document, regardless of computer and regardless of operating system. Use Unicode and achieve interoperability.
Unicode Fonts. Find a font covering the ranges you need: IPA Extensions (U+0250 – U+02AF); Spacing Modifier Letters (U+02B0 – U+02FF); Combining Diacritical Marks (U+0300 – U+036F). Websites:
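As a rough illustration of the three ranges above, the following Python sketch reports which of them a piece of text draws on, which may help when deciding what a font has to cover. The function names `block_of` and `blocks_needed` are invented for this example; only the range boundaries come from the slide.

```python
# Block boundaries as listed on the slide (inclusive code point ranges).
BLOCKS = {
    "IPA Extensions": (0x0250, 0x02AF),
    "Spacing Modifier Letters": (0x02B0, 0x02FF),
    "Combining Diacritical Marks": (0x0300, 0x036F),
}


def block_of(char):
    """Return the name of the listed block containing char, or None."""
    cp = ord(char)
    for name, (low, high) in BLOCKS.items():
        if low <= cp <= high:
            return name
    return None


def blocks_needed(text):
    """Return the set of listed blocks that the characters of text fall in."""
    return {name for name in map(block_of, text) if name is not None}


if __name__ == "__main__":
    # t + modifier letter small h + a + combining tilde
    sample = "t\u02b0a\u0303"
    print(sorted(blocks_needed(sample)))
```

Characters outside the three listed ranges (plain Latin letters, for instance) are simply ignored, so this is a coverage hint and not a full font check.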
Unicode Fonts. Doulos SIL Unicode; Charis SIL Unicode; Lucida variants (Lucida Sans Unicode, Lucida Grande); Arial Unicode MS; TITUS, Code2000, many others. Websites:
Unicode Input. Useful tools are built into the software and operating system you already have; they are convenient for occasional use. For more intensive use, dedicated software is recommended.
Unicode Input: Character Map. Windows XP character map:
Unicode Input: Copy/Paste. E.g. copy and paste from Unicode character pages:
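Unicode Input: Key combinations. Windows XP: Do not use the decimal Alt-XXX codes. Instead, hold Alt, press + on the numeric keypad, then type the four-digit hex code (e.g. Alt, +, 00E9 for é). Some applications support typing the hex code followed by Alt-x. Mac OS X: Set up the Unicode Hex Input keyboard, then hold Option and type the four-digit hex code (e.g. Option-00E9).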
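The hex codes typed in these key combinations are simply Unicode code points written in base 16. A minimal Python sketch of the correspondence follows; the helper names `from_hex` and `to_hex` are made up for this example.

```python
def from_hex(code):
    """Convert a hex code such as '00E9' or 'U+00E9' to its character."""
    code = code.strip().upper()
    if code.startswith("U+"):
        code = code[2:]
    return chr(int(code, 16))


def to_hex(char):
    """Return the U+XXXX notation for a single character."""
    return f"U+{ord(char):04X}"


if __name__ == "__main__":
    print(from_hex("00E9"))    # the character typed with Alt, +, 00E9
    print(to_hex("\u0283"))    # the hex code to type for IPA esh
```

Looking up a code this way can be a quick check that the four digits you are about to type really produce the character you expect.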
Unicode Input: Keyboards. A far faster way of inputting a lot of text. Windows XP: Tavultesoft Keyman; SIL IPA keyboard for Keyman.
Unicode Input: Keyboards. Mac OS X: Ukelele. Other software:
Choosing the right character. Many Unicode characters look similar. IPA characters β and θ are in the Greek range, but do not use mathematical ∫ (U+222B) for IPA ʃ (U+0283). ã can be U+00E3 (precomposed), or it can be a (U+0061) plus ◌̃ (U+0303). For aspiration, use ʰ (U+02B0).
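Both points on this slide can be checked with Python's standard `unicodedata` module. The sketch below assumes nothing beyond the standard library; the helper name `describe` is invented for this example. It shows that the two spellings of ã differ as raw strings but become equal after normalization, and that look-alike characters can be told apart by their official names.

```python
import unicodedata

precomposed = "\u00e3"   # ã as a single code point
decomposed = "a\u0303"   # a followed by combining tilde

# The two strings render alike but are not equal code point for code point.
raw_equal = precomposed == decomposed

# Normalizing both to the same form (NFC composes, NFD decomposes) makes them comparable.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed
nfd_equal = unicodedata.normalize("NFD", precomposed) == decomposed


def describe(text):
    """Return (code point, official Unicode name) for each character in text."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c, "UNNAMED")) for c in text]


if __name__ == "__main__":
    print(raw_equal, nfc_equal, nfd_equal)
    # Integral sign versus IPA esh, then the aspiration modifier letter.
    for code, name in describe("\u222b\u0283\u02b0"):
        print(code, name)
```

Normalizing text to one form before searching or sorting a database avoids missing matches that differ only in how a diacritic was entered.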
Legacy data. An example of conversion to Unicode: SIL provide a guide on converting legacy documentation. It is much easier to use Unicode in the first place.
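To give a feel for what such a conversion involves, here is a minimal Python sketch. It assumes a legacy font that displayed phonetic symbols in place of ordinary capital letters. The mapping table is entirely hypothetical and is not taken from any real font or from the SIL guide; a real conversion needs the documented mapping for the font actually used.

```python
# HYPOTHETICAL mapping for illustration only: legacy keystroke -> Unicode character.
LEGACY_TO_UNICODE = {
    "S": "\u0283",  # LATIN SMALL LETTER ESH
    "N": "\u014b",  # LATIN SMALL LETTER ENG
    "T": "\u03b8",  # GREEK SMALL LETTER THETA
}

# str.maketrans builds a lookup table usable with str.translate.
TABLE = str.maketrans(LEGACY_TO_UNICODE)


def convert_legacy(text):
    """Replace each mapped legacy character; leave all other characters unchanged."""
    return text.translate(TABLE)


if __name__ == "__main__":
    print(convert_legacy("SiN"))
```

A table like this should be applied only to the stretches of text that were set in the legacy font, since ordinary capital letters elsewhere in the document would otherwise be converted too.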
Summary of resources. Information in this presentation is from: Unicode; Alan Wood; E-MELD School of Best Practice; SIL International. Mailing List: