U.S. Government Language Requirements
7 September 2000
Everette Jordan
Department of Defense
(301)

Need for Language Technology
• Over 30,000 language professionals in the U.S. government
  – Many more at state and local levels
• Extensive material in foreign languages, often in legacy formats/encodings
  – More than 50 percent of the Library of Congress is non-English
  – Many kinds of applications
• Assimilation and dissemination
• Extensive collaboration
• Need for a large number of languages
  – Disaster relief (e.g., Haiti) is unpredictable
  – Extensive multinational efforts
  – Language list now being reviewed
  – Short list as follows:

U.S. Government Languages (List being updated)
• Afrikaans
• Albanian
• Amharic
• Arabic
• Armenian
• Aymara
• Azerbaijani
• Bangla
• Basque
• Belarusian
• Bengali
• Bosnian
• Bulgarian
• Burmese
• Cantonese
• Catalan
• Chinese
• Croatian
• Czech
• Danish
• Dari
• Dutch
• English
• Estonian
• Farsi
• Finnish
• French
• Georgian
• German
• Greek
• Guarani
• Haitian Creole
• Hausa
• Hebrew
• Hindi
• Hungarian
• Icelandic
• Indonesian
• Italian
• Japanese

Languages (Continued)
• Kazakh
• Khmer/Cambodian
• Kinyarwanda
• Kirundi
• Korean
• Kurdish
• Lao
• Latvian
• Lithuanian
• Macedonian
• Mongolian
• Nepali
• Norwegian
• Pashto
• Polish
• Portuguese
• Romanian
• Russian
• Serbian
• Sinhalese
• Slovak
• Slovenian
• Spanish
• Swahili
• Swedish
• Tagalog
• Thai
• Tibetan
• Tigrigna
• Turkish
• Ukrainian
• Urdu
• Uzbek
• Vietnamese

Today
[Diagram labels: Translators and limited Machine Translation; Analyst]

[Diagram labels: Archives; Identify language, content, and importance; Integrated Collaborative Translation Space with Shared Tools; Fast Routing; Translated and/or Tagged; Expert]

Types of Technology Needed
• Browsers
• Text Processors
• Web Page Tools
• OCR
• MT
• Search Engines
• Translation Managers
• Language Learning
• Dictionaries
• Thesauri
• Developers Kits
• Info Extraction & Summarization
• Knowledge Management
• Visualization
• Other

Special Requirements
• Unicode
• UTF-8
• Other major code sets
• Code set conversions (extensible)
• Language and encoding ID
• Mixed languages
  – Per page
  – Per database
• English interfaces
• English sys admin
• Work with Microsoft and/or Sun (non-localized)
• Work well with other COTS applications
• Easy training
• Good U.S. support
• Comply with W3C guidelines for accessibility
• Enable easy extensibility by government
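
As an illustration of the code set conversion, encoding ID, and mixed-language requirements above, here is a minimal Python sketch. It is not taken from the presentation: the candidate list, the function names (register_encoding, identify_encoding, to_utf8, scripts_in_text), and the trial-decoding heuristic are all assumptions chosen for illustration. Trial decoding is a crude form of encoding ID, and reading the script from Unicode character names only approximates language ID; a production tool would likely use statistical detection.

```python
import codecs
import unicodedata
from typing import List, Optional, Set

# Hypothetical candidate list, tried in order. Strict encodings go first;
# the last entry is a single-byte fallback that accepts almost any input.
CANDIDATE_ENCODINGS: List[str] = ["utf-8", "koi8-r"]


def register_encoding(name: str) -> None:
    """Extend the candidate list, placing the new code set before the fallback."""
    codecs.lookup(name)  # raises LookupError if Python has no such codec
    if name not in CANDIDATE_ENCODINGS:
        CANDIDATE_ENCODINGS.insert(len(CANDIDATE_ENCODINGS) - 1, name)


def identify_encoding(data: bytes, candidates: Optional[List[str]] = None) -> str:
    """Return the first candidate that decodes the bytes without error."""
    for name in candidates or CANDIDATE_ENCODINGS:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError("no candidate encoding could decode the data")


def to_utf8(data: bytes, source_encoding: Optional[str] = None) -> bytes:
    """Convert legacy-encoded bytes to UTF-8, guessing the source if not given."""
    encoding = source_encoding or identify_encoding(data)
    return data.decode(encoding).encode("utf-8")


def scripts_in_text(text: str) -> Set[str]:
    """Report the scripts present on a page, using the first word of each
    letter's Unicode name (e.g. LATIN, CYRILLIC, ARABIC) as a rough label."""
    scripts: Set[str] = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts
```

Calling to_utf8(raw_bytes, "koi8-r") converts a document whose code set is already known, while scripts_in_text(page) flags pages that mix scripts and may need per-page handling. Keeping the candidate list as data rather than hard-coding it is one way to approach the "extensible" requirement.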