Multilingual, Multi-script Catalog Requirements (An Arcadia Project)
January 29, 2010



Jan 2010 Outline
Background about the Arcadia non-Roman script project
Introductions
Orbis vs. YUFind and systems like YUFind
Requirements discussion
Wrap-up

Project Goals
Gap analysis of multilingual, multi-script functionality in Lucene-Solr-Solrmarc discovery applications (e.g., YUFind)
Identification of desirable functionality
Collaboration opportunities, community interest
Recommendations with level-of-effort analysis

Orbis vs. YUFind

vs Chinese example: “中日韩经济合作的新起点” N-gram tokens, where N=2:
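The bigram tokenization named on this slide can be sketched in a few lines. This is a plain Python illustration of overlapping character n-grams, not the Lucene-Solr analyzer configuration used by YUFind; the function name `char_ngrams` is hypothetical.

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return the overlapping character n-grams of text, left to right."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# The Chinese title used as the example on the slide.
title = "中日韩经济合作的新起点"

# With N=2, every adjacent pair of characters becomes one token.
bigrams = char_ngrams(title, 2)
print(bigrams)
```

Because every adjacent pair is indexed, a query for a two-character word such as 经济 or 合作 can match without any dictionary-based word segmentation, at the cost of also indexing pairs that are not words.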

Background: Non-Roman (NR) Scripts in Catalog Records

JACKPHY
Japanese, Arabic, Chinese, Korean, Persian, Hebrew, Yiddish

One-to-Many (CJK)
Example: “Mao Zedong”
毛泽东 Simplified
毛澤東 Traditional
毛沢東 Kanji (Modern)

One-to-Many (CJK)
“Mao Zedong” in simplified Chinese characters retrieves 527 results.

One-to-Many (CJK)
The same search in traditional Chinese characters yields 154 hits. Also note the paired fields.
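One way to make the three written forms retrieve the same records is to fold variant characters to a single form before indexing and before searching. The sketch below assumes a tiny, hypothetical mapping table that covers only the characters in the “Mao Zedong” example; a real system would need a full variant table, which is not shown here.

```python
# Hypothetical toy table: only the variant characters from the slide's example.
VARIANT_TO_TRADITIONAL = {
    "泽": "澤",  # simplified Chinese -> traditional
    "沢": "澤",  # modern Japanese kanji -> traditional
    "东": "東",  # simplified Chinese -> traditional
}
_TABLE = str.maketrans(VARIANT_TO_TRADITIONAL)


def fold_variants(text: str) -> str:
    """Map known variant characters to one form; leave other characters alone."""
    return text.translate(_TABLE)


forms = ["毛泽东", "毛澤東", "毛沢東"]

# All three spellings collapse to one search key.
keys = {fold_variants(form) for form in forms}
print(keys)
```

Applying the same folding to both the indexed text and the query is what removes the gap between the 527-result and 154-result searches; the original strings can still be stored unchanged for display.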

One-to-Many (Digraphs)
ווירטשאפט
The Yiddish word “Virtshaft” is entered here with two separate vavs (i.e., keystroke ‘u’ in Microsoft’s Hebrew IME): U+05D5 + U+05D5

One-to-Many (Digraphs)
N = 49 results

One-to-Many (Digraphs)
װירטשאפט
The same word is this time entered with a double-vav digraph, U+05F0 (via the MS Hebrew IME key combination right-alt+u)

One-to-Many (Digraphs)
N = 11 results
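The two result counts differ because the two inputs are different code point sequences. A minimal sketch of one possible fix is to expand the Yiddish digraph characters into their two-letter sequences before indexing and searching. The slide mentions only the double vav (U+05F0); the other two entries in the table are the neighbouring Yiddish digraph code points, added here as an assumption about what a catalog would also want to fold. The function name `fold_digraphs` is hypothetical, and the sketch uses an explicit mapping rather than relying on Unicode normalization.

```python
# Explicit digraph -> letter-pair mapping (escapes avoid bidi display issues).
YIDDISH_DIGRAPHS = {
    "\u05F0": "\u05D5\u05D5",  # double vav -> vav + vav (the slide's case)
    "\u05F1": "\u05D5\u05D9",  # vav yod    -> vav + yod (assumed addition)
    "\u05F2": "\u05D9\u05D9",  # double yod -> yod + yod (assumed addition)
}


def fold_digraphs(text: str) -> str:
    """Replace each Yiddish digraph character with its two separate letters."""
    for digraph, pair in YIDDISH_DIGRAPHS.items():
        text = text.replace(digraph, pair)
    return text


# "Virtshaft" typed with two separate vavs, then with the digraph.
two_vavs = "\u05D5\u05D5\u05D9\u05E8\u05D8\u05E9\u05D0\u05E4\u05D8"
digraph = "\u05F0\u05D9\u05E8\u05D8\u05E9\u05D0\u05E4\u05D8"

print(two_vavs == digraph)                                # different raw input
print(fold_digraphs(two_vavs) == fold_digraphs(digraph))  # same search key
```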

NR Spelling Suggestions
Unhelpful suggestion?

Labels and Facets
Should the script/language of the query determine the script/language of the facets?

Labels and Facets
Better would be:
杉本つとむ, (11)
高橋幹夫, (11)
野口武彦. (8)
渡辺信一郎, (7)
OR:
Sugimoto, Tsutomu, (11)
Takahashi, Mikio, (11)
Noguchi, Takehiko. (8)
Watanabe, Shin’ichirō, (7)
But not both mixed together. Let the end user decide?

Labels and Facets
We would like to choose our preferred display script here. For example:
江戸
By: 野村兼太郎, Published: 1942
Format: Book, Electronic Resource
江戶の翻訳家たち
By: 杉本 つとむ, Published: 1995
Format: Book, Electronic Resource
We would like to ask library users which option is best for displaying parallel field data:
江戶 / 田中優子編.
Contributors: 田中優子,
Format: Book
Language: Japanese
Published: 東京 : 作品社,
Series: 日本の名随筆. 03 别卷 ; 94
江戶 / 田中優子編. Edo / Tanaka Yūko hen.
Contributors: 田中優子, Tanaka, Yūko,
Format: Book
Language: Japanese
Published: 東京 : 作品社, Tōkyō : Sakuhinsha,
Series: 日本の名随筆. 03 别卷 ; 94 Nihon no meizuihitsu. 03 Bekkan ; 94

Language/Script of Interface
OCLC’s brief record display
Interface easily flipped to one of several languages

Language/Script of Interface
OCLC’s detailed record display with Japanese language interface

Language/Script of Interface
OCLC WorldCat.org localizes labels and instructions as well as mapped facet values. The examples here are in Chinese.

Language/Script of Interface

Language/Script of Interface & Text Directionality

Sorting of Results
江戸文学俗信辞典 Edo bungaku zokushin jiten
江戸文学地名辞典 Edo bungaku chimei jiten
江戸文学辞典 Edo bungaku jiten
江戸文様辞典 Edo mon’yo jiten

Sorting of Results
Also note bidirectional text

Sorting within Result Sets: Options to Consider
For multiple languages sharing a script (e.g., Chinese ideographs, Arabic, Hebrew, or Latin), how would users prefer to see result sets sorted? We consider here the Chinese and Arabic cases…

Sorting within Result Sets: Options to Consider
Sorting of results returned in Chinese script. Three sort strategies: (a) sort by romanized equivalents; (b) sort by pronunciation; or (c) sort by radical-stroke?
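Strategy (a) can be sketched with paired title data, assuming each record carries both the original-script title and its romanized parallel. The four records below are the Japanese titles shown on the earlier “Sorting of Results” slide, used only as sample data. Strategies (b) and (c) need pronunciation or radical-stroke data that is not part of the Python standard library, so they are not shown; raw code point order is included only as a baseline for contrast.

```python
# (original-script title, romanized title) pairs taken from the slide.
records = [
    ("江戸文学俗信辞典", "Edo bungaku zokushin jiten"),
    ("江戸文学地名辞典", "Edo bungaku chimei jiten"),
    ("江戸文学辞典", "Edo bungaku jiten"),
    ("江戸文様辞典", "Edo mon’yo jiten"),
]

# Strategy (a): sort on the romanized parallel field, ignoring case.
by_romanized = sorted(records, key=lambda rec: rec[1].casefold())

# Baseline: sort on the original script by raw Unicode code point.
by_code_point = sorted(records, key=lambda rec: rec[0])

for original, romanized in by_romanized:
    print(original, romanized)
```

For this sample the two orders differ: romanized sorting puts “chimei” first, while code point order keeps the sequence shown on the slide.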

Sorting within Result Sets: Arabic Script
How to handle additional Arabic-script characters in use for languages such as Persian, Kurdish, and/or Urdu?
ڤ (vah, derived from ﻑ, fah)
پ (pah)
ﭺ (chah, derived from ج, ǧim)
گ (gaf)
ژ (zāī, derived from ز, zayin)
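Before a sort order for these letters is chosen, it may help to check which code points are actually present in the data, since the same letter can arrive either as a base letter or as a compatibility presentation form. The sketch below uses only Python's standard `unicodedata` module; the helper name `describe` and the sample string are assumptions for illustration.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, str]]:
    """For each character: its code point, Unicode name, and NFKC-folded code point(s)."""
    rows = []
    for ch in text:
        folded = unicodedata.normalize("NFKC", ch)
        folded_points = " ".join(f"U+{ord(c):04X}" for c in folded)
        rows.append((f"U+{ord(ch):04X}", unicodedata.name(ch), folded_points))
    return rows


# Five extended letters; the third is deliberately an isolated presentation form.
sample = "\u06A4\u067E\uFB7A\u06AF\u0698"

for code_point, name, folded in describe(sample):
    print(code_point, name, "->", folded)
```

In this sample the presentation form U+FB7A folds to the base letter U+0686 under NFKC, while the other four letters are unchanged. Folding first means a collation rule only has to place the base letters.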

Discussion
User Needs and Expectations