Problems with Non-roman Character (Korean) Searching. Prepared by Young Ki Lee, Senior Cataloging Specialist.


Problems with Non-roman Character (Korean) Searching
Prepared by Young Ki Lee, Senior Cataloging Specialist, Korean/Chinese Team, RCCD, Library of Congress

Topics to be covered
1. Non-roman script (Korean) searching under CJK data fields without spacing
2. No unified index (normalization) between Hangul (Korean) and Hancha (Chinese character)
3. Microsoft Korean IME
4. Display of search results
5. CJK Compatibility Database

Title Word Search
Search ( : the border):
- The ti: search returns 363 hits.
- Only 13% of the hits are relevant (13 out of 99) in the first group (Books).
- The system picks up records that have the word in any position in the title fields (including across subfields), such as : / : / /, : /, etc.
- In Voyager (currently with spacing), the same search (tkey) retrieves only 9 hits.

[Image-only slide: Search9]


Title Word Search
Search ( : the border):
- The ti: search returns 360 hits.
- Only 13% of the hits are relevant (13 out of 99) in the first group (Books).
- Records that have the word in any position in the title fields (including across subfields) are retrieved, such as = / = / /, = /, etc.
- In the LC Online Catalog (currently with spacing), the title word search retrieves only 9 hits.
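The gap between the two hit counts can be sketched as follows. This is a minimal illustration, not the indexing logic of OCLC or Voyager: it assumes that a no-spacing index behaves like substring matching on titles with the spaces removed, and that a spaced index matches whole words. The search term 국경 ("border") and the three titles are hypothetical examples chosen for illustration, since the original Korean text is not preserved in this transcript.

```python
def unspaced_hits(term, titles):
    """Match the term anywhere in the title once all spaces are removed."""
    needle = term.replace(" ", "")
    return [t for t in titles if needle in t.replace(" ", "")]


def spaced_hits(term, titles):
    """Match the term only as a whole space-delimited word."""
    return [t for t in titles if term in t.split()]


# Hypothetical titles, written with cataloging-style word division.
titles = [
    "국경 의 밤",        # contains the word 국경 (border)
    "한국 경제 의 이해",  # 한국 + 경제: 국경 appears only across a word boundary
    "미국 경찰",         # 미국 + 경찰: same accidental match
]

print(unspaced_hits("국경", titles))  # all three titles
print(spaced_hits("국경", titles))    # only the first title
```

Under these assumptions the unspaced search returns every title in which the last syllable of one word and the first syllable of the next happen to spell the search term, which is the kind of irrelevant hit the slide describes.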

Title Word Search
Search ( : philology):
- In OCLC, the ti: search returns 308 hits.
- Only 37% of the hits are relevant (36 out of 95) in the first group (Books).
- The hits include = = = / = / = = = / = /, = /, etc.
- In Voyager (currently with spacing), the same search (tkey) retrieves 32 hits.

Title Word Search
Search ( : name of ancient Korean country) retrieves irrelevant records, such as = / / / / /, CD-ROM = CD-ROM/ / / / /, = / /, = / / / / / /, = / /, 5 = / / /5 / / / / / / / /, = / / /, etc.

[Image-only slides: Kochoson8, komunso1, Komunso2, Komunso3]

Title Word Search
( : Korean Economy): ti: search
- search : 300 hits
- search : 652 hits
- search : 3 hits
- search : 0 hits
- search Hanguk kyongje: 1,490 hits
Title phrase search for : ti= search


Title Word Search
( : Korean Economy): ti: search
- search : 295 hits
- search : 652 hits
- search : 3 hits
- search : 0 hits
- search Hanguk kyongje: 1,499 hits
Title phrase search for : ti= search

Title Phrase Search
( : Korean Economy): ti: search
- search : 295 hits
- search : 652 hits
- search : 3 hits
- search : 0 hits
- search Hanguk kyongje: 1,490 hits
- search # : 461 hits (ti: AND ti: )
Title phrase search for : ti= search

Search ti: nodongja or or or
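Because there is no unified index between Hangul and Hancha, a searcher has to combine every script form of a word by hand. A minimal sketch of assembling such a query is shown below. The `ti:` label and the `or` operator are taken from the slide; the helper name `build_title_query` is hypothetical, and the Hangul and Hancha forms of nodongja used here (노동자, 勞動者) are assumptions, since the original forms are not preserved in this transcript.

```python
def build_title_query(forms):
    """Join the distinct script forms of one word into a single ti: search."""
    cleaned = (form.strip() for form in forms)
    # dict.fromkeys keeps the first occurrence of each form, in order.
    unique = list(dict.fromkeys(form for form in cleaned if form))
    return "ti: " + " or ".join(unique)


# Romanized, Hangul and Hancha forms (assumed), with one accidental repeat.
query = build_title_query(["nodongja", "노동자", "勞動者", "노동자"])
print(query)
```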

Korean IME Problems
1. Personal name search with an invalid character from the Korean IME
- Search in pn: with the character U+F9E1: 0 hits. U+F9E1 is an invalid character produced by the Korean IME.
- Search in pn: with the character U+674E: 157 hits. U+674E is a valid MARC21 character.
2. Title search with an invalid character from the Korean IME
- Search in ti: with the character U+F941: 0 hits. U+F941 is an invalid character produced by the Korean IME.
- Search in ti: with the character U+8AD6: 21,393 hits. U+8AD6 is a valid MARC21 character.
3. Korean family name
- No MARC 21 equivalent
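Both invalid characters above fall in the Unicode CJK Compatibility Ideographs block (U+F900 to U+FAFF). The sketch below shows one way a search term could be checked and folded before it is sent. It uses Python's standard `unicodedata` module and relies on the fact that Unicode normalization replaces such a compatibility ideograph with its unified equivalent. Treating the normalized character as the valid MARC21 character is an assumption; it holds for the two pairs on this slide. The function names are hypothetical.

```python
import unicodedata


def find_compatibility_ideographs(text):
    """List characters from the CJK Compatibility Ideographs block."""
    return [f"U+{ord(ch):04X}" for ch in text if 0xF900 <= ord(ch) <= 0xFAFF]


def fold_to_unified(text):
    """Replace compatibility ideographs with their canonical equivalents."""
    return unicodedata.normalize("NFC", text)


ime_output = "\uF9E1"  # the character the slide reports as invalid
print(find_compatibility_ideographs(ime_output))
print(f"U+{ord(fold_to_unified(ime_output)):04X}")
```

A few code points in that block are ordinary unified ideographs with no decomposition, so a flagged character is a prompt to check it, not proof that it is invalid.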

Display Order
1. Browse search: sorted by Unicode value (numbers, then roman, Japanese, Hancha, Hangul)
2. Keyword search: sorted in alphabetical order of the romanized form (numbers, then romanization)
3. Display order: character by character on designated value

sort2
Unicode value, total strokes, radical (#: stroke)
- 9280: (gold) 8
- 9580: (gate) 8
- 990A: (eat) 6
- 9B: (ghost) 10
- AC00

[Image-only slide: sort3]

Display Order
1. Browse search: sorted by Unicode value (numbers, then roman, Japanese, Hancha, Hangul)
2. Keyword search: sorted in alphabetical order of the romanized form (numbers, then romanization)
3. Display order: character by character on designated value, NOT word by word
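The browse order in item 1 follows directly from the layout of Unicode: digits, Latin letters, Japanese kana, CJK ideographs and Hangul syllables occupy successively higher code point ranges. A minimal sketch, using Python's built-in string comparison (which compares code point by code point), reproduces that sequence. The sample headings are hypothetical and this is not the catalog's actual sort routine.

```python
# Hypothetical browse headings, one per script, in arbitrary order.
headings = ["가", "金", "か", "Korea", "1945"]

# sorted() compares strings code point by code point.
browse_order = sorted(headings)

for heading in browse_order:
    print(f"U+{ord(heading[0]):04X}  {heading}")
# Order: 1945, Korea, か, 金, 가
```

Because the comparison runs character by character, two headings are ordered by their first differing character regardless of where the word boundaries fall, which matches item 3.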

sort1
- C9C4
- CE68
- C911
- C778


CJK Compatibility Database
1. The CJK Compatibility Database includes more than 450 non-MARC21 Chinese, Japanese and Korean characters, Hangul syllables and diacritic marks, matched with their MARC21 equivalents.
2. The database is intended to enable catalogers to quickly and conveniently replace a non-MARC21 character with its MARC21 equivalent.
3. The list of characters in the database was initially identified by LC staff, and was supplemented by entries in a similar database at Yale University.
4. The database is a cooperative undertaking, and is intended for the use of all CJK catalogers. If you encounter a non-MARC21 character in the course of your work, please report it to us so that it can be added to the database. Notify Young Ki Lee, Senior Cataloging Specialist, Korean/Chinese Team, Library of Congress, at
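The replacement step described in item 2 can be sketched as a simple lookup table. The two pairs below are the ones named on the Korean IME slide (U+F9E1 to U+674E, U+F941 to U+8AD6); the real database holds more than 450 entries whose contents are not reproduced here, and the names `NON_MARC21_TO_MARC21` and `to_marc21` are hypothetical.

```python
# Hypothetical excerpt: non-MARC21 character -> MARC21 equivalent.
NON_MARC21_TO_MARC21 = {
    "\uF9E1": "\u674E",
    "\uF941": "\u8AD6",
}

# str.maketrans accepts a dict whose keys are single characters.
_TABLE = str.maketrans(NON_MARC21_TO_MARC21)


def to_marc21(text):
    """Replace every listed non-MARC21 character; leave all others as they are."""
    return text.translate(_TABLE)


fixed = to_marc21("\uF9E1")
print(f"U+{ord(fixed):04X}")
```

A table keeps the mapping explicit and easy to extend when a newly reported character is added, which suits a cooperatively maintained list.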

Thank you