1
CIG annual conference Sept. 2006
SUNCAT: the creation, maintenance and challenges of a national Union Catalogue of serials in the UK
Natasha Aburrow-Jones, SUNCAT Project Officer

Good afternoon; my name is Natasha Aburrow-Jones, and I am the Bibliographic Project Officer for SUNCAT. I am here today to give you an insight into the trials and tribulations of building a national union catalogue for serials – the fun bits, the data bits, the issues that have been raised as a result of being able to stand back and look at serials cataloguing in the UK.
2
SUNCAT: a brief history
UKNUC Feasibility Study (2001)
SUNCAT Scoping Study (2002)
JISC and RSLP funded
Based in EDINA, with partners: University of Edinburgh and Ex Libris
3 Phases; 2 stages
  Stage 1: pilot (Feb. 2003-July 2006)
  Stage 2: service (Aug )
More detail on our website at:

I won’t dwell on the events that led to SUNCAT being built, as this has been covered in many other SUNCAT sessions over the past three years, and I don’t want to bore you. Suffice it to say, a feasibility study for a national union catalogue of all materials revealed a real need for improved information about serials held in the UK: their existence, their location and their associated holdings. One particular issue brought to light in this study was the variable quality of bibliographic and holdings data in library catalogues – I’ll refer to this later on. It was ultimately decided that there was no need for a National Union Catalogue for monographs, but that there was one for serials, so a scoping study was undertaken.

JISC (Joint Information Systems Committee) and RSLP (Research Support Libraries Programme) put up the funding for this serials Union Catalogue, and the Invitation to Tender was won by EDINA, one of the two national JISC-funded data centres, with their partners, the University of Edinburgh and Ex Libris (who supply Aleph 500, the LMS software that SUNCAT runs on).

Three phases of SUNCAT were envisaged. Phase 1 ran from February 2003 until December 2004 and was concerned with building the database. Phase 2 started in January 2005 and runs until the end of 2006; it is concerned with running the pilot service and consolidating the database. Phase 3, from 2007 onwards, is to be concerned with further consolidation and the running of a steady-state service. A pilot service of SUNCAT was launched in February 2005; in August 2006, SUNCAT became the newest fully-fledged EDINA service.
3
SUNCAT aims
SUNCAT: primary aims
  For researchers, a single tool for the location of serials, including information about access
  For librarians, a central repository of high quality bibliographic records for downloading to local catalogues and a location tool for ILL
Additionally - to raise consciousness of the importance of quality serials information among UK researchers and librarians

Moving on now, I want to highlight the 2 primary aims of SUNCAT which emerged from the feasibility and scoping studies I mentioned earlier. SUNCAT is aimed at a wide range of groups in higher education, with a different purpose for each. For researchers, it is to be a source of information about the location of serials in the UK, including information about access. For librarians, it is to be a source of high quality bibliographic records for downloading, so enabling libraries to upgrade records on their local catalogues, and also to act as a location tool for inter-library loans. An additional, but important, aim is to raise consciousness of the importance of quality serials information among UK researchers and librarians. The UK National Union Catalogue feasibility study identified the variable quality of serials bibliographic and holdings records as a key problem for users seeking serials, and this issue of data quality underpins the vision of SUNCAT as a central source of serials information for the UK research community.
4
What is a serial?
AACR2 (chap.12, 12.0A) talks about:
  “continuing resources”, successively issued (i.e., serials) or integrating (e.g., updating loose-leafs, updating Web sites)
  Certain “categories of finite resources (i.e., those with a predetermined conclusion): resources that exhibit characteristics of serials … but whose duration is limited.”
Online Dictionary for Library and Information Science says:
  serial: A publication in any medium issued under the same title in a succession of discrete parts, usually numbered (or dated) and appearing at regular or irregular intervals with no predetermined conclusion. In AACR2 2002, serials are classified as a type of continuing resource. … Serial publications include print periodicals and newspapers, electronic magazines and journals, annuals (reports, yearbooks, etc.), continuing directories, proceedings and transactions, and numbered monographic series cataloged separately. When serials split, merge, or are absorbed, a title change may occur.
5
Contributing Libraries: Phase 1
British Library; National Library of Scotland; National Library of Wales; Imperial College, London; London School of Economics; Manchester Metropolitan University; Queens University, Belfast; University of Birmingham; University of Bristol; University of Cambridge; University College, London; University of Durham; University of Edinburgh; University of Glasgow; University of Leeds; University of Manchester; University of Newcastle; University of Nottingham; University of Oxford; University of Southampton; University of Wales, Cardiff; University of Warwick

In order to provide a critical mass of serials data, 22 major research libraries were identified to participate in Phase 1 of SUNCAT (Phase 1 ran from Feb. 2003 to Dec. 2004). They included both national and university libraries, selected on the basis of their large and significant research collections. We also purchased the CONSER database and the ISSN Register. With all of these loaded, SUNCAT held over 4 million records, representing an estimated 90% of titles held in the UK.
6
Contributing Libraries: Phase 2
Aberdeen University; Birkbeck College, University of London; Bolton University; British Film Institute; British Geological Survey; British Medical Association; CCLRC; Cranfield University; Edinburgh College of Art; Essex University; Exeter University; Hull University; IET (formerly the IEE); International Institute for Strategic Studies; Kent University; King’s College London; Lancaster University; Leeds Metropolitan University; Liverpool University; London Business School; London School of Hygiene & Tropical Medicine; Manchester Public Libraries; Medical Research Council; Napier University; National Art Library; National Maritime Museum; National Museums & Galleries of Wales; National Museums of Scotland; Natural History Museum; Reading University; Royal Botanic Gardens, Kew; Royal College of Nursing; Royal Geographical Society; Royal Institute of British Architects; Royal Northern College of Music; Royal Society; Royal Society of Medicine; School of Oriental & African Studies, University of London; Sheffield University; Sheffield Hallam University; Society of Antiquaries of London; Stirling University; Sussex University; University of Central Lancashire; University of East Anglia; University of London Research Libraries; University of Ulster; University of Wales, Swansea; Wiener Library, Institute of Contemporary History; Wellcome Library for the History & Understanding of Medicine; Women’s Library, London Metropolitan University; York University; Zoological Society of London

By the end of this year we hope to have received data from around a further 60 libraries. We currently have a commitment from over 50 libraries to participate in SUNCAT and have loaded data from 16 of these, with the rest scheduled to contribute data over the next few months.

As you can see from this list, the range of libraries includes more large university libraries, such as Liverpool, Exeter and East Anglia, but we have also started to include public libraries, such as Manchester Public Libraries, and special libraries and learned societies, including the Wellcome Library, the National Art Library, the Natural History Museum and the Royal Society. Including smaller specialist libraries extends the number of titles on SUNCAT, as these collections include more unique titles not widely held in the UK. It also makes these collections more visible to researchers. The geographical coverage of the catalogue will also be extended with the addition of these libraries, with contributors from as far north as Inverness and Stornoway (on the Isle of Lewis in the Western Isles) right down to Exeter in the south, and from Belfast in the west right across to East Anglia.
7
SUNCAT: technical description
Runs on the Aleph 500 software, supplied by Ex Libris
  Aleph is an LMS used all over the world, and has the extra functionality needed for a union catalogue
A physical union catalogue
  Records from all contributing libraries stored in one central database
  No federated searching involved, to improve search results
Records deduplicated to view at point of searching

SUNCAT runs on the Aleph 500 software, which is the LMS supplied by our partners, Ex Libris. It is used throughout the world, and one of its functions is the union view display, specifically designed for union catalogues. Ex Libris have had experience in building union catalogues before, as Melvyl, the Union Catalogue for the California Digital Library, runs on Aleph.

SUNCAT is a physical union catalogue, in that all records are sent to EDINA, as a central site, and loaded into a single database. There is no federated searching, whereby a search query is directed at multiple sources but returned through one interface. Having a physical union catalogue improves search response times, and has other advantages: there is no need to keep track of changes to the addresses being queried, downtime is within SUNCAT's control, and so forth.

SUNCAT has a deduplicated union view. That is, SUNCAT stores all the individual records from all contributing libraries, matches the records for the same title together, and displays the bibliographically best one, together with its own holdings and those from the other contributing libraries. I’ll talk about this further on.
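The deduplicated union view can be pictured with a small sketch. This is not how Aleph implements it: the record shape, the `match_key` field and the rule for choosing the displayed record (here, simply the record with the most bibliographic fields) are all assumptions made for illustration.

```python
from collections import defaultdict


def union_view(records):
    """Build one display entry per set of matched records.

    Each record is a dict with hypothetical keys:
      library   - name of the contributing library
      match_key - value shared by records judged to be the same title
      bib       - dict of bibliographic fields
      holdings  - summary holdings statement
    """
    sets = defaultdict(list)
    for rec in records:
        sets[rec["match_key"]].append(rec)

    view = []
    for key, members in sets.items():
        # Assumed rule: the record with the most fields is "bibliographically best".
        preferred = max(members, key=lambda r: len(r["bib"]))
        view.append({
            "match_key": key,
            "preferred_from": preferred["library"],
            "bib": preferred["bib"],
            # Holdings from every library in the set are shown together.
            "holdings": [(m["library"], m["holdings"]) for m in members],
        })
    return view
```

All individual records stay in the database; only the display is collapsed, which is why a set can later be split or merged without losing any library's data.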
8
SUNCAT: Processing files from contributing libraries
File of serials titles (bibliographic and holdings records) sent to SUNCAT via ftp
Data specification drawn up by SUNCAT, to harmonize data, and put it into a form suitable for loading into database
Data specification approved by contributing library before data is converted
Rejection reports / character conversion error reports run
Locations tables are added to SUNCAT
Data is loaded into SUNCAT
Further details on our website:

In order to build a union catalogue, we have to have data. SUNCAT is more than a union list of data; it is a collaboration between all contributing libraries and EDINA. This slide covers the basics of what happens when we receive a file from a contributing library. I’ll go into a bit more depth in a moment about the specifics of the cataloguing changes that we make to the data, in order for it to be in a form suitable for loading into SUNCAT.

Essentially, we are prepared to take data in almost any form from our contributing libraries. We like it in MARC21, in a MARC communications format, but have accepted files in various “flavours” of MARC, such as UKMARC, LMS-specific MARC (e.g. Talis MARC, similar to UKMARC), and plenty of non-MARC records. We have received them in MARC communications format, in text format, in Excel spreadsheets, in Word documents… As long as there is some structure to the record, we take it, put it into MARC21 as much as we can, and process the data.
9
SUNCAT: Standard data manipulation
Bibliographic:
  Local control number is placed in 001 tag
  Change in tag 022 (ISSN) lower case “x” to upper case “X”
  Change 245$h[computer file] (or variations thereof) to $h[electronic resource]
  Strip 510 tags (only indicator 1 = 0, 1, 2)
  Change 6XX$xPeriodicals to $vPeriodicals only when it is the last subfield in the tag
Holdings:
  All holdings information is placed in an 852 tag
  Library is described in 852$a; locations in 852$b and $h
  Summary textual statements are in an 852$3
Holdings will look like:
  852 $$a<MARC organization code (if applicable)>$$b<sub-location>$$h<shelf mark>$$3<holdings information>
  L $$aStEdCA$$bA:PE$$3No. 1, Spring No. 5, Autumn/Winter 1996.
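The standard manipulations listed on this slide can be sketched as a small transformation over a simplified record. The record representation (a list of field dicts), the function names, the regular expression for "variations" of [computer file], the tolerance for a trailing full stop on "Periodicals" and the blank 852 indicators are all assumptions for illustration, not SUNCAT's actual conversion code.

```python
import re

# Assumed pattern for "[computer file] (or variations thereof)".
COMPUTER_FILE = re.compile(r"\[\s*computer\s+file\s*\]", re.IGNORECASE)


def normalise(record, local_control_number):
    """Apply the slide's bibliographic rules to a list of field dicts."""
    # Rule 1: the local control number goes in the 001 tag.
    out = [{"tag": "001", "data": local_control_number}]
    for field in record:
        tag = field["tag"]
        if tag == "001":
            continue  # superseded by the local control number
        if "subfields" not in field:
            out.append(dict(field))  # other control fields pass through
            continue
        # Rule 4: strip 510 tags, but only when indicator 1 is 0, 1 or 2.
        if tag == "510" and field.get("ind1") in ("0", "1", "2"):
            continue
        subs = list(field["subfields"])
        if tag == "022":
            # Rule 2: lower case "x" becomes upper case "X" in the ISSN.
            subs = [(code, value.replace("x", "X")) for code, value in subs]
        elif tag == "245":
            # Rule 3: $h[computer file] becomes $h[electronic resource].
            subs = [
                (code, COMPUTER_FILE.sub("[electronic resource]", value))
                if code == "h" else (code, value)
                for code, value in subs
            ]
        elif tag.startswith("6") and subs:
            # Rule 5: $xPeriodicals becomes $v only as the last subfield.
            code, value = subs[-1]
            if code == "x" and value.rstrip(" .") == "Periodicals":
                subs[-1] = ("v", value)
        out.append({**field, "subfields": subs})
    return out


def build_852(statement, org_code=None, sub_location=None, shelf_mark=None):
    """Place holdings in one 852: $a library, $b/$h location, $3 statement."""
    parts = [("a", org_code), ("b", sub_location), ("h", shelf_mark), ("3", statement)]
    return {"tag": "852", "ind1": " ", "ind2": " ",
            "subfields": [(code, value) for code, value in parts if value]}
```

Keeping each rule as a separate branch makes it easy to add the library-specific manipulations that a data specification may call for on top of the standard ones.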
10
SUNCAT: Non-standard data manipulation
Non-MARC libraries:
Two data specifications: one to put data into MARC21 format, and one for manipulation of that data
Usually easy to convert into MARC21, BUT:
  Records will not be catalogued according to AACR2
  Records tend to be minimalistic
  May be problems in matching with other records
11
SUNCAT: Non-standard data manipulation
MARC libraries:
Dependent on LMS
Themes run through different libraries with same LMS
Variations within one type of LMS (due to e.g., historical practices, previous database legacy issues, etc.)
Essentially, every library is treated as unique
No such thing as a “standard” manipulation!
12
SUNCAT: Matching records
Deduplicated union catalogue
Uses a complex matching algorithm
3 stage selection process for matching
“Preferred” record display
List of common titles (LOCT)
Matching above format
13
Matching: The SUNCAT ID
What is it?
  SC-ID is a unique identifier at a title level
What does it do?
  SC-ID links records for the same title together which may not have matched appropriately using the standard algorithm
How does it do this?
  By checking every record, and running a refined matching process before load, and assigning an appropriate SC-ID
What does it look like?
  Stored in the 049$a; 9 digits preceded by “SC” and ending with a 2-digit Modulus 11 check
What are the results?
  Removal of overlapping sets
  Will allow SUNCAT team to merge / separate sets manually which may not have merged / have mismatched with the standard algorithm
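To show what a Modulus 11 check on such an identifier involves, here is a sketch under loudly stated assumptions. The slide does not give the weighting or say whether the 9 digits include the check; this example assumes a 7-digit number followed by the 2-digit check (9 digits in all), uses ISSN-style weights 8 down to 2, and writes the remainder as two digits (00 to 10). The real SC-ID calculation may differ on every one of these points.

```python
import re

# Assumed weights, borrowed from the ISSN check-digit scheme.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2)


def check_value(body):
    """Return the assumed 2-digit Modulus 11 check for a 7-digit string."""
    total = sum(weight * int(digit) for weight, digit in zip(WEIGHTS, body))
    return f"{total % 11:02d}"


def make_sc_id(number):
    """Format a hypothetical SC-ID: 'SC' + 7-digit number + 2-digit check."""
    if not 0 <= number <= 9_999_999:
        raise ValueError("number must fit in 7 digits")
    body = f"{number:07d}"
    return "SC" + body + check_value(body)


def is_valid_sc_id(sc_id):
    """Check the shape of the identifier and recompute its check value."""
    match = re.fullmatch(r"SC(\d{7})(\d{2})", sc_id)
    return bool(match) and check_value(match.group(1)) == match.group(2)
```

A check of this kind lets a mistyped or transposed identifier be rejected before it links a record to the wrong set.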
14
SUNCAT: Data quality
Data conversion has shown up issues in data quality – varying standards both in bibliographic and holdings records
Matching algorithm does not match all bibliographic records for the same title successfully, due to paucity of data
Results in some duplication of records on the database
SUNCAT resolving this by:
  Improving the matching algorithm
  SUNCAT team matching / unmatching records to appropriate sets
  Asking contributing libraries to upgrade their preferred and unique records
15
SUNCAT: Developments in progress
Librarians’ Interface
Allows librarians from contributing libraries to:
  Download records
  Access reports regarding non-matched records
  Notify the SUNCAT team of mismatches or records that should have matched into a set
  Verify unique records
  Match poor quality records to the appropriate set, chosen from a pool of records
  Customise reports of notifications of changes to the preferred record
16
SUNCAT: developments in progress
Download from the web
Allows download for those libraries who do not use Z39.50
17
AIMSS: Automating Ingest of Metadata on Serial Subscriptions (1)
Based on work carried out in Phase 1
Used for electronic journals only
ONIX for Serials formats
Partnership with Serials Solutions
Funded by JISC PALS Metadata & Interoperability Projects 2
Autumn 2005 to Summer 2006
Transmitting serials information from publishers/aggregators to participants in chain

One of the most important issues for SUNCAT is providing up-to-date, accurate information about electronic journal holdings from our contributing libraries. This is of course also an issue for libraries in general, especially with reference to the content of large packages, which can be very volatile. During Phase 1 of SUNCAT we looked at the ONIX for Serials format, which is an internationally developed, extensible format for the transmission of data about serials. Fred Guy, the SUNCAT Project Manager, became involved with the NISO/EDItEUR Joint Working Party on the development of the ONIX for Serials formats. Following on from this initial work, we formed a project partnership with a large supplier of online serials information to the UK library community, Serials Solutions, to look at improving this transfer of serials data. We received funding from JISC under the Publishers and Library/Learning Solutions Projects for a project which commenced towards the end of last year and will run to Summer 2006. The main aim of AIMSS was to look at the feasibility of using the ONIX for Serials format for transmitting serials information from publishers or aggregators to participants in the chain, including libraries.
18
AIMSS: Automating Ingest of Metadata on Serial Subscriptions (2)
Use of ONIX to update SUNCAT with holdings information from participating libraries
Develop capability for EDINA to accept ONIX for Serials messages (Serials Online Holdings)
Map ONIX for Serials fields to MARC21 fields
Investigate how to upload the data received into SUNCAT records
Develop and disseminate expertise to libraries
Encourage wider use of ONIX for Serials format

We tested this by using data supplied in ONIX for Serials format from Serials Solutions to update records on SUNCAT. This involved developing the capability to receive ONIX for Serials messages using Serials Online Holdings, mapping the ONIX for Serials fields to MARC21 fields (the format used by SUNCAT), and uploading the data into SUNCAT to update our records. We then documented the process and developed a workflow which could be used on behalf of libraries unable to develop a local capacity to accept and process ONIX for Serials messages.

The results were encouraging; the ONIX format was successfully used to receive messages from Serials Solutions, map that information onto bibliographic records from SUNCAT, and load those records back into SUNCAT with new holdings information. However, data quality, as ever, turned out to be an issue. The mapping of the message relied on the title and the ISSN being identical in both the bibliographic record and the ONIX message. As we all know, what constitutes the title may well be a subject for much debate, so the title in the ONIX message may differ from that in the bibliographic record it should match to. Coupled with that, not all of the titles in the ONIX message were taken from the 245; some were taken from the 222, which might well differ considerably from the 245 in the bibliographic record. Therefore, the match rate between the two formats was not as high as we would have liked. Nevertheless, AIMSS should prove a useful starting point for the continuing development of this standard.

The project should benefit SUNCAT and its user community by ensuring the online serials information held is accurate and up-to-date. It should also encourage the wider use of the ONIX for Serials format, acting as an incentive to publishers and Library Management Systems to use the format, again ultimately benefiting end-users with more up-to-date serials information. More information is available on the JISC website, which includes the final report; the latter is also available on the SUNCAT website.
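The matching problem described above can be illustrated with a short sketch. It is not the AIMSS code and does not parse real ONIX for Serials messages: the dictionaries, field names and normalisation rules are hypothetical, and the optional comparison against the 222 key title is shown only to make the 245/222 mismatch concrete.

```python
import re


def norm_issn(issn):
    """Reduce an ISSN to its eight characters, ignoring hyphen and case."""
    return re.sub(r"[^0-9X]", "", issn.upper())


def norm_title(title):
    """Lower-case a title and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def match_message(message, bib_records, use_key_title=False):
    """Return the first record whose ISSN and title both agree with the message.

    message     - dict with hypothetical keys 'issn' and 'title'
    bib_records - dicts with 'issn', 'title_245' and optional 'key_title_222'
    """
    wanted_issn = norm_issn(message["issn"])
    wanted_title = norm_title(message["title"])
    for bib in bib_records:
        if norm_issn(bib["issn"]) != wanted_issn:
            continue
        candidates = [bib["title_245"]]
        if use_key_title and bib.get("key_title_222"):
            candidates.append(bib["key_title_222"])
        if wanted_title in {norm_title(t) for t in candidates}:
            return bib
    return None
```

With a record whose 245 reads "Nature." and whose 222 reads "Nature (London)", a message carrying the key title fails to match on the 245 alone even though the ISSN agrees, which is the kind of miss reported in the notes.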
19
Maintaining SUNCAT
Regular updates from libraries to maintain currency of database
Improve matching to reduce instances of duplication
Increase number of libraries in SUNCAT
20
The future of SUNCAT
Stable service
Regularly updated
High quality records for downloading
More libraries
More unique titles
Improved geographic coverage
Different union views
Linking with related services
21
SUNCAT
SUNCAT service:
SUNCAT team:
Natasha Aburrow-Jones