Published by Elijah Esmond Richardson. Modified over 8 years ago.
1
NACO Lite? Re-imagining Harvard’s local name authority workflow as an identity management workflow using ISNI
PCC Policy Committee Meeting, November 5, 2015
Michelle Durocher, Andrew MacEwan
2
Inspiration from PCC Mission Statement
From the Strategic Directions report:
– Enabling the extension, iterative enhancement, reuse, and open exchange of metadata
– Encouraging work at the network level, i.e. sharing
– Leveraging emerging technologies, such as linked data, i.e. a focus on unique identifiers
– Facilitating the automated generation of metadata
3
Goals in our workflow pilot are:
– Cease local name authority work in favor of a workflow that shares data outside of Harvard
– Recognizing the value of intellectual work performed even when output is not NACO compliant, develop an efficient workflow for sharing it that does not require duplicated work in multiple systems
– Experiment with publisher-supplied author metadata pre-publication, which can be rich and well suited to identity management
– Establish a metadata lifecycle that is valued by the community and promotes iterative enhancement
4
How would one define NACO Lite?
– Lowering the threshold for participation in NACO
– While the term NACO Lite may not be exactly the right name, it’s a starting point for conversation
– The discussion quickly focused on identity management, rather than name forms, as being at the heart of NACO Lite
– Rules for establishing name forms can be time-consuming to master and to apply; they can also rely on language or cultural knowledge
5
SD3: Provide leadership for the shift in authority control from an approach primarily based on creating text strings to one focused on managing identities and entities
A pilot group of catalogers self-identified to explore this concept, to see if they could define a new workflow utilizing ISNI as the platform to share the value of the work currently performed as local authority work
6
Catalogers examine the steps, sequence of activities and the available tools
– What platform was useful for which purposes?
– Is there functionality for automating aspects of the work?
7
Catalogers received training from an ISNI staff member and from the Quality Team at the BL
– They were introduced to two interfaces available to ISNI members: a web form with Search/Create functionality, and the WinIBW client, a Windows-based command-line interface with powerful troubleshooting functionality for resolving splits, merges, etc.
– They are experimenting with workflow scenarios and thinking about proposals for tool development
8
Early take-aways: coding
If data contributed to ISNI are loaded into the Name Authority file as NACO Lite, there should be coded values that identify the status of the data, as well as the source and level of confidence of the data feed
Possible precedents to consider are:
1. Using an Auth Status code in the fixed field
2. Authorizing the 883 in authority records as they are in Bib records for machine-generated
9
Early conundrums:
– Catalogers still rely on text elements to identify the entity they are seeking, so strings still serve a helpful role for humans
– Recording variant name forms is vital to differentiation by both humans and machines, so that duplicate identifiers are not created for the same entity
– Our current discovery systems don’t yet use identifiers as a replacement for strings and therefore still rely on differentiated strings for meaningful user displays
10
Early take-aways: data exchange
– The infrastructure for data exchange is not yet in place to support a production workflow in which work done in one platform can be relied on to flow between global platforms (NAF, ISNI, VIAF) on a clear timeline
– Real-time data exchange will be necessary
– Interim workarounds will likely be part of our temporary workflow solution to allow us to experiment while being efficient
11
ISNI: how it works PCC Policy Committee, November 2015 Andrew MacEwan, British Library & Michelle Durocher, Harvard
12
[Diagram: ISNI as a cross-domain, bridging-domains identifier. Domains shown: Libraries; Text Rights; Music Rights; Trade Sources; Encyclopaedias; Researchers & Professional; Granting organisations; Professional Societies; Article databases; Theses databases; Archives and Museums]
13
Not for profit network
[Diagram: member organisations labelled with their roles: Assignment Agency, Registration Agency, Data contributor, Hosts Web page, Quality Team, Treasurer, Secretary, Chair, Former chair, Legal, Administration]
– 60,000 member libraries
– 46 national libraries
– 52 performer rights management organisations
– 89 text rights management organisations
– 229 music rights management organisations
14
Members; Registration Agencies (RAGs); Data Contributors; Partnerships; Associate Quality Team
15
Sustainable Quality Management
– ISNI Database: harvested, batch loaded; online contributions
– Algorithms, notifications, data fixing
– Sampling, data policy
– Enrichment, correction, curation
– Crowd sourcing
– Members and Registration Agencies
16
Provisional: Unassigned 9,287,278
Provisional: Possible 700,815
Assigned 8.69 million

Assigned ISNIs, November 2014
VIAF + non VIAF sources                     4,870,099
3+ VIAF sources                               428,988
2+ sources (not VIAF)                         315,915
Unique name                                 2,735,449
Trusted single source (JISC, BOEK, RING)      342,231
Total                                       8,692,683

Authoritative, Unique, Trustful, Persistent
8.24 million persons; 446,258 organisations
[Diagram labels: + % confidence, - % confidence; ISNI data sources; Searching; Maintenance; Anomalies; VIAF; QT]
17
ISNI Matching
Name; Title; Partial title; Rare title word; Date; Publisher; Personal affiliation; Organisation affiliation; ISBN, ISWC, ISAN, DOI + Other name identifier e.g. IPI, VIAF, IPD; Instrument; Linked entities; Dewey classification
– Scores are collected from each judge (ice skating style)
– Lowered for common surnames and common titles
– Score >.85 = match
– Score >.6 but <.85 = possible match
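The two thresholds on this slide can be illustrated with a minimal Python sketch. The names `MatchOutcome` and `classify_score` are hypothetical, and the treatment of a score of exactly .85, or of .6 and below, is an assumption because the slide does not specify it. How the individual judges' scores are combined into one score is not described on the slide and is not modelled here.

```python
from enum import Enum


class MatchOutcome(Enum):
    MATCH = "match"
    POSSIBLE_MATCH = "possible match"
    NO_MATCH = "no match"  # assumed outcome for scores of .6 or below


MATCH_THRESHOLD = 0.85     # slide: Score > .85 = match
POSSIBLE_THRESHOLD = 0.6   # slide: Score > .6 but < .85 = possible match


def classify_score(score: float) -> MatchOutcome:
    """Map an already-combined matching score to an outcome."""
    if score > MATCH_THRESHOLD:
        return MatchOutcome.MATCH
    # Assumption: a score of exactly .85 is treated as a possible match.
    if score > POSSIBLE_THRESHOLD:
        return MatchOutcome.POSSIBLE_MATCH
    return MatchOutcome.NO_MATCH


for example in (0.92, 0.70, 0.40):
    print(example, classify_score(example).value)
```

Under these assumptions, possible matches are the records that a member or the Quality Team would resolve by hand.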
18
Procedures for maximizing assignment
– Refinement of matching algorithms, e.g. introduced rare title word; now ignoring date of birth 1900
– Re-import program: rematch with new rules; rematch after new data added
– ISNI Quality Team: data sampling assessing impact of single source; recommendations for program changes
– New criteria: assessing uncommon surname assignment; rules for online rich assignment
19
Maximizing assignment
– Enter a request record online (Web page or via API)
– Batch loaded records, passive method: Quality Team manual fixes; OCLC periodic re-match runs; matches from later batch loading & online activity
– Batch loaded records, active method: resolve possible matches found by the system; search the database for candidate records for merging; enrich a record with URLs to external sources such as author’s web pages, Wikipedia, IMDB, MusicBrainz, Discogs, etc.

Source   May 2012   % assigned   Oct 2014   % assigned
ALCS     41,523     63.86%       49,157     76.66%
PROL     2,205      35.24%       4,143      66.18%
PROQ     65,122     12.89%       243,481    48.19%
AUVLU    0          0%           1,716      48.28%
ICLA     0          0%           2,208      97.61%
20
Cluster movement
– VIAF re-clusters every month and makes duplicate clusters where one source has duplicates. The result can be cluster movement.
– ISNI has been monitoring VIAF cluster movement & making recommendations.
– Merges are valid cluster moves.
21
ISNI & VIAF XA Records
– VIAF includes XA records that act as “Police records”
– If an ISNI record has 2 VIAF Ids & an indication of a manual merge, the ISNI record gets XA status & will cause merges in VIAF
– If ISNI sends 2 records with the same name & an indication of manual split, it will cause a split in VIAF. A record that could go in either cluster will be admitted to neither
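The two XA rules can be restated as a small Python sketch. Everything here is hypothetical: `IsniRecord` is a simplified stand-in rather than the real ISNI record structure, the field names are invented for illustration, and the handling of more than two VIAF Ids, or of a split flag on only one of the two records, is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IsniRecord:
    """Hypothetical, simplified ISNI record used only for illustration."""
    name: str
    viaf_ids: List[str] = field(default_factory=list)
    manual_merge: bool = False  # stands in for "an indication of a manual merge"
    manual_split: bool = False  # stands in for "an indication of manual split"


def gets_xa_status(record: IsniRecord) -> bool:
    # Slide: 2 VIAF Ids plus an indication of a manual merge.
    # Assumption: more than 2 VIAF Ids is treated the same way.
    return len(record.viaf_ids) >= 2 and record.manual_merge


def signals_split(first: IsniRecord, second: IsniRecord) -> bool:
    # Slide: 2 records with the same name plus an indication of manual split.
    # Assumption: the indication on either record is enough.
    same_name = first.name == second.name
    return same_name and (first.manual_split or second.manual_split)


merged = IsniRecord("Example, A.", ["viaf-1", "viaf-2"], manual_merge=True)
print(gets_xa_status(merged))
```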
22
ISNI – division of labour
Quality Team
– Samples data regularly
– Makes corrections at cluster level
– Merges, splits, error notifications
– Access to cataloguing client / macros
– Curates End User Input
– Associate Quality Teams: libraries with full editing rights
Members
– Creates, edits records
– Adds links, merges, acts upon error notifications
– Contribute own data to the ISNI hub
Registration Agencies
– As members providing services to 3rd parties
23
Adding a new record – Michel Calame Harvard University Library 2014-11-19
24
Adding a new record
25
New Organisation form
26
Adding your source to an existing record
27
3 Major Types of Link: to sources; to resource entities; to related identities
28
LINKS Among Sources

Source                          Assigned   Total      % assigned  Non VIAF links  VIAF links  All Links
American Musicological Society  7,455      11,673     63.87       12,507          13,575      26,082
British Library Theses          75,697     344,295    21.99       55,173          46,434      101,607
Digital Author Id, Netherlands  54,671     72,771     75.13       55,973          142,152     198,125
JISC Names Project              44,913     46,370     96.86       61,979          18,330      80,309
La Trobe University             2,255      3,552      63.49       2,456           2,259       4,715
Modern Languages Association    53         95         55.79       123             144         267
OCLC Theses                     1,935,509  2,041,297  94.82       1,199,161       3,547,911   4,747,072
ODIN ORCID & DataCite Interop   1,241      7,299      17.00       2,033           647         2,680
AuthorClaim and RePec           23,725     36,307     65.35       37,290          46,036      83,326
Proquest Theses                 244,062    505,222    48.31       116,848         54,437      171,285
Scholar Universe                325,946    630,215    51.72       378,068         167,773     545,841
Electronic tables of content    230,750    259,151    89.04       398,231         279,440     677,671
Researcher sources combined     2,557,156  3,538,345  72.27       2,319,842       4,319,138   6,638,980

6.6 million links from ISNI Researcher sources
– How does the world see a group of researchers?
– Where is a group of researchers linking?
– Which researchers are having impact?
Managing links is what ISNI is about
Linked Data: isni.org/isni/
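Reading each row of this table as Assigned, Total, % assigned, Non VIAF links, VIAF links and All Links, the columns can be cross-checked against each other. The following Python sketch does that for three rows; the function names are hypothetical, and the assumption that All Links is the sum of the two link columns is inferred from the figures rather than stated on the slide.

```python
# (source, assigned, total, non_viaf_links, viaf_links, all_links), as read from the slide
ROWS = [
    ("American Musicological Society", 7455, 11673, 12507, 13575, 26082),
    ("JISC Names Project", 44913, 46370, 61979, 18330, 80309),
    ("Modern Languages Association", 53, 95, 123, 144, 267),
]


def percent_assigned(assigned: int, total: int) -> float:
    """Share of a source's records that carry an assigned ISNI, in percent."""
    return round(100 * assigned / total, 2)


def links_consistent(non_viaf: int, viaf: int, all_links: int) -> bool:
    """Assumed relationship: All Links = Non VIAF links + VIAF links."""
    return non_viaf + viaf == all_links


for source, assigned, total, non_viaf, viaf, all_links in ROWS:
    print(source, percent_assigned(assigned, total),
          links_consistent(non_viaf, viaf, all_links))
```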
29
LINKS Among Identities: Co-author, pseudonym…; isMemberOf, isAffiliatedWith…; hasMember, hasEmployee…; hasUnit, supersedes, acquired, hosts, isPartneredWith…
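One way to picture links among identities is as typed relations between identifiers. The Python sketch below is purely illustrative: the class `IdentityLinks`, the placeholder identifiers, and the pairing of isMemberOf with hasMember as inverses are all assumptions, since the slide only lists the link names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumption: these two link names are inverses of each other.
INVERSES = {"isMemberOf": "hasMember", "hasMember": "isMemberOf"}


class IdentityLinks:
    """Hypothetical in-memory store of typed links between identities."""

    def __init__(self) -> None:
        self._links: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add(self, subject: str, relation: str, target: str) -> None:
        self._links[subject].append((relation, target))
        inverse = INVERSES.get(relation)
        if inverse is not None:
            # Record the assumed inverse so the link can be followed both ways.
            self._links[target].append((inverse, subject))

    def related(self, subject: str, relation: str) -> List[str]:
        return [t for r, t in self._links.get(subject, []) if r == relation]


links = IdentityLinks()
links.add("person-1", "isMemberOf", "org-1")  # placeholder identifiers, not real ISNIs
print(links.related("org-1", "hasMember"))
```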
30
LINKS to Resource Entities: resource links (ISBN, DOI, titles etc.)
31
Discussion and policy papers
Contents
1. ISO Standard – Definition of ISNI
2. Change of Name – General Principles
3. ISNI data – VIAF/Rights Management Agencies
4. Pseudonyms
5. Married/Maiden Names
6. Gender Reassignment
7. Policies on Maiden/Married Names and Gender Reassignment
32
OCLC Research Task Force on Organisations in ISNI
– Karen Smith-Yoshimura, OCLC Research (leader)
– Grace Agnew, Rutgers University
– Christopher Brown, JISC (UK) (CASRAI)
– Kate Byrne, University of New South Wales
– Matt Carruthers, University of Michigan
– Naun Chew, Cornell University
– Peter Fletcher, UCLA
– Janifer Gatenby, OCLC Leiden (ISNI Assignment Agency)
– Stephen Hearn, University of Minnesota
– Xiaoli Li, University of California, Davis
– Shi Liu, University of California, Irvine
– Marina Muilwijk, University of Utrecht
– Boaz Nadav-Manes, OCLC Leiden (ISNI Assignment Agency)
– Roderick Sadler, La Trobe University
– John Riemer, UCLA
– Jing Wang, Johns Hopkins University
– Glen Wiley, University of Miami
– Kayla Willey, Brigham Young University
With input from Andrew MacEwan, British Library and Anila Angjeli, Bibliothèque nationale de France
– Examined 13 use cases; producing sample records for each use case
– 23 recommendations for the system, for the ISNI-IA, for users
– Search guidelines for organisations to be produced
– Outreach document
33
Goals
– Enhanced interface from ORCID to ISNI to claim works data and IDs
– BL will help to “establish interoperability” between identifiers
– Using expertise with the ISNI matching processes
– Subcontract with ISNI Assignment Agency
– Target key data, e.g. EThOS and ETOC (theses & journal articles)
– ISNI organisation IDs links to ORCID
– Develop ORCID ID and ISNI researcher ID 1-1 links at scale
http://project-thor.eu/the-thor-mission/
[Diagram label: interoperation]
34
British Library role in ISNI
– Co-leader of the ISNI Quality Team
– Representing CENL on ISNI Board
– Data contributor: JISC Names, ODIN, British Library Sound Archive, ZETOC, EThOS, LC/NACO
– Registration Agency for the UK
– Research Project with OCLC Research on researcher clustering
35
In Conclusion…
– Central authority file of identifiers maintained as links
– Tools for creation & maintenance of links between data sources, resources and identities
– Automated and manual efforts work towards continuous growth and improvement in quality
– Rule-lite
– Policies developed as required to define more complex identity relationships, pseudonyms, corporate bodies
– Sustainable approach to linking authoritative data
36
For awareness: additional prongs of ISNI exploration at Harvard Library
1. Evaluate loading legacy local authority file as an ISNI data source (sampling, analysis, batch processing)
2. Create identifiers for Harvard entities (faculty, departments, students)
3. Test and provide feedback on model for corporate bodies/organizations coming soon in OCLC Research report.