NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services

1 NISO'S IOTA INITIATIVE: COMPLETENESS INDEX AND IMPROVING ELEMENT WEIGHTS Oliver Pesch EBSCO Information Services opesch@ebsco.com

2 Overview
- Premise for IOTA completeness score and element weights
- Proving the theory through real-life tests
- Using a statistical approach to determine weights
- Test results
- Conclusions
- Next steps for IOTA

3 The premise behind IOTA
- The Completeness Score is the measure of the “completeness” of a single OpenURL
- The Completeness Index is attributed to the content provider as an overall measure of the completeness of their OpenURLs

4 The premise behind IOTA
- The Completeness Score is calculated by “weighting” the elements provided in the OpenURL based on their importance in target links
- Some elements are more important than others and will have a higher weight
- The Completeness Score equals the sum of the weights of the elements found, divided by the maximum score possible

5 The premise behind IOTA
Simple example assuming equal element weights
Element | Description | Weight | This OpenURL
ATitle | Article title | 1 |
AuLast | Author’s last name | 1 |
Date | Date of publication | 1 |
ISSN | | 1 |
Issue | Issue number | 1 |
SPage | Start page | 1 |
Title | Journal Title | 1 |
Volume | Volume number | 1 |
TOTAL | | 8 |

6 The premise behind IOTA
Simple example assuming equal element weights (same table as slide 5, with the “This OpenURL” column filled in)
- This OpenURL: five of the eight elements are present, each scoring 1, for a total of 5
- Total Weights: 8
- Completeness Score = (Total for This OpenURL) / (Total Weights) = 5 / 8 = 0.625
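The calculation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not IOTA's implementation: the function name `completeness_score` and the particular five elements chosen for the example OpenURL are assumptions, since the transcript does not show which five elements were present.

```python
# Equal weights for the eight elements in the slide's example table.
WEIGHTS = {
    "ATitle": 1, "AuLast": 1, "Date": 1, "ISSN": 1,
    "Issue": 1, "SPage": 1, "Title": 1, "Volume": 1,
}


def completeness_score(present, weights=WEIGHTS):
    """Sum of the weights of the elements found, divided by the maximum possible."""
    found = sum(w for name, w in weights.items() if name in present)
    return found / sum(weights.values())


# Hypothetical OpenURL carrying five of the eight elements (assumed selection).
example = {"Date", "ISSN", "Issue", "SPage", "Volume"}
print(completeness_score(example))  # 0.625
```

Swapping in unequal weights only requires changing the `WEIGHTS` dictionary; the ratio is computed the same way.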

7 Determining the weights
Initial approach:
- Frequency of element occurrence in target link templates
- Combined with reasoning

8 Initial Weights
OpenURL data element | Description | Weight
ATitle | Article title | 1
AuLast | Author’s last name | 1
Date | Date of publication | 5
eISSN | Online ISSN | 3
ISSN | Print ISSN | 3
Issue | Issue number | 3
Jtitle | Journal Title | 1
Pmid | PubMed ID | 8
SPage | Start page | 3
Title | Journal Title | 1
Volume | Volume number | 3
DOI | Digital Object Identifier | 8

9 Initial Weights (table repeated from slide 8)
Note: Initial weights were somewhat subjective.

10 Initial Weights (table repeated from slide 8)
Note: Most link resolver knowledge bases can handle look-ups by either Print ISSN or Online ISSN (both are not needed).

11 Initial Weights (table repeated from slide 8)
Note: Most link resolvers will enhance identifiers like PubMed ID and DOI; therefore, having an identifier is like having all metadata elements.
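One way the two notes on slides 10 and 11 could be folded into a score is sketched below. The slides give DOI and Pmid a weight of 8 but do not say how those weights enter the maximum score, so this sketch makes its own assumptions: either-or elements are grouped so that only one is needed, and the presence of an identifier is treated as a complete OpenURL. The names `GROUP_WEIGHTS`, `IDENTIFIERS` and `initial_score` are hypothetical.

```python
# Initial weights from slide 8, with either-or elements grouped (an assumption).
GROUP_WEIGHTS = {
    ("ATitle",): 1,
    ("AuLast",): 1,
    ("Date",): 5,
    ("ISSN", "eISSN"): 3,    # either ISSN is enough for a knowledge-base look-up
    ("Issue",): 3,
    ("SPage",): 3,
    ("Title", "Jtitle"): 1,  # either journal-title element
    ("Volume",): 3,
}
IDENTIFIERS = ("DOI", "Pmid")


def initial_score(present):
    """Score an OpenURL given the set of element names it carries."""
    # Assumption: an identifier lets the resolver recover all other metadata.
    if any(identifier in present for identifier in IDENTIFIERS):
        return 1.0
    found = sum(
        weight
        for names, weight in GROUP_WEIGHTS.items()
        if any(name in present for name in names)
    )
    return found / sum(GROUP_WEIGHTS.values())


print(initial_score({"DOI"}))
print(initial_score({"eISSN", "Volume", "Date"}))
```

Grouping keeps an OpenURL that carries both ISSNs from scoring higher than one that carries only one, which matches the note that both are not needed.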

12 Validating the Completeness Score
- Use real OpenURLs and a commercial link resolver (tested with LinkSource and 360-Link)
- Remove institutional holdings as a limit to resolution
- Process each OpenURL through the link resolver to determine “Success”
- Score one point for finding at least one full text target
- Calculate the completeness score for each OpenURL
- Look for a statistical correlation between the completeness score and the success score
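The last step above can be illustrated with a Pearson correlation between completeness scores and a 0/1 success score. The slides do not say which correlation statistic was used, so Pearson is an assumption here, and the five data points are invented purely to show the mechanics; they are not IOTA results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented illustration data: one completeness score and one success
# flag (1 = at least one full text target found) per OpenURL.
scores = [0.25, 0.5, 0.625, 0.875, 1.0]
success = [0, 0, 1, 1, 1]
print(round(pearson(scores, success), 2))
```

A coefficient near 1 would mean that higher completeness scores go together with successful resolution; a coefficient near 0 would mean the weights say little about success.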

13 Results: Original Weights
- Correlation coefficient: 0.43
- Tests conducted on a sample of 15,000 OpenURLs randomly pulled from the IOTA database

14 A Statistical Approach to Determining Element Weights
- Select a set of “perfect” OpenURLs that include all key data elements and resolve to full text
- Perform step-wise regression
- Test failure rates for each element by removing that element
- Use failure rates as the basis for weights
- Use the new weights to test for correlation between weights and success for a larger sample
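The element-removal test can be sketched as a loop that strips one element at a time from each “perfect” OpenURL and counts how often resolution then fails. The `resolves` callable stands in for a real link resolver, and `toy_resolver` is a made-up rule for illustration only; neither reflects how LinkSource or 360-Link actually behave.

```python
def failure_rates(perfect_openurls, resolves, elements):
    """Fraction of OpenURLs that stop resolving when each element is removed."""
    rates = {}
    for element in elements:
        failures = 0
        for openurl in perfect_openurls:
            # Copy the OpenURL without the element under test.
            stripped = {k: v for k, v in openurl.items() if k != element}
            if not resolves(stripped):
                failures += 1
        rates[element] = failures / len(perfect_openurls)
    return rates


def toy_resolver(openurl):
    """Hypothetical stand-in: succeeds only if volume and start page are present."""
    return "Volume" in openurl and "SPage" in openurl


sample = [
    {"ATitle": "A", "Volume": "1", "SPage": "10"},
    {"ATitle": "B", "Volume": "2", "SPage": "20"},
]
rates = failure_rates(sample, toy_resolver, ["ATitle", "Volume", "SPage"])
print(rates)
```

In a real test the resolver call would be a request to the link resolver with institutional holdings removed as a limit, as described on slide 12.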

15 Failure Rates from 1,500 OpenURL test sample
Element removed from the OpenURL | Description | Failure Percentage
ATitle | Article title | 0.74%
AuLast | Author’s last name | 0.07%
Date | Date of publication | 0.4%
ISSN | ISSN (either online or print ISSN) | 22.02%
Issue | Issue number | 20.27%
SPage | Start page | 33.27%
Title | Journal Title (either Title or Jtitle) | 0.61%
Volume | Volume number | 74.14%
- Volume is most critical
- Author’s last name is least important
- Date is surprisingly low

16 Calculated Element Weights
Element | Description | Weight*
ATitle | Article title | 1.87
AuLast | Author’s last name | 0.83
Date | Date of publication | 1.61
ISSN | ISSN (either online or print ISSN) | 3.34
Issue | Issue number | 3.31
SPage | Start page | 3.52
Title | Journal Title (either Title or Jtitle) | 1.78
Volume | Volume number | 3.87
*Element weight calculation: log10(failure rate per 10,000 OpenURLs)
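The footnote formula can be checked against the failure percentages on slide 15. The sketch below converts each percentage to failures per 10,000 OpenURLs and takes the base-10 logarithm. The results agree with the published weights to within about 0.02; the small differences for AuLast, Date and Title presumably come from rounding in the published percentages, which is an assumption rather than something the slides state.

```python
from math import log10

# Failure percentages from slide 15.
FAILURE_PERCENT = {
    "ATitle": 0.74, "AuLast": 0.07, "Date": 0.4, "ISSN": 22.02,
    "Issue": 20.27, "SPage": 33.27, "Title": 0.61, "Volume": 74.14,
}

# Weights as published on slide 16, used only for comparison.
PUBLISHED = {
    "ATitle": 1.87, "AuLast": 0.83, "Date": 1.61, "ISSN": 3.34,
    "Issue": 3.31, "SPage": 3.52, "Title": 1.78, "Volume": 3.87,
}


def element_weight(failure_percent):
    """log10 of the failure rate expressed per 10,000 OpenURLs."""
    per_10k = failure_percent * 100  # 1% equals 100 failures per 10,000
    return log10(per_10k)


for name, percent in FAILURE_PERCENT.items():
    computed = element_weight(percent)
    print(f"{name}: computed {computed:.2f}, published {PUBLISHED[name]}")
```

The logarithm compresses the range: Volume fails roughly a thousand times more often than AuLast, yet its weight is only about four to five times larger.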

17 Results: New Weights
- Correlation coefficient: 0.80
- Tests conducted on a sample of 15,000 OpenURLs randomly pulled from the IOTA database

18 Notes
Testing the same OpenURLs on 360-Link results in different numbers but consistent trends. Differences may be attributed to:
- Variations in metadata enhancement techniques
- Strictness in target link rules (e.g., required elements before link shows; tied to level of forgiveness of target)
- Link syntax used for target

19 Notes
- 96.3% of OpenURLs in the test were able to populate a full text target or credible ILL form
- Perception of a high failure rate of OpenURL may be attributed to library holdings and user expectations
- Suggestion: set link text to control expectations
  - Link to full text (for items in the online collection)
  - Check library collection (for things in print collection)
  - Request from library (for everything else)
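The link-text suggestion amounts to a three-way choice based on holdings. A minimal sketch follows; the function name `link_text` and its two boolean parameters are hypothetical, and a real resolver would derive them from its knowledge base.

```python
def link_text(in_online_collection, in_print_collection):
    """Pick link wording that tells the user what to expect before clicking."""
    if in_online_collection:
        return "Link to full text"
    if in_print_collection:
        return "Check library collection"
    return "Request from library"


print(link_text(True, False))
print(link_text(False, True))
print(link_text(False, False))
```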

20 Conclusions
- Step-wise regression approach to element weights works
- Completeness Index scores can be correlated to actual OpenURL “success”
- KB and resolver technology influence results and prevent a universal set of element weights
- The Completeness Index is a mechanism individual link resolver vendors can use to provide metrics to help improve their service quality

21 Other takeaways
Several factors are involved in perceived “link failure”:
1. Bad or missing metadata in the OpenURL link
2. Inaccurate holdings data within the resolver’s knowledge base
3. Flexibility of syntax to the target, e.g., target supports at least two of: OpenURL syntax, DOI link, proprietary link structure
4. Flexibility of resolution logic at the target, i.e., target finds a way to create a link using available data when some data is missing or wrong
5. User expectations, e.g., link resolver provided a link to OPAC or ILL form, but user was expecting full text
- IOTA focused on (1)
- KBART working on (2)
- Education of content providers could address (4)
- Displaying OpenURL button only if full text available could address (5)

22 What’s next for IOTA
- Continue offering public access to reports on element frequency
- Publish technical report on work to date
- Publish recommended practice for calculation and use of completeness scores for link quality assessment by link resolver vendors
- Continue work as a NISO standing committee for at least one more year

