Crowd-sourcing the creation of “articles” within the Biodiversity Heritage Library. Bianca Crowley, Trish Rose-Sandler


1 Crowd-sourcing the creation of “articles” within the Biodiversity Heritage Library. Bianca Crowley (crowleyb@si.edu), Trish Rose-Sandler (trish.rose-sandler@mobot.org)

2 The BHL is…
A consortium of 13 natural history and botanical libraries and research institutions
An open-access digital library for legacy biodiversity literature
An open data repository of taxonomic names and bibliographic information
An increasingly global effort
BHL LITA 2011

3 Problem: Books vs. Articles
Librarians manage books; users need articles.

4 Solution: “Article-ization”
Creating articles manually, through the help of our users: the BHL PDF Generator
Creating articles through automated means: BioStor, http://biostor.org/issn/0006-324X
Page, R. (2011). Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library. BMC Bioinformatics, 12(187). Retrieved from http://www.biomedcentral.com/1471-2105/12/187

5 [image-only slide]

6 Create-your-own PDF

7 CiteBank today: http://citebank.org

8 What is an “article” anyway?

9 The Good, the Bad, the Ugly

10 The Good, the Bad, the Ugly

11 The Good, the Bad, the Ugly

12 Questions for Data Analysis
What is the quality, or accuracy, of user-provided metadata?
What kinds of content are users creating?
How can we improve the PDF Generator interface?

13 Stats, Jan 2010–Apr 2011
Approx. 60,000 PDFs created with the PDF Generator
40% of those (approx. 24,000) were ingested into CiteBank (PDFs without user-contributed metadata were excluded)
5 reviewers analyzed 945 PDFs (approx. 3.9% of the 24,000+ articles going into CiteBank)
**Thanks to reviewers Gilbert Borrego, Grace Costantino, and Sue Graves from the Smithsonian Institution
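The sampling fraction above can be verified with a quick calculation (a sketch using the approximate counts from this slide):

```python
# Share of the CiteBank-bound articles that the reviewers sampled.
reviewed = 945
ingested = 24_000  # approximate number of PDFs ingested into CiteBank

print(round(100 * reviewed / ingested, 1))  # 3.9 (percent)
```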

14 Methodological approach
Quantitative: a numerical rating system
Rated titles, authors, and beginning/ending pages
An article’s “findability” within CiteBank search often determined how it was rated

15 Ratings System: Title
1 = has all characters in the title, letter for letter
2 = does not have all characters in the title letter for letter, but is still findable in CiteBank search
3 = does not have all characters in the title letter for letter and is NOT findable via CiteBank search

16 Ratings System: Author
1 = has all characters in the author(s)’ last name(s), letter for letter
2 = has at least one author’s last name spelled correctly
3 = has no authors, or none of the authors’ last names are spelled correctly

17 Ratings System: Article beginning & ending pages
1 = has all text pages for an article, from start to end
2 = a subset of pages from a larger article
3 = a set of pages where the intellectual content has been compromised
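Taken together, the three scales above give each reviewed PDF a 1–3 score per field (1 = best), which can then be averaged across the sample. A minimal sketch; the sample ratings and field names are hypothetical, not BHL’s actual review data:

```python
from statistics import mean

# Each reviewed PDF gets a 1-3 rating per metadata field, per the
# scales on the previous slides. Sample data is hypothetical.
ratings = [
    {"title": 1, "author": 1, "pages": 2},
    {"title": 2, "author": 1, "pages": 1},
    {"title": 3, "author": 2, "pages": 1},
]

def field_average(field):
    """Average rating for one metadata field across all reviewed PDFs."""
    return mean(r[field] for r in ratings)

# Overall average combines the three field averages, as on the results slide.
overall = mean(field_average(f) for f in ("title", "author", "pages"))
print(round(overall, 2))  # 1.56
```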

18 Analysis steps

19 Results
Title average: 1.68
Author(s) average: 1.33
Beg/End page average: 1.41
Title & Author average: 1.50
Overall average (combines the first three above): 1.47
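As a sanity check, the combined figures on this slide follow directly from the three reported field averages:

```python
from statistics import mean

# Per-field averages reported on the results slide.
title_avg, author_avg, pages_avg = 1.68, 1.33, 1.41

# "Overall average (combines first 3 above)" reproduces the reported 1.47.
print(round(mean([title_avg, author_avg, pages_avg]), 2))  # 1.47
# "Title & Author average" reproduces the reported 1.50.
print(round(mean([title_avg, author_avg]), 2))  # 1.5
```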

20 What did we learn?
Ratings were better than we expected
Many users took the time to create decent metadata
“Good enough” is not great, but is still “findable”

21 But of course… there’s always room for improvement. Other factors: BHL-Australia’s new portal, http://bhl.ala.org.au/

22 Changes we made to the UI so far
Asking users if they want to contribute their article to CiteBank
Making the article title a required field and validating that it is at least 2 characters
A review button for users to review page selections and metadata (inspired by BHL-AUS)
Reduced text and more intuitive graphics (inspired by BHL-AUS)
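The title-validation rule above (required field, at least 2 characters) could be sketched as follows; this is an illustrative helper with a made-up sample title, not BHL’s actual code:

```python
def valid_article_title(title):
    """Require a non-empty article title of at least 2 characters,
    ignoring surrounding whitespace (per the UI rule above)."""
    return len(title.strip()) >= 2

print(valid_article_title("On a new genus of beetles"))  # True
print(valid_article_title(" A "))                        # False: only 1 character
```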

23 Brief survey of proposed changes
Overwhelmingly positive response to the proposed changes
But of course… there’s always room for improvement

24 Success Factors
Monitor the creation of the metadata to look at user behavior and patterns
Engage with your users
Incentivize your users

25 @BioDivLibrary
/pages/Biodiversity-Heritage-Library/63547246565
/photos/biodivlibrary/sets/
/group/biodiversity-heritage-library
Bianca Crowley (crowleyb@si.edu), Trish Rose-Sandler (trish.rose-sandler@mobot.org)
http://biodiversitylibrary.org

