Published by Samuel Austin. Modified over 9 years ago.
1
Open Past: Digital Projects from Government Libraries Finance Canada Statistics Canada Library of Parliament June 1, 2012
2
CLA Conference 2012 “Share your thoughts with fellow delegates and CLA members while attending the conference. The Twitter hashtag is #CLAOTT2012, or you can blog or "Facebook" from CLA 2012 in Ottawa. Go to the CLA website, http://www.cla.ca/conference/2012, for the CLA from Away links.”
3
Overview Introductions Finance Canada Statistics Canada Library of Parliament Questions
4
Finance Canada Digitizing the Federal Budget Eileen Bays-Coutts Iona Henderson June 1, 2012
5
Library Digitization Goals To increase Web access to and discoverability of federal budget publications To address service delivery issues Pilot: To assess digitization, repository, and metadata requirements.
6
Pilot Phase March 2010, digitized the 1952 to 1994 Speech, Plan, and Budget in Brief publications. Used in-house photocopier and casual staff. Publications scanned to PDF and files optimized using Adobe Acrobat Pro OCR and tagging processes.
7
Pilot Continued Sample of OCR coding errors underlying PDFs: I am honoured, Madam Speaker, to have the opportunity to present to Parliament the first b 6' dget of this new decade. It is a b U dget which sets new directions for the economy ~ directions which willensure both energy security and economic securit ' y for Canadians in the years ahead. It would b ~ no service to this House, nor to Q anadians, to deny that there is a deeply troubling air of uncertainty and anxiety around the world and, I am sure, in the hearts and minds of Canadians; we have inherited many difficulties from the decade of the 70s. But I t would be just as wrong to deny that the decade of the 80s provides extraordinary oppo. rtunities for Canada and Canadians.
8
Pilot Continued Results: Low cost Crawlable and searchable files 3% to 5% OCR error rate. Conclusion – error rate unacceptable.
9
Project Phase Goal to produce CLF2 compliant, 99.5% error-free OCR text Work competitively outsourced in 2010/11 to Terra Reproductions Same scope as Pilot phase.
10
Project Continued Full specs were provided to the company, including generic metadata; metadata to be enhanced later. Results: error rate of 0.5% or lower, but some gaps were discovered.
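The slides do not say how the OCR error rate was measured. One common approach, shown here as a minimal sketch and not as the presenters' method, is a character error rate: the edit distance between a hand-corrected reference passage and the OCR output, divided by the length of the reference. The function names and the `max_error_rate` parameter are hypothetical; the 0.005 default is only an assumed reading of the "99.5% error-free" goal.

```python
def edit_distance(reference: str, ocr_text: str) -> int:
    """Levenshtein distance: minimum single-character edits between two strings."""
    previous = list(range(len(ocr_text) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, ocr_char in enumerate(ocr_text, start=1):
            cost = 0 if ref_char == ocr_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the OCR text
                current[j - 1] + 1,      # extra character in the OCR text
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def char_error_rate(reference: str, ocr_text: str) -> float:
    """Edit distance as a fraction of the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, ocr_text) / len(reference)


def meets_target(rate: float, max_error_rate: float = 0.005) -> bool:
    """True if the measured rate is at or below the assumed 0.5% ceiling."""
    return rate <= max_error_rate


# A word from the pilot sample: "budget" was recognized as "b U dget".
rate = char_error_rate("budget", "b U dget")
print(rate)                # 0.5 for this single word
print(meets_target(rate))  # False
```

In practice the rate would be computed over whole pages rather than single words, so a few bad words are averaged against the surrounding clean text.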
11
Getting to the Web Add: 1968 to 1994 Enhance user experience 2007 to 2012 budget.gc.ca 1995 to 2006 fin.gc.ca
12
Inspiration
13
Getting to the Web Continued Additional metadata added to files Prime Minister Finance Minister Parliament number Political party Became our filtering criteria + the year
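Filtering on these metadata fields can be illustrated with a short sketch. This is not the site's implementation; the record type, field names, and sample values below are all hypothetical placeholders chosen only to mirror the criteria listed on the slide.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BudgetDocument:
    # Hypothetical record mirroring the filtering criteria on the slide.
    year: int
    prime_minister: str
    finance_minister: str
    parliament: int
    party: str


def filter_documents(documents, **criteria):
    """Return the documents whose fields equal every given criterion."""
    known = {f.name for f in fields(BudgetDocument)}
    unknown = set(criteria) - known
    if unknown:
        raise KeyError(f"unknown filter field(s): {sorted(unknown)}")
    return [
        doc for doc in documents
        if all(getattr(doc, name) == value for name, value in criteria.items())
    ]


# Placeholder data, not real budget metadata.
documents = [
    BudgetDocument(1901, "PM A", "Minister X", 1, "Party One"),
    BudgetDocument(1902, "PM A", "Minister Y", 1, "Party One"),
    BudgetDocument(1903, "PM B", "Minister Z", 2, "Party Two"),
]

matches = filter_documents(documents, prime_minister="PM A", parliament=1)
print([doc.year for doc in matches])  # [1901, 1902]
```

Combining criteria with a logical AND keeps each added filter narrowing the result list, which is the usual behaviour for faceted browsing.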
14
Getting to the Web Continued jQuery used for sorting functionality Some browser issues with display, so custom style sheets developed Clean up of 1995 – 1999 PDFs on FIN
15
Final Product! www.budget.gc.ca/pdfarch/index-eng.html
16
Going Forward Fill gaps in collections Enhance metadata Improve layout and functionality Add additional PDF documents from years 1994 – 2011 Improve accessibility of PDFs
17
Thank you Eileen.Bays-Coutts@fin.gc.ca Iona.Henderson@fin.gc.ca http://www.budget.gc.ca/pdfarch/index-eng.html