TELplus WP 1 “Making searchable digitised images via OCR” TELplus Kick-off Meeting Tallinn, 15./16. October 2007 Max Kaiser / Joachim Korb Austrian National.

1 TELplus WP 1 “Making searchable digitised images via OCR” TELplus Kick-off Meeting Tallinn, 15./16. October 2007 Max Kaiser / Joachim Korb Austrian National Library max.kaiser@onb.ac.at / joachim.korb@onb.ac.at www.onb.ac.at/

2 WP1: Main Objectives Provide a huge amount of full text for The European Library (and the future European Digital Library) in a very short time Lay the basis for future efforts with full text – overview of OCRable material – best practices – identification of priorities

3 WP1: Background Main shortcoming of The European Library today: lack of access to the full text of the material WP1 will be an important step to tackle this problem Will enable TEL to make a bigger contribution to the European Digital Library targets – Commission Target: 2 million objects accessible by 2008, at least 8 million objects by 2010...

4 WP1: Main Goals Full text access to more than 20 million pages Facing common challenges in OCR Establishing best practices for OCR Providing full text access through The European Library

5 Partners’ OCR contributions (examples)
Spain: 5 million pages of historical newspapers and magazines
Iceland: 1.0 – 1.6 million pages of books and newspapers
Poland: 525,000 pages of journals, 1918 – 1939
Austria: 400,000 pages of governmental publications
Sweden: 320,000 pages of travel literature (books/manuscripts)

6 WP 1 Tasks organisation and research 1.1 Survey of availability of digitised images 1.2 Survey of existing OCR approaches 1.3 Identification of concrete materials for OCR. OCR Specifications, implementation plans and tenders Production 1.4 Carrying out OCR and making full texts available via partners’ digital library environments 1.5 Provision of access to newly OCRed material through The European Library

7 organisation and research

8 Task 1.1: Survey of availability of digitised images for OCR - who and when Leader: Austrian National Library 01. Oct. – 31. Dec. 2007 31. Dec. 2007: Survey to be delivered Participants: All content partners in WP 1 D1.1 – Survey of availability of digitized images for OCR

9 Task 1.1: Survey of availability of digitised images for OCR Which collections at partner libraries? What do partners plan to OCR? Which priorities in partners’ OCR plans? How will full-text collections be made accessible? Results provide basis for implementation plans (Task 1.3)

10 Task 1.2: Survey of existing OCR approaches - who and when Leader: Austrian National Library 01. Jan. – 31. July 2008 31. July 2008: Survey to be delivered Participants: All content partners in WP 1 D1.2 – A survey of existing OCR practices and recommendations for more efficient work

11 Task 1.2: Survey of existing OCR approaches Which experiences and approaches do partners have? Which digital library environments? How will access to full text be provided? Which challenges? How can they be mastered? Results will support practical work (Task 1.4) Results provide perspectives for future work to build on

12 Task 1.3: Identification of concrete materials for OCR; OCR specifications; implementation plans and tenders - who and when Leader: Austrian National Library 01. Dec. 2007 – 28. Feb. 2008 28. Feb. 2008: specifications, implementation plans and tenders delivered; concrete OCR materials identified Participants: All content partners in WP 1 This Task commences before surveys are evaluated D1.3 – Package of specifications and implementation plans M1.1 – Identification of concrete materials for OCR against an agreed budget M1.2 – Specifications and implementation plans for full text conversion available

13 Task 1.3: Identification of concrete materials for OCR; OCR specifications; implementation plans and tenders input from Tasks 1.1 and 1.2 (surveys) priority lists of material will be drawn up for each content provider OCR specifications and implementation plans will be set up tenders will be prepared and published (for out-sourced work) Result will give framework for practical work proper tendering procedures will be observed as budget payment depends on them

14 production

15 Task 1.4: Carrying out OCR and making full texts available via partners’ Digital Library environments - who and when Leader: Austrian National Library 01. Jan. 2008 – 31. Dec. 2009 30. Sept. 2008: 10 million OCRed pages, 1st progress report 30. Sept. 2009: 20 million OCRed pages, 2nd progress report Participants: All content partners in WP 1 This Task commences before 1.2 and 1.3 end D1.4 – First set of consolidated OCR progress reports D1.5 – Second set of consolidated OCR progress reports

16 Task 1.4: Carrying out OCR and making full texts available via partners’ Digital Library environments Tendering Carrying out OCR implementation plans Full text conversion Quality control Indexing Providing access through partners’ Digital Library environment Providing material for WP 3 Task 3.1 and subtasks Providing feedback for Task 1.2

17 Task 1.5 - who and when Leader: National Library of the Netherlands [TEL office] 1. March 2008 – 31. Dec. 2009 31. Dec. 2009: access to all newly OCRed material provided Participants: All content partners in WP 1 Task commences before Tasks 1.2 and 1.4 end D1.6 – Provision of access to newly OCR-ed material through TEL M1.3 – First set of full text material available M1.4 – Second set of full text material available M1.5 – Full texts accessible via The European Library

18 Task 1.5: Provision of access to newly OCRed material through The European Library Provision of access to all OCRed material from this WP through TEL All content partners will have to provide full text and indexes for this: first half Sept. 2008, second half Sept. 2009

19 TIMELINE
Task 1.1: 01. Oct – 31. Dec (D1.1)
Task 1.2: 01. Jan – 31. July (D1.2)
Task 1.3: 01. Dec – 28. Feb (M1.1, M1.2, D1.3)
Task 1.4: 01. Jan – 31. Dec (D1.4, D1.5)
Task 1.5: 01. Mar – 31. Dec (M1.3, M1.4, M1.5, D1.6)

20 Leading Partners in WP 1 Austrian National Library: Tasks 1.1 through 1.4 National Library of the Netherlands [TEL office]: Task 1.5 French National Library: Workshop in January 2008 - Paris

21 Contributing Partners in WP 1
National Library of Estonia: 26 person months
National and University Library of Slovenia: 6 person months
National Library of the Czech Republic: 19 person months
Austrian National Library: 20 person months
National and University Library of Iceland: 20 person months
National Library of Latvia: 24 person months
Martynas Mažvydas National Library of Lithuania: 28 person months
National Széchényi Library of Hungary: 7 person months
French National Library: 20 person months
National Library of Norway: 39 person months
National Library of Spain: 10 person months
Slovak National Library: 20 person months
National Library of Sweden: 5 person months
The National Library of Poland: 40 person months

22 TELplus Work Package 1 Thank You! Questions? Max Kaiser / Joachim Korb Austrian National Library max.kaiser@onb.ac.at / joachim.korb@onb.ac.at www.onb.ac.at/

