IMAODBC, The Hague, 5-9 sept 2005

Presentation transcript:

INEbase history. Statistical books 1858-1990 available on the web
Antonio ARGÜESO, argueso@ine.es
IMAODBC, The Hague, 5-9 Sept 2005
Topic 3

INEbase history: a virtual library
The project: making INE's back catalogue of publications (approx. 1857-1997) available on the Internet.
Information is stored not as complete books but as hierarchically organised documents.
Search utilities are provided.
INEbase history: a new section of INEbase
This virtual library is offered neither as a section of "products and services" nor as a separate "virtual library", but as part of the output database, INEbase. It is reached through a link: History.

Project phases
Phase I, June 2004 - June 2005:
  110,000 pages scanned (77 yearbooks, 221 books of population censuses)
  Software development (100,000 euros)
  First 25 books catalogued and published (4 people: 1 manager, 3 grant holders)
Phase II, July 2005 - July 2006:
  180,000 pages scanned (vital statistics, agricultural and industrial censuses…)
  Software improvements (10,000 euros)
  150 books catalogued and published (8 people: 1 manager, 7 cataloguers)

The technical process in 3 steps
1. Scanning and OCR
Books are scanned in high-speed scanners. The output files are TIFF 600 ppi and TIFF 4 (300 ppi), the popular telefax format.
These files are then OCRed to obtain a final enriched PDF with two layers:
  the image of the page (first layer)
  the words recognised by the OCR (second layer)
=> These PDFs are page images but also allow text searching.
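The two-layer idea can be illustrated with a small sketch. It is not the INE software: the `ScannedPage` class, the `search_pages` function, the file names and the sample words are all hypothetical, chosen only to show how keeping the recognised words next to the page image makes an image-only scan searchable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScannedPage:
    """One scanned page: the image plus the words the OCR recognised."""
    book: str
    number: int
    image_file: str                                      # first layer: page image
    ocr_words: List[str] = field(default_factory=list)   # second layer: recognised text


def search_pages(pages, term):
    """Return the pages whose OCR layer contains the term (case-insensitive)."""
    needle = term.lower()
    return [p for p in pages if any(w.lower() == needle for w in p.ocr_words)]


# Hypothetical sample data, invented for illustration only.
pages = [
    ScannedPage("Anuario 1858", 12, "anuario1858_p012.tif",
                ["Población", "de", "España"]),
    ScannedPage("Anuario 1858", 13, "anuario1858_p013.tif",
                ["Nacimientos", "y", "defunciones"]),
]

hits = search_pages(pages, "población")
for page in hits:
    print(page.book, "page", page.number, "->", page.image_file)
```

The reader still sees the original page image; the search only consults the hidden text layer to decide which image to show.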

2. Cataloguing books into the system
Cataloguers create the hierarchical trees (the books' indices), and the final nodes (statistical tables) are each associated with a PDF.
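A minimal sketch of such a catalogue tree follows, assuming a simple in-memory structure. The `Node` class, the `tables` helper, the book title and the PDF file names are hypothetical; the slides do not describe how the real system stores its trees.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A node of a book's index; only final nodes carry a PDF."""
    title: str
    pdf: Optional[str] = None                  # set on final nodes (statistical tables)
    children: List["Node"] = field(default_factory=list)

    def is_table(self):
        return not self.children


def tables(node, path=()):
    """Yield (path, pdf) for every final node below the given node."""
    here = path + (node.title,)
    if node.is_table():
        yield " > ".join(here), node.pdf
    for child in node.children:
        yield from tables(child, here)


# Hypothetical book, invented for illustration only.
book = Node("Yearbook (hypothetical)", children=[
    Node("Population", children=[
        Node("Table 1. Population by province", pdf="yearbook_p012.pdf"),
        Node("Table 2. Births and deaths", pdf="yearbook_p013.pdf"),
    ]),
    Node("Agriculture", children=[
        Node("Table 3. Cultivated area", pdf="yearbook_p040.pdf"),
    ]),
])

index = list(tables(book))
for path, pdf in index:
    print(path, "->", pdf)
```

Storing the index as a tree rather than one file per book lets a user navigate straight to a single statistical table and open only the PDF attached to it.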

3. Publication
Once a book has been catalogued and revised, a single click puts it on the web.

Hardware and software used: an easy system
A cataloguing server contains the development DB and the PDF files.
As many PCs as there are cataloguers connect to it using the client program.
A dissemination server hosts the software and a copy of the DB.
Coherence and synchronisation mechanisms keep both systems (development and dissemination) aligned.
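One way to picture the synchronisation between the two systems is sketched below. This is an assumption, not the mechanism INE used: the dictionaries stand in for the development and dissemination databases, and the `version` and `revised` fields and both function names are hypothetical.

```python
def books_to_publish(development, dissemination):
    """Ids of revised books that are missing or outdated on the dissemination side."""
    return sorted(
        book_id
        for book_id, record in development.items()
        if record["revised"]
        and dissemination.get(book_id, {}).get("version") != record["version"]
    )


def synchronise(development, dissemination):
    """Copy every pending book from development to dissemination."""
    for book_id in books_to_publish(development, dissemination):
        dissemination[book_id] = {"version": development[book_id]["version"]}
    return dissemination


# Hypothetical state of both systems, invented for illustration only.
development = {
    "yearbook-a": {"version": 2, "revised": True},
    "census-b": {"version": 1, "revised": False},   # not yet revised: stays unpublished
    "census-c": {"version": 1, "revised": True},
}
dissemination = {"yearbook-a": {"version": 1}}

pending = books_to_publish(development, dissemination)
print("pending:", pending)

synchronise(development, dissemination)
print("published:", sorted(dissemination))
```

Books that have not been revised never reach the dissemination copy, which mirrors the rule that a book is published only after it has been catalogued and revised.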

Thank you!