Software for Digital Library By Bhupendra Ratha, Lecturer School of Library and Information Science Devi Ahilya University, Indore


Software for Digital Library
Creating a digital library requires several kinds of software:
– Digital library software
– OCR
– DOI
– Image editing software
– Other software: operating systems, database management system software, programming/scripting languages, firewall and protection software

Digital library software
DSpace is a digital library system that captures, stores, indexes, preserves and redistributes the intellectual output of a university's research faculty in digital formats. It was developed jointly by MIT Libraries and Hewlett-Packard (HP) and is freely available to research institutions worldwide as an open source system.
EPrints is generic archive software under development at the University of Southampton. It is intended for creating a highly configurable web-based archive. Its primary goal is to serve as an open archive for research papers, but with configuration changes it can hold anything that can be stored digitally, such as images, research data or audio archives.
Greenstone is a suite of software for building and distributing digital library collections. It provides a way of organizing information and publishing it on the Internet or on CD-ROM. It is available for both Windows and Linux, and it requires Perl to build collections.

OCR (Optical Character Recognition)
– OCR is also referred to as Optical Character Reader.
– It scans printed, typewritten or handwritten text (numerals, letters or symbols) and converts the scanned image into a computer-processable format, such as plain text, a Word document or an Excel spreadsheet, which can be edited, used or reused in other documents.
– "A system that provides full alphanumeric recognition of printed or handwritten characters at electronic speed by simply scanning the documents is called OCR."

Difference: OCR and OMR
– OCR: with OCR technologies, images can be scanned, indexed and written to optical media. OCR can recognize many kinds of information (letters, numerals and symbols), so it is very flexible.
– OMR (Optical Mark Recognition) is a data collection technology. It reads marks made on forms and cannot recognize hand-printed or machine-printed characters, so it is not flexible.

Features of OCR
– It is a program with character recognition capabilities.
– The technology provides a complete form processing and document capture solution.
– OCR is used when recreating a document in electronic form by hand would take more time.
– The converted text files take less space than the original image file and can be indexed.
– It bridges the gap between paper-based and paperless working.
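The point that converted text "can be indexed" can be illustrated with a small inverted index. This is only a sketch: the page identifiers and the sample text are hypothetical stand-ins for real OCR output, and the function name `build_index` is an assumption made for this example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the sorted list of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Split the OCR text into simple alphanumeric tokens.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical OCR output for two scanned pages.
ocr_pages = {
    "page-1": "Digital library software stores research papers.",
    "page-2": "OCR converts scanned papers into searchable text.",
}

index = build_index(ocr_pages)
print(index["papers"])
print(index["ocr"])
```

A page image cannot be searched this way; only the recognized text can, which is why indexing is listed as a feature of OCR.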

Advantages of OCR
– Cost savings and efficiencies from not having to handle the paper.
– Scanning and recognition allow efficient management and planning for the rest of the processing workload.
– Reduced long-term storage requirements: questionnaires can be destroyed after the initial scanning, recognition and repair.
– Quick retrieval for editing and reprocessing.
– Fewer errors associated with physical handling of the documents.

Disadvantages of OCR
– While OCR can be effective in converting handwritten or typed characters, it is not as accurate as OMR for reading data where users mark forms.
– Scanning speed is determined by the quality of the scanners, the extent of non-drop-out color, and the paper's quality, cleanliness and weight.
– The interpreted value has to be compared with the real image of the document.
– Processing can be in geographic order or in random order.

Processing of OCR

Cont…
– A scanner has four components: a detector, an illumination source, a scan lens and a document transport.
– OCR hardware/software performs three operational steps: document analysis, character recognition and contextual processing.
– Output interface: allows character recognition results to be electronically …
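The three operational steps can be sketched on a toy bitmap. Everything here is an assumption made for illustration: the 3x5 glyph templates, the blank-column segmentation, the pixel-difference matcher and the three-word lexicon are not taken from any real OCR product.

```python
# Hypothetical 3x5 glyph templates ('#' = ink, '.' = paper).
TEMPLATES = {
    "C": ("###", "#..", "#..", "#..", "###"),
    "A": (".#.", "#.#", "###", "#.#", "#.#"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
    "O": ("###", "#.#", "#.#", "#.#", "###"),
}
LEXICON = ["CAT", "COT", "DOG"]  # hypothetical word list


def segment(page):
    """Document analysis: split the page into glyphs at all-blank columns."""
    glyphs, current = [], []
    for column in zip(*page):
        if all(pixel == "." for pixel in column):
            if current:
                glyphs.append(current)
                current = []
        else:
            current.append(column)
    if current:
        glyphs.append(current)
    # Turn each glyph's columns back into row strings.
    return [tuple("".join(row) for row in zip(*cols)) for cols in glyphs]


def pixel_distance(glyph, template):
    """Count the pixels where the glyph and the template disagree."""
    return sum(a != b for g_row, t_row in zip(glyph, template)
               for a, b in zip(g_row, t_row))


def recognize(glyph):
    """Character recognition: pick the template with the fewest differences."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(glyph, TEMPLATES[ch]))


def correct(word):
    """Contextual processing: snap to the closest same-length lexicon word."""
    candidates = [w for w in LEXICON if len(w) == len(word)]
    return min(candidates, key=lambda w: sum(a != b for a, b in zip(w, word)))


# A scan of "CAT" in which two smudged pixels make the C resemble an O.
page = [
    "###..#..###",
    "#.#.#.#..#.",
    "#.#.###..#.",
    "#...#.#..#.",
    "###.#.#..#.",
]

raw = "".join(recognize(glyph) for glyph in segment(page))
fixed = correct(raw)
print(raw, "->", fixed)
```

The smudge makes the recognizer misread the first character, and the lexicon lookup repairs it. Real systems use far richer layout analysis, classifiers and language models, but the division of labour between the three steps is the same.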

Types of OCRs
Two types of OCRs:
– Task-specific readers: read only specific documents, such as bank cheques.
– General-purpose page readers:
  – High-end OCR (usually for offices): speed and accuracy are important; format preservation; good proofreading solutions.
  – Low-end OCR (usually for home use): speed is not required; proofreading is done manually.

Factors affecting OCR quality
– Scanner quality
– Scan resolution
– Type of printed document (laser printer output or photocopy)
– Paper quality
– Fonts used in the text
– Linguistic complexities
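Scan resolution also drives how much image data each page produces. The arithmetic below is a rough sketch that assumes an A4 page (210 x 297 mm) scanned as uncompressed 8-bit grayscale; neither the paper size nor the bit depth comes from the slides.

```python
MM_PER_INCH = 25.4


def scan_size(width_mm, height_mm, dpi, bytes_per_pixel=1):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


# Assumed A4 page, 8-bit grayscale, at two common resolutions.
for dpi in (300, 600):
    w, h, size = scan_size(210, 297, dpi)
    print(f"{dpi} dpi: {w} x {h} pixels, {size / 1_000_000:.1f} MB uncompressed")
```

Doubling the resolution roughly quadruples the pixel count, so a higher setting gives the recognizer more detail at the cost of storage and scanning time.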

Evaluating OCRs
– Neat interface
– Easy-to-use wizards
– Accurate recognition
– Scan resolution setting (600 dpi is advisable)
– Time taken from scanning to delivery of the final product
– Enhanced usability of the product
– Ability to modify the scan settings
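"Accurate recognition" can be made measurable by comparing OCR output with a hand-checked transcription. One common measure is the character error rate: the edit distance between the two texts divided by the length of the reference. The slides do not name a metric, so this is offered only as one possible sketch, with hypothetical sample strings.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def character_error_rate(reference, ocr_output):
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


# Hypothetical example: the OCR engine confused 'l' with '1' twice.
reference = "Digital library"
ocr_output = "Digita1 1ibrary"
print(f"CER = {character_error_rate(reference, ocr_output):.3f}")
```

Running the same test pages through each candidate product gives a comparable figure for the accuracy criterion.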

DOI (Digital Object Identifier)
– An open standard for creating an alphanumeric name that identifies digital content, mostly scholarly content such as an e-book or journal article.
– A DOI is a unique ID for a document that is paired with the object's electronic address, or URL (which can be updated), along with other metadata.

Basic aspects of DOI
"The DOI is like the Bar Code, but for objects on the Internet."
Two aspects:
1. Uniquely identifies the object.
2. Provides linking to the object itself (or to any related objects, transactions or services).
These links are:
– Permanent
– Dynamically maintainable
– Capable of one-to-many routing
– Capable of supporting new applications over time

Features of DOI
– Applies to any type or format of object: text, music, film, video, photographs, software, database records.
– Applies at any level of specificity: whole book/individual chapters, music collection/individual tracks, software programs/individual routines, products/components…
– Compatible with every other numbering scheme (UPC, ISBN, ISSN).
– Permanent (once assigned, never changes. "A DOI is Forever.").
– It helps protect copyright.
– A central directory provides a level of indirection between the ID and its locations or services.

DOI number format
– [prefix]/abc123defg = the whole DOI
– [prefix] = publisher prefix: the naming authority (publisher)
– abc123defg = suffix = Handle suffix: the item identifier, in any format
– In use, a DOI is an opaque string (a "dumb number", which is a good thing).
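Splitting a DOI into its two parts takes only a few lines. In this sketch the example DOI uses a made-up prefix, `10.1234`, and the check that prefixes begin with `10.` reflects general DOI practice rather than anything stated on the slide. The helper name `split_doi` is hypothetical.

```python
def split_doi(doi):
    """Split a DOI at its first slash into (prefix, suffix)."""
    prefix, separator, suffix = doi.partition("/")
    # Assumption: DOI prefixes start with the directory indicator "10.".
    if not separator or not suffix or not prefix.startswith("10."):
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return prefix, suffix


# Hypothetical DOI: the prefix names the publisher, the suffix names the item.
prefix, suffix = split_doi("10.1234/abc123defg")
print("prefix:", prefix)
print("suffix:", suffix)
```

Only the first slash separates the parts, because the suffix may take any format, including further slashes. Beyond that split, software should treat the string as opaque and not read meaning into it.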

IDF
– International DOI Foundation (IDF), established October 1997
– Offices in Washington and Geneva
– Non-profit: supported primarily by membership fees
– Develops policies and governance procedures ("policy infrastructure")
– Liaises with standards organizations internationally
– Manages the relationship with CNRI (as technology provider) via a service contract

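Working Process of DOI
1. Send DOI query: the browser on the user's PC sends the DOI to the DOI directory server.
2. Forward query to publisher: the directory maps the DOI to "where to go next" and forwards the query to the publisher/gateway.
3. Receive object information: the publisher/gateway returns the object information to the user's PC.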
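The indirection provided by the directory can be modelled with a small lookup table. This is a toy sketch, not the real resolution infrastructure: the class `DoiDirectory`, the DOI and both URLs are hypothetical.

```python
class DoiDirectory:
    """Toy stand-in for the central directory: maps a DOI to its current URL."""

    def __init__(self):
        self._locations = {}

    def register(self, doi, url):
        """Record where the object identified by this DOI currently lives."""
        self._locations[doi] = url

    def update(self, doi, new_url):
        """Change the location of an object without changing its DOI."""
        if doi not in self._locations:
            raise KeyError(f"unknown DOI: {doi}")
        self._locations[doi] = new_url

    def resolve(self, doi):
        """Answer the question 'where to go next' for a DOI query."""
        return self._locations[doi]


directory = DoiDirectory()
directory.register("10.1234/abc123defg", "https://publisher.example.org/articles/42")
first = directory.resolve("10.1234/abc123defg")

# The publisher moves the article; the DOI itself stays the same.
directory.update("10.1234/abc123defg", "https://archive.example.org/items/42")
second = directory.resolve("10.1234/abc123defg")

print(first)
print(second)
```

Because citations carry only the DOI, links keep working after the object moves, provided the publisher updates the directory entry.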

Image Editing Software
– Photoshop
– CorelDRAW
– Paintbrush

Other Software for DL
– Operating systems and office applications: MS Word, Excel, PowerPoint, etc.
– Database management system software: Oracle/SQL, MS Access, FoxPro, etc.
– Programming/scripting languages: HTML, Java, .NET, VB, C++, etc.
– Firewall and protection software: firewall, anti-virus, bioinformatics programs, etc.