1 Open Access to Digital Libraries. Must Research Libraries be Expensive? William Y. Arms Department of Computer Science Cornell University.

Presentation transcript:

1 Open Access to Digital Libraries. Must Research Libraries be Expensive? William Y. Arms Department of Computer Science Cornell University

2 Before Digital Libraries Access to scientific, medical, legal information In the United States: -- excellent if you belonged to a rich organization (e.g., a major university) -- very poor otherwise In many countries of the world: -- very poor for everybody

3 Research Libraries are Expensive library materials buildings & facilities staff

4 The Potential of Digital Libraries materials open access buildings & facilities staff

5 Economic Models for Open Access Who pays for open access to information?

6 Two Fallacies 1. The Luddite Publishing Fallacy Academic authors will never change. Prestige is determined by which journals a researcher publishes in. The prestigious journals make the rules. 2. The Free Lunch Fallacy Web publishing costs nothing. Therefore groups of researchers should publish their own research. There is no need to waste money on publishers.

7 Four Economic Models Example: Broadcast Television
Open Access
-- Advertising: network television
-- External funding: public broadcasting
Restricted Access
-- Subscription: cable
-- Pay-by-use: pay-per-view

8 Examples
-- Old: Books in Print (subscription). New: Amazon.com (advertising)
-- Old: Medline (pay-by-use). New: Grateful Med (external)
-- Old: Journal (subscription). New: ePrint archives (external)
-- Old: Westlaw (pay-by-use). New: Legal Information Institute (external)
-- Old: Inspec (subscription). New: Google (advertising)

9 Thoughts on the Future of Open Access The dominant force is author pressure, which emphasizes open access rather than closed access. 1. A mixture of economic models will coexist. 2. Eventually, we will have open access to most scientific and professional information. 3. The most common economic model will be that information is published by the producing organization. The producing organization may be a university (or part), a conference series, a laboratory, an association, etc.

10 A New Role for Academic Libraries and Associations Academic libraries and associations can provide support for open access information: -- Establish standards for academic quality -- Maintain local archives (e.g., M.I.T.'s archive of local research) -- Protect and preserve for the long term

11 buildings & facilities computers & networks The Potential of Digital Libraries materials open access staff ?

12 Automated Digital Libraries How effectively can computers be used for the skilled tasks of professional librarianship? -- Time horizon: 5 to 20 years -- All materials in digital form Computers cannot imitate intelligence. Can automated digital libraries provide equivalent services?

13 Example: Catalogs and Indexes Catalog, index and abstracting records are very expensive when created by skilled professionals -- only available for certain categories of material (e.g., monographs, scientific journals) -- contain limited fields of information (e.g., no contents page) -- restricted to static information

14 Equivalent Services: Catalogs and Indexes Cataloguing rules -- Application of cataloguing rules is skilled -- It is hard to imagine a computer system with these skills but Cataloguing rules are the means, not the end

15 Equivalent Services Information discovery I used to be a heavy user of Inspec. Now I use Google instead. Why are web search services the most widely used information discovery tools in universities today?

16 Conventional Criteria Web search services have many weaknesses -- selection is arbitrary -- index records are crude -- no authority control -- duplicate detection is weak -- search precision is deplorable yet they clearly satisfy some users...

17 Effectiveness of Web Search Why I use Google instead of Inspec: => Broader coverage => Better ranking => Immediate access to information (e.g., open access version of published paper) Google is an equivalent service for information discovery (for some users)

18 Simple Algorithms + Immense Computing Power

19 Brute Force Computing Few people really understand Moore's Law -- Computing power doubles every 18 months -- Increases 100 times in 10 years -- Increases 10,000 times in 20 years Simple algorithms + immense computing power may outperform human intelligence
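
The growth figures on this slide follow directly from the 18-month doubling period. A minimal sketch of the arithmetic, in which the function name `growth_factor` is only an illustrative choice:

```python
def growth_factor(years, doubling_months=18):
    """Multiplicative growth after `years`, assuming one doubling every `doubling_months`."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


# Check the two horizons quoted on the slide.
for years in (10, 20):
    print(f"{years} years: roughly {growth_factor(years):,.0f} times")
```

Ten years contain about 6.7 doublings and twenty years about 13.3, which is where the rounded figures of 100 and 10,000 come from.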

20 Brute Force Computing Example Creators of the world champion chess program (Deep Thought, later Deep Blue) -- moderate chess players -- simple tree-search algorithm -- very, very fast computer hardware
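
To illustrate what a "simple tree-search algorithm" can look like, here is a minimal minimax sketch over a toy game tree. It is not the chess program's actual code; the nested-list tree encoding and the leaf scores are assumptions made for the example.

```python
def minimax(node, maximizing=True):
    """Return the best achievable score from `node`.

    A node is either a number (a leaf score from the maximizing
    player's point of view) or a list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    # The two players alternate, so the role flips at each level.
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Toy tree: three possible moves, each answered by two possible replies.
tree = [[3, 5], [2, 9], [0, 7]]
best = minimax(tree)
print(best)
```

The algorithm itself is only a few lines; playing strength comes from how many positions the hardware can evaluate, which is the point of the slide.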

21 Examples of Automated Digital Library Services

22 Brute Force Computing: Web Search Web search engines: -- retrieve every page on the web -- index every word -- repeat every month
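
The "index every word" step can be sketched as an inverted index, which maps each word to the pages containing it. This is a toy version under stated assumptions: the page URLs and texts are invented, and tokenizing by whitespace is a simplification.

```python
from collections import defaultdict


def build_index(pages):
    """Map every word to the set of page URLs in which it appears."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawl results, standing in for retrieved web pages.
pages = {
    "http://example.org/a": "open access to digital libraries",
    "http://example.org/b": "digital video and speech recognition",
    "http://example.org/c": "research libraries are expensive",
}
index = build_index(pages)
print(search(index, "digital libraries"))
```

Nothing here requires skilled human effort per page, so the whole process can simply be repeated on a schedule.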

23 Substitutes for Human Intelligence Automated algorithms for information discovery Closeness of match -- vector space and statistical methods (Salton, et al., c. 1970) Importance of digital object -- Google ranks web pages by how many other pages link to them (NSF/DARPA/NASA Digital Libraries Initiative)
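
Both ideas on this slide can be sketched in a few lines. The first function measures closeness of match as the cosine between raw term-count vectors (term weighting, which vector space systems normally add, is omitted). The second counts incoming links, the simplest reading of ranking pages by how many other pages link to them. The function names and the sample link graph are assumptions made for the example.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def inlink_counts(links):
    """Count, for each page, how many distinct other pages link to it."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


# Hypothetical link graph: page -> pages it links to.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c", "c"]}
print(inlink_counts(links).most_common())
print(cosine_similarity("open access libraries", "digital libraries"))
```

A search service would combine the two signals: the similarity score picks pages that match the query, and the link count orders them by apparent importance.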

24 Brute Force Computing: Archiving and Preservation Internet Archive -- Monthly, web crawler gathers every open access web page with associated images -- Web pages are preserved for future generations -- Files are available for scholarly research

25 Brute Force Computing: Reference Linking ResearchIndex (CiteSeer, ScienceIndex) (NEC) -- fully automatic -- all open access material in computer science -- a free service Contrast with the Web of Science (ISI) -- input: combination of automatic means, skilled people -- limited number of journals -- very expensive

26 Brute Force Computing: Automated Metadata Extraction Informedia (Carnegie Mellon) Automatic processing of segments of video, e.g., television news. Algorithms for: -- dividing raw video into discrete items -- generating short summaries -- indexing the sound track using speech recognition -- recognizing faces (NSF/DARPA/NASA Digital Libraries Initiative)

27 Automating Interoperability Example: Cornell University's Core System for the NSDL (The National Science Foundation's digital library for science, mathematics, engineering and technology education)

28 Levels of Interoperability A comprehensive science library: The NSDL must provide coherent services across a vast range of materials managed by organizations with many objectives. Three levels of interoperability: Federation Harvesting Gathering

29 Federation (e.g., Z39.50 and MARC) Digital libraries that follow a full set of agreements form a federation. Standards and agreements -- Technical: formats, protocols, security systems, etc. -- Content: data and metadata (including semantics) -- Organizational: access, services, payment, authentication, etc. Federations are desirable but very demanding and hence rare

30 Gathering (e.g., Internet Archive, Google) Gathering: service for open access information, even if information providers do not follow standard agreements: -- web crawlers gather open access information -- web search engines index it -- automated services are possible (e.g., ResearchIndex) Entirely automated

31 Harvesting (e.g., Open Archives Initiative) Digital libraries: -- provide a brief metadata record for each item (e.g., minimal Dublin Core) -- support a simple protocol for access to this metadata Automated harvesters: -- harvest the metadata automatically -- build automated services Mainly automated
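
As an illustration of how simple the harvesting protocol is, the sketch below builds a metadata request in the style of the Open Archives Initiative Protocol for Metadata Harvesting, where `verb=ListRecords` and `metadataPrefix=oai_dc` ask for Dublin Core records. The repository address is a placeholder, and the helper name `list_records_url` is an assumption for the example.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for a repository's harvesting endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Restrict the harvest to records changed since this date (YYYY-MM-DD).
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; a real harvester would loop over many repositories.
url = list_records_url("https://repository.example.org/oai", from_date="2002-01-01")
print(url)
```

A harvester fetches such URLs on a schedule and parses the XML responses, which is why this level of interoperability needs little manual effort from either side.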

32 Costs and Benefits

33 Costs of Automated Digital Libraries The Google Company million searches daily people (half technical, 14 with Ph.D. in computing) -- 2,500 PCs running Linux, with 80 terabytes of disk The Internet Archive -- 7 people plus support from Alexa (March 2000)

34 Overall If you are rich Research libraries, using commercial information services, provide excellent service at very high cost to a favored few -- Automated digital libraries are far from providing the personal service available to a faculty member at a rich university but...

35 The Model T Library The Model T Ford, with mass production, brought car travel to the masses Automated digital libraries, with open access materials, can already provide good service at low cost -- In the future, automated digital libraries can bring scientific, scholarly, medical and legal information to everybody

36 Some Light Reading William Y. Arms, "Automated digital libraries." D-Lib Magazine, July/August William Y. Arms, "Economic models for open-access publishing." iMP, March