Computing - The Next 10 Years Universal Access to Information Raj Reddy Carnegie Mellon University Pittsburgh, USA April 6, 2001 Talk presented at Georgia.

Presentation transcript:

Computing - The Next 10 Years
Universal Access to Information
Raj Reddy, Carnegie Mellon University, Pittsburgh, USA
April 6, 2001
Talk presented at Georgia Tech 10th Anniversary Convocation

Future Technology
- Computational power doubles every 18 months (Moore’s Law)
  - 100-fold improvement every 10 years
- Disk densities double every 12 months
  - 1000-fold improvement every 10 years
- Optical bandwidth doubles every 9 months
  - 10,000-fold improvement every 10 years
- Infinite bandwidth and memory before computation
- Cost decreasing, density increasing
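The "fold improvement" figures on this slide follow from compounding the doubling periods over ten years. A minimal sketch of that arithmetic (the function name `fold_improvement` is illustrative, not from the talk):

```python
def fold_improvement(doubling_months: float, years: float = 10.0) -> float:
    """Growth factor after `years` if capacity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


# Doubling periods quoted on the slide, in months.
RATES = {"computation": 18, "disk density": 12, "optical bandwidth": 9}

for name, months in RATES.items():
    print(f"{name}: ~{fold_improvement(months):,.0f}-fold in 10 years")
```

The results land close to the slide's rounded orders of magnitude: about 100, about 1000, and about 10,000.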

What does the future hold?
We can see some glimpses of the future:
- Universities without walls
- Computers that never fail and self-healing software
- Every home with giga PCs connected by gigabit networks
- Access to all the published creative works of the world
  - anytime, anywhere, anyone
- Emergence of the World Bank of, not money, but Knowledge
- Systems, so-called geriatric robotics, that help the disabled lead normal lives, and
- Systems that give the rest of us superhuman capabilities, like getting a month’s work done in a day

Universal Access to Information
Information at your fingertips
- Access to all human knowledge:
  - Anyone
  - Anywhere
  - Anytime

All Human Knowledge
Recorded Information
- Books
- Periodicals (journals, newspapers)
- Music, opera, dance
- Paintings, sculptures and monuments
- Movies, video
- Databases, software
Suppose all of this were on the Web

Examples from
- Lecture: Michael Shamos on UL
- Books: A Child’s History of England
- Art: Greek Art

What is a book?
- Collection of static content
- Linearly organised
- Selected by an author as related
- Occupying a single physical location
- Physically bound between covers

What is a digital book?
- Collection of dynamic multimedia content
- Browsable, navigable
- Selected by the user as related
- No physical existence
- Instantly transmittable

What is a Library?
- Collection of items
- Linearly organized (shelves)
- Chosen by budget constraints
- Occupying physical space
- Cataloged for access

What is a Digital Library?
- Collection of digital items (potentially huge)
- Encompassing everything (someday)
- Organized arbitrarily
- Occupying no physical space
- Fully content-searchable

Universal Library Implications
- Elimination of time, space, cost constraints
- Democratization of information
  - “Knowledge is power”
- Hyperlinks to related information
- Preservation and dissemination of knowledge
  - faster and wider
- Backup preservation
- Preservation of culture

Universal Library Implications
- Research
  - Web of scholarly information, reviews
- Teaching
  - Support for distance education
- Academic publishing
- Virtual museums
- Interactivity

Universal Library Applications
- Access to “Born Digital” information
  - The world produces a billion billion (10^18) bytes of information every year (Lyman and Varian)
  - 90% is stored digitally
- Digital museum
- Digital tour guide
  - What’s in the Taj Mahal?

Universal Library Applications
- Research assistant
  - What did Newton write about color?
  - What are Moslem views on race?
- Teaching resource
  - “Act out” books in virtual reality
  - Real-time explanations
- Business information
  - Data mining

We Can Store Everything
- 1 book = 500 pp.
  - 1 MB uncompressed, 300 KB compressed
- 10^8 to 3 x 10^8 books = ~10^14 bytes = 100 terabytes
- Over 100 million computers on the Internet
  - At 1 GB each, >100 petabytes now
- 1 GB of disk costs ~$3
  - 100 terabytes < $300 thousand to $1 million
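The storage estimate on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's figures (book counts, per-book sizes, $3 per GB, 100 million machines at 1 GB each) and assumes decimal units; the function names are illustrative.

```python
KB, MB, GB, TB, PB = 10**3, 10**6, 10**9, 10**12, 10**15


def library_bytes(books: int, bytes_per_book: int) -> int:
    """Total storage for `books` volumes of a given size."""
    return books * bytes_per_book


def disk_cost_usd(total_bytes: int, usd_per_gb: float = 3.0) -> float:
    """Raw disk cost at the slide's 2001 price of about $3 per GB."""
    return total_bytes / GB * usd_per_gb


low = library_bytes(10**8, 300 * KB)      # fewest books, compressed
high = library_bytes(3 * 10**8, 1 * MB)   # most books, uncompressed
internet_disk = 10**8 * GB                # 100 million computers, 1 GB each

print(f"all books: {low / TB:.0f} to {high / TB:.0f} TB")
print(f"disk for 100 TB: ${disk_cost_usd(100 * TB):,.0f}")
print(f"disk already on the Internet: {internet_disk / PB:.0f} PB")
```

The two extremes bracket the slide's round figure of 100 terabytes, and 100 terabytes of raw disk at $3 per GB comes to the lower end of the quoted cost range.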

Non-textual Material
- 1 movie = 10 GB
  - 1 petabyte = 100,000 movies
  - All the movies ever made!
- Audio
  - 1 petabyte = 3000 years of music
  - All music ever performed or recorded
- Paintings and photos at 1 MB each
  - 1 petabyte = 1 billion paintings or photos
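The per-petabyte counts on this slide are simple divisions. The sketch below reproduces them and also backs out the audio bitrate implied by "1 petabyte = 3000 years of music"; that bitrate is derived here and is not stated in the talk. Decimal units are assumed.

```python
MB, GB, PB = 10**6, 10**9, 10**15
SECONDS_PER_YEAR = 365.25 * 24 * 3600

movies_per_pb = PB // (10 * GB)   # 10 GB per movie
images_per_pb = PB // MB          # 1 MB per painting or photo

# Bitrate at which 3000 years of continuous audio fills one petabyte.
music_bytes_per_second = PB / (3000 * SECONDS_PER_YEAR)
music_kbit_per_second = music_bytes_per_second * 8 / 1000

print(f"movies per petabyte: {movies_per_pb:,}")
print(f"images per petabyte: {images_per_pb:,}")
print(f"implied audio bitrate: {music_kbit_per_second:.0f} kbit/s")
```

The implied rate of roughly 85 kbit/s suggests the audio figure assumes compressed rather than CD-quality sound.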

Non-textual Material
- Gore’s Digital Earth
  - “A multi-resolution, three-dimensional representation of the planet, into which we can embed vast quantities of geo-referenced data.”
- Area of Earth ≈ 1/2 peta m^2
  - 1000 bytes/m^2 feasible
  - 2 MB/m^2 not practical yet: 10^21 bytes = 1 zettabyte
- {peta-, exa-, zetta-, yotta-}
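The Digital Earth figures can be reproduced from the planet's surface area. This sketch assumes a spherical Earth with the commonly quoted mean radius of 6371 km (a value not given in the talk) and decimal prefixes.

```python
import math

EARTH_RADIUS_M = 6.371e6                      # assumed mean radius
PETA, EXA, ZETTA = 10**15, 10**18, 10**21

area_m2 = 4 * math.pi * EARTH_RADIUS_M ** 2   # surface of a sphere
coarse_bytes = area_m2 * 1000                 # 1000 bytes per square metre
fine_bytes = area_m2 * 2 * 10**6              # 2 MB per square metre

print(f"area: {area_m2 / PETA:.2f} peta m^2")
print(f"at 1000 bytes/m^2: {coarse_bytes / EXA:.2f} exabytes")
print(f"at 2 MB/m^2: {fine_bytes / ZETTA:.2f} zettabytes")
```

The jump from the "feasible" to the "not practical yet" resolution is a factor of 2000, which is why the slide lists the prefixes beyond peta.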

Technological Challenges
- Input (scanning, digitizing, OCR)
- Data representation
  - text, notations, images, web pages
- Navigation and search
- Multilingual issues
- Output (voice, pictures, virtual reality)
- Synthetic documents

Universal Library Design
- Modular
  - Technology plug-ins (e.g. machine translation)
- Distributed
  - Mirror sites
- Multiple interfaces
  - Human (languages, cultures, literacy)
  - Machine

Universal Library Design
- Speech input/output
- Pictorial output
- Language support
  - Translation assistants
- Summarization tools
- Synthetic documents
  - Encyclopedia-on-demand

Input Issues
- Non-digital media
  - Conversion, scanning, correction
  - Triple keyboard, uncorrected OCR
- Digital media
  - Formats, conversions, color representation
  - ASCII, HTML, SGML, XML, PDF, PS, TeX
  - JPEG, TIFF, GIF?

Input Issues
- Structured matter
  - Musical notation, Laban
  - Chemistry
- 3D items
- Resource allocation (what’s first?)
- Duplication of effort (no registry)

Metadata
- Data about an item, not part of the item
  - Bibliographic
  - Format, medium, encoding, resolution
  - Provenance
  - Reliability, integrity
  - Permissions
- Who generates metadata?
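To make the metadata categories on this slide concrete, here is one hypothetical way to hold them in a record. The class `ItemMetadata`, its field names, and the use of a SHA-256 digest for the integrity entry are all assumptions for illustration; the talk does not prescribe a schema.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ItemMetadata:
    # Bibliographic
    title: str
    creator: str
    # Format, medium, encoding, resolution
    media_format: str
    encoding: str
    # Provenance
    source: str
    # Reliability, integrity: digest of the item's bytes
    sha256: str
    resolution_dpi: Optional[int] = None
    # Permissions
    permissions: List[str] = field(default_factory=list)


def describe(content: bytes, **fields) -> ItemMetadata:
    """Build a record whose integrity digest is computed from the content."""
    return ItemMetadata(sha256=hashlib.sha256(content).hexdigest(), **fields)


def is_intact(record: ItemMetadata, content: bytes) -> bool:
    """True if the content still matches the digest stored in the record."""
    return hashlib.sha256(content).hexdigest() == record.sha256


record = describe(
    b"sample page text",
    title="A Child's History of England",
    creator="Charles Dickens",
    media_format="text",
    encoding="ASCII",
    source="hypothetical scanning center",
    permissions=["public-domain"],
)
print(record.title, is_intact(record, b"sample page text"))
```

Keeping the digest outside the item itself matches the slide's definition: it is data about the item, and it lets a mirror site detect a corrupted copy.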

Navigation
- Browsing, finding, searching, flying
- Fractal view
  - Keys are granularity and connectivity
  - View whole collections or one glyph
- Understanding structure of information
Making Sense Of The World’s Knowledge

Searching Mathematics

MATHEMATICA Canonical Form: Integrate[ Times[Power[E,Times[-1,Power[V1,2]]], Sin[Power[V1,2]]], {V1,0,Infinity}]
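The canonical form above replaces the integration variable with the placeholder `V1`, so that the same integral written with `x` or `t` matches a single index entry. A minimal sketch of that renaming idea follows; it represents expressions as nested Python tuples and is an illustration only, not Mathematica's own procedure. `KNOWN_SYMBOLS` and `canonicalize` are hypothetical names.

```python
# Symbols that are constants rather than user-chosen variables (assumed set).
KNOWN_SYMBOLS = {"E", "Pi", "Infinity"}


def canonicalize(expr, names=None):
    """Rename variables to V1, V2, ... in order of first appearance."""
    if names is None:
        names = {}
    if isinstance(expr, tuple):
        head, *args = expr          # head is the function name, e.g. "Times"
        return (head, *(canonicalize(a, names) for a in args))
    if isinstance(expr, str) and expr not in KNOWN_SYMBOLS:
        return names.setdefault(expr, f"V{len(names) + 1}")
    return expr                     # numbers and known constants unchanged


def gaussian_sine_integral(var):
    """Integral of e^(-var^2) * sin(var^2) from 0 to infinity, as a tree."""
    square = ("Power", var, 2)
    integrand = ("Times", ("Power", "E", ("Times", -1, square)), ("Sin", square))
    return ("Integrate", integrand, ("List", var, 0, "Infinity"))


query_x = canonicalize(gaussian_sine_integral("x"))
query_t = canonicalize(gaussian_sine_integral("t"))
print(query_x == query_t)
```

Two queries that differ only in the name of the variable reduce to the same tree, which is what makes formula search by exact match workable.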

Multilingual Issues
- Character sets
- Representations
  - Íîäà ôèçè÷åñêè íàõîäèòñÿ â çäàíèè Èçâåñòèé
  - Нода физически находится в здании Известий ("The node is physically located in the Izvestia building")
- Multilingual navigation
- Translation assistance
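The two "Representations" lines on this slide show the same Russian sentence, once garbled and once readable. The garbled form is consistent with Windows-1251 bytes being displayed as Latin-1; that pairing of encodings is an inference from the characters, not something the slide states. A short sketch reproducing and reversing the effect:

```python
original = "Нода физически находится в здании Известий"

# Store the text as Windows-1251 bytes, one byte per Cyrillic letter.
raw = original.encode("cp1251")

# A reader that assumes Latin-1 shows accented Western letters instead.
garbled = raw.decode("latin-1")

# The damage is reversible as long as the bytes themselves were preserved.
repaired = garbled.encode("latin-1").decode("cp1251")

print(garbled)
print(repaired)
```

The bytes never change; only the character set used to interpret them does, which is why recording the encoding in the item's metadata matters.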

Synthetic Documents
- Documents derived automatically from retrieved information
  - Multilingual translation
  - Abstracts, summaries, glossaries
  - Encyclopedia-on-demand

Information Reliability
- Existence ≠ validity
- Universal Library philosophy
  - Avoid value judgments
  - Provide information from which users (and programs) can assess validity
  - Source, reputation, recency, reviews, consistency

Scaling Problems
- Search services (e.g. Altavista) index >10^8 documents
  - Suppose there were ?
- How can a billion users access the same item at once?

Policy Challenges
- Use of copyrighted material
- Economics (Who pays? Who gets?)
- Privacy
- Reliability of information
- Change in the nature of teaching

Use Of © Content
- Philosophy: must pay for use
  - Authors, publishers will not suffer
- Implied license
- Automated permissions
- Bulk licensing
- Compulsory licensing
  - Owner CAN’T refuse; user MUST pay

Economics
- Flat-fee subscriptions (e.g. HBO)
- Metered use (electric company)
- Microcharge (Tobias “clickl”)
- Free (paid by government)
- Automated permissions
- Use measured by technology

Operating Model
- Single portal for access to all information
- Universal Library provides input, access, multilingual, output and synthesis tools
- Universal Library will be a model scanning operation
- Registry of digitized works

Operating Model
- Specialized collections curated by specialists, provided to Universal Library
- Foreign collection performed in foreign countries
- Universal Library will be mirrored in ~12 sites around the world

Universal Library Status
- >13,000 digital volumes
- Art
- Newspapers
- Music, video
- Portal to hundreds of other collections
- Visit

Projects
- Navigator
- Academic electronic publishing
- Electronic Union Catalog
- Books out of copyright, books out of print
- Software distribution

Conclusions and Recommendations
- Conclusions
  - Barely 10% of all public information is available on the Internet
  - Government needs to play a leadership role in developing digital libraries
  - Significant technical and operational challenges in migrating and maintaining holdings in digital form
  - Intellectual property rights need to be addressed to facilitate creation of and access to digital libraries
- Recommendations
  - Support research: metadata, scalability, multiple languages, security, and usability
  - Create testbeds: Million Book Project
  - Place all public governmental information online
  - Preserve IP rights of creators by creating tax incentives for public use of online copyrighted information