Digital Libraries: An Aid to Education through Interoperable Open Archives of Resources U. Kentucky February 24, 2000 Edward A. Fox


1 Digital Libraries: An Aid to Education through Interoperable Open Archives of Resources U. Kentucky February 24, 2000 Edward A. Fox fox@vt.edu http://fox.cs.vt.edu CC CS DLRL Internet TIC Virginia Tech, Blacksburg, VA, USA

2 Acknowledgements (Selected) F Sponsors: ACM, Adobe, IBM, Microsoft, NSF, OCLC, US Dept. of Education, … F Co-PIs: Marc Abrams, Robert Akscyn, John Eaton, Gail McMillan F Students: Fernando Das Neves, Robert France, Neill Kipp, Paul Mather, Constantinos Phanouriou, James Powell, Ohm Sornil, David Watkins, Chang Zhang, Jianxin Zhao

3 Remember! F VT (education and technology) F PetaPlex, Envision, MARIAN, NRG F DL, 5S (to understand and build DLs) F CSTC, CRIM (add to, use) -> NSDL F OAI (convention, meetings, proposals)

4 Virginia Tech Background F Largest university in Virginia, land-grant, town population 35K plus 25K students F Blacksburg Electronic Village, since 1992, with 80% of community on Internet F Net.Work.Virginia, largest ATM network, with over 750 sites, for education, research, government F LMDS, Local Multipoint Distribution Service, gigabit wireless networking - 1/3 of Virginia F Math Emporium, 500 workstations F Faculty Development Initiative, round 2

5 Supporting Authors (Teachers and Learners) [Diagram labels: Faculty Development Initiative; ETD Support; Virginia Tech Digital Library; University Libraries; Classifying / Cataloging / Preserving; Collaboration; Visualization; MM; IR; EPub; HCI; Model Classroom of the 21st Century; Technology Showcase; ATM; Video Conf.; Develop MM; New Media Center; Dig. Library & Archives]

6 McBryde 110 F Model Classroom of 21st Century F ATM-based VTEL system F Apple G3, Media 100, 120G, BetaCam SP, FireWire, one of almost any device F Large Smart Board F IBM Multimedia PC, … F Supports spring multimedia class (CS4624) F Tom Wilkinson’s staff and systems supporting innovation in learning grants

7 ACITC F Advanced Communications and Information Technology Center, opening summer 2000 F Connects to the library, with a focus on IT F 1/3 high-tech (multimedia) classrooms F 1/3 digital/electronic library (reading room) F 1/3 research labs: 10, including: –Digital Library Research Laboratory (DLRL) –Center for Applied Technologies in the Humanities –Center for Human-Computer Interaction (HCI) – extending 5 year $2M NSF Research Infrastructure project that has usability laboratories (individuals, 2-person teams, groups) –HPC; Multimedia; Visualization (CAVE),...

8 End-to-End Innovation OC3 NET.WORK.VIRGINIA World’s Most Advanced Public Network Statewide Access Regional / National Access Blacksburg Electronic Village LMDS Wireless Technology Multimedia Service Access Point Local Community Access Internet 2 / NGI Multimedia Network Access Point

9 PetaPlex F Digital Library Machine (“super” object store) F Parallel computer / storage utility for scale of 1000 to 100,000,000 gigabytes (1 Tbyte - 100 Pbyte) F Knowledge Systems Incorporated is supplying VT-PetaPlex-1 for $250,000 with –high speed backbone connection(s) –2.5 terabytes through 100 “nanoservers” –Each nanoserver = network connection + IBM 25GB disk + 233 MHz Pentium II + Linux
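
The capacity figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 1,000 GB), and the cost-per-gigabyte figure is only a rough ratio, since the quoted price also covers the backbone connection(s). All names are illustrative.

```python
# Capacity arithmetic for the figures quoted on the PetaPlex slide.
# Assumption: decimal units (1 TB = 1,000 GB; 1 PB = 1,000,000 GB).
GB_PER_TB = 1_000
GB_PER_PB = 1_000_000


def total_capacity_gb(nanoservers: int, disk_gb: int) -> int:
    """Raw capacity of a set of identical single-disk nanoservers."""
    return nanoservers * disk_gb


# VT-PetaPlex-1: 100 nanoservers, each with one 25 GB disk.
vt_petaplex_1_gb = total_capacity_gb(nanoservers=100, disk_gb=25)
vt_petaplex_1_tb = vt_petaplex_1_gb / GB_PER_TB

# Design range quoted on the slide: 1,000 GB to 100,000,000 GB.
range_low_tb = 1_000 / GB_PER_TB
range_high_pb = 100_000_000 / GB_PER_PB

# Rough price per raw gigabyte at the quoted $250,000.
cost_per_gb = 250_000 / vt_petaplex_1_gb

print(vt_petaplex_1_tb, range_low_tb, range_high_pb, cost_per_gb)
```

With these inputs the 100 nanoservers give 2.5 TB, matching the slide, and the design range works out to 1 TB through 100 PB.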

10 PetaPlex Complex FRONT END MACHINE RS/6000, 1G RAM, 4 Proc. Nanoserver Service Machine 1 Service Machine 2 Service Machine 3 Service Machine 4

11 PetaPlex Service Machine Possibilities F Front-end provides handle/repository abstraction through hashing F Small object server F Large object server –video on demand –streaming audio F Information retrieval server F Proxy / cache server (e.g., 1 terabyte server of 1000 worldwide for Comsat/Intelsat)
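
As an illustration of the first bullet, a front end can hide object location by hashing each handle to a nanoserver. The slide does not say which hash function or placement scheme PetaPlex uses, so the SHA-256-modulo scheme, the class name, and the example handle below are all assumptions made for this sketch.

```python
import hashlib


def nanoserver_for(handle: str, num_servers: int = 100) -> int:
    """Map a location-independent handle to a nanoserver index.

    Assumption: SHA-256 modulo the server count. This is only one
    possible placement scheme, chosen here for simplicity.
    """
    digest = hashlib.sha256(handle.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


class FrontEnd:
    """Hypothetical front end: callers see handles, never server numbers."""

    def __init__(self, num_servers: int = 100):
        self.num_servers = num_servers
        # One in-memory dict stands in for each nanoserver's disk.
        self.stores = [dict() for _ in range(num_servers)]

    def put(self, handle: str, data: bytes) -> None:
        self.stores[nanoserver_for(handle, self.num_servers)][handle] = data

    def get(self, handle: str) -> bytes:
        return self.stores[nanoserver_for(handle, self.num_servers)][handle]


front_end = FrontEnd()
front_end.put("hdl:example/etd-0001", b"thesis bytes")  # hypothetical handle
stored = front_end.get("hdl:example/etd-0001")
```

A plain modulo scheme remaps most handles when the number of servers changes; a production design would likely need something more stable, such as consistent hashing.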

12 PetaPlex Top View 4 ft. side

13 PetaPlex Side View 4 ft. wide 8 ft. high Roles: * Support * Cooling * Power 15 shelves

14 Comparison: Network of Workstations (NOW), Beowulf, PetaPlex
Architecture: NOW: cluster of general purpose workstation class machines using off-the-shelf network interconnect; Beowulf: general purpose PCs, interconnected with a customized network; PetaPlex: special purpose architecture tuned for superstorage, using a mix of off-the-shelf PC components and specialized network interconnects.
Cost per node: NOW: workstation prices, between $2000-$2500/node; Beowulf: mid to low-end PC prices, between $1200-$1800 per node; PetaPlex: mass produced components will reduce price to around $100/node.
Target area: NOW and Beowulf: computation; PetaPlex: storage, with computation as a secondary function.
Filesystem support: NOW and Beowulf: UNIX flavors; PetaPlex: replaces location dependent files with location independent fine-grained URN named objects.

15 ENVISION F NSF “A User-Centered Database from the Computer Science Literature” (1991-93) F Collected bib/typesetter data, converted to SGML F Scanned thousands of page images F MARIAN search engine - can be made available (also applied to the Virginia Tech library catalog); used as part of a prototype object-based DL, with tailored visualization interface (L. Nowell dissertation)

16 Envision Results Window

17 MARIAN F Multiple Access Retrieval of Information with ANnotations F (Musical: Marian the Librarian …) F Evolved from 1980’s CODER system to a distributed Online Public Access Catalog (OPAC), then DL backend, now becoming a full DL system F From C/C++ to Java by Jianxin Zhao F Future uses: NDLTD, NUDL, PetaPlex

18

19

20 MARIAN Layers Database Layer Search Engine Layer User Information Layer User Interface Layer User

21 MARIAN Parallelism

22 MARIAN Response Time

23 France Dissertation F Key developer since CODER F Applying computational linguistics efforts with machine readable dictionaries F Applying opportunistic handling of term lists for ranking, usable displays (“to be or not to be, that is the”) F Developing and evaluating variety of interfaces

24 Network Research Group F NSF 3 year grant on WWW logging, characterization, and optimization: Abrams, Fox, Pollard (CNS) F Core member of Web Characterization Activity of World-Wide Web Consortium F Providing DL to support WCA (at http://www.w3c.org/WCA): –logs –tools –publications

25 Example: NRG Tools: WebJamma: artificial HTTP traffic generator; WebWatcher: HTTP traffic monitoring and logging system; CLFmunge: anonymizes Common Log Format logs; HTTPdump: protocol decode for tcpdump; caching proxy simulator; Splus programs; log description and validation interface & routines
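
To make the log anonymization idea concrete, here is a minimal sketch of anonymizing one Common Log Format line. It is not CLFmunge's actual behavior, which the slide does not describe; the salted-hash approach, the function name, and the default salt are assumptions for illustration.

```python
import hashlib
import re

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def anonymize_clf_line(line: str, salt: str = "example-salt") -> str:
    """Replace the client host with a salted hash token and mask the user.

    The same host always maps to the same token, so per-client
    traffic patterns remain analyzable after anonymization.
    """
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a Common Log Format line: {line!r}")
    f = match.groupdict()
    token = hashlib.sha256((salt + f["host"]).encode("utf-8")).hexdigest()[:12]
    user = "-" if f["user"] == "-" else "anon"
    return '{} {} {} [{}] "{}" {} {}'.format(
        token, f["ident"], user, f["time"], f["request"], f["status"], f["size"]
    )


sample = '192.0.2.1 - alice [24/Feb/2000:10:00:00 -0500] "GET /index.html HTTP/1.0" 200 1043'
anonymized = anonymize_clf_line(sample)
```

The request path is left untouched here; logs whose URLs carry personal data would need further scrubbing.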

26 How do universities and digital libraries relate? F Each U. will have its own digital library. Hence there will be large numbers (i.e., critical mass). F All students will learn how to use and how to “feed” digital libraries (and bring those habits to future work as needs and skills). F All digital library problems (esp. federation, flexibility, personalization) appear at U’s (so they are a good type of testbed, with willing collaborators in place for developing solutions). F Start with NDLTD, extend to NUDL

27 SPIRE Visualization

28 Digital Libraries --- Virginia Tech F MARIAN (NLM) F CS DL Prototype - ENVISION (NSF, ACM) F TULIP (Elsevier, OCLC) F BEV History Base (NSF, Blacksburg) F DL for CS Education - EI (NSF, ACM) F WATERS, NCSTRL (NSF) F NDLTD (SURA, US Dept. of Education) F CSTC (NSF, ACM), CRIM (NSF, SIGMM) F WCA (Log) Repository (W3C) F VT-PetaPlex-1 (Knowledge Systems)

29 Digital Libraries --- Objectives F World Lit.: 24hr / 7day / from desktop F Integrated “super” information systems: 5S: streams, structures, spaces, scenarios, societies F Ubiquitous, Higher Quality, Lower Cost F Education, Knowledge Sharing, Discovery F Disintermediation -> Collaboration F Universities Reclaim Property F Interactive Courseware, Student Works F Scalable, Sustainable, Usable, Useful

30 DLs: Why of Global Interest? F National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly F Knowledge and information are essential to economic and technological growth, education F DL - a domain for international collaboration –wherein all can contribute and benefit –which leverages investment in networking –which provides useful content on Internet & WWW –which will tie nations and peoples together more strongly and through deeper understanding

31 DL Challenges F Preservation - so people will trust DLs F Supporting infrastructure - networks,... F Scalability, sustainability, interoperability F DL industry - critical mass by covering libraries, archives, museums, corporate info, govt info, personal info - “quality WWW” integrating IR, HT, MM,... –Need tools & methods to make them easier to build

32 Locating Digital Libraries in Computing and Communications Technology Space [Figure: axes for Computing (flops), Digital content, and Communications (bandwidth, connectivity), on a less-to-more scale; Digital Libraries technology trajectory: intellectual access to globally distributed information]

33

34 Definition: Digital Libraries are complex systems that F help satisfy info needs of users (societies) F provide info services (scenarios) F organize info in usable ways (structures) F present info in usable ways (spaces) F communicate info with users (streams)

35 5S Layers Societies Scenarios Spaces Structures Streams

36 Definition: 5S Framework F Societies: interacting people (and computers) F Scenarios: services, functions, operations, methods F Spaces: domains + constraints (e.g., distance, adjacency): 2D, vector, probability F Structures: relations, trees, nodes and arcs F Streams: sequences of items (text, audio, video, network traffic) F (5 Element System: Fire, Wood, Earth, Metal, Water)

37 5S: Components F Societies: roles, rituals, reasons, relationships, artifacts F Scenarios: acquire, index, consult, administer, preserve F Spaces: physical, temporal, functional, presentational, conceptual F Structures: architectures, taxonomies, schema, grammars, links, objects F Streams: granularities, protocols, paths, flows, turbulences

38 5S: Combinations F Societies + Scenarios = user model F Societies + Scenarios + Spaces = user interface F Streams + Structures = markup F Streams + Structures + Scenarios = object F Structures + Scenarios = DBMS

39 How to Build a Digital Library F Understand the problem (using the 5S Framework) F Solve the problem (using the Star Methodology) –design, develop, evaluate, –refine, operate

40 Neill Kipp Dissertation F Training interested groups about 5S and the Star Methodology, refining the Framework to have solid mathematical foundation F Case studies of projects at Virginia Tech or involving VT staff/students: CSTC, NDLTD, NARA (National Archives, with SAIC), Lexis,... F Open also to study DL projects elsewhere F Focusing too on the design artifacts developed and related issues of efficient description and representation (esp. with markup, hypermedia)

41 DLs Shorten the Chain from Editor Publisher A&I Consolidator Library Reviewer

42 DLs Shorten the Chain to Editor A&I Digital Library Reviewer

43 DLs Shorten the Chain to Author Reader Digital Library Editor Reviewer Teacher Learner Librarian

44 Enhancing Learning with DLs

45

46

47 NSF Education Innovation (EI) F NSF “Interactive Learning with a Digital Library in Computer Science” (1993-98) F 45 online courses (esp. Internet, IR, MM, Professionalism, overall EI project pages): 100+K accesses/wk F Tools: SWAN (visualization), QUIZIT F Evaluation –traditional –network logging and analysis –tools for visualization

48 Digital Library Courseware F http://ei.cs.vt.edu/~dlib/ F WWW pages or large PDF copy files F Online quizzes based on book by Michael Lesk (Morgan Kaufmann Publishers) F Contents based on book, with several other popular topics added (e.g., agents) F Separate pages to supplement: Definitions, Resources (People, Projects), and References

49 CS -> CSTC -> CRIM F NSF and ACM Education Committee are funding a 2 year project “A Computer Science Teaching Center” - CSTC - http://www.cstc.org/ F College of NJ, U. Ill. Springfield, Virginia Tech F Focus initially on labs, visualization, multimedia F Multimedia part is also supported by a 2nd grant to Virginia Tech and The George Washington University: http://www.cstc.org/~crim/ (with curricular guidelines also under development)

50 CS Teaching Center (CSTC) F Instead of building large, expensive multimedia packages, that become obsolete and are difficult to re-use, concentrate on small knowledge units. F Learners benefit from having well-crafted modules that have been reviewed and tested. F Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built. [See NSF NSDL - National Science (math, engineering, technology education) Digital Library (formerly SMETE-lib) at http://www.dlib.org/smete/public/smete-public.html ] F ACM Education Board and SIG support, new NSF grant with COLLEGIS Research Institute and others …

51

52 Browsing (1)

53 Browsing (2)

54 CRIM Rationale F MM field needs properly trained personnel F Support this with resources + curricula F Together these help us move toward a DL for Interactive MM -> CS -> NSDL F Benefits will go to teachers (who have more to build upon) and students (who will have a richer environment for learning)

55

56 CRIM Project Activities F Workshops, other ways to involve community F WWW site including DL in CSTC re MM –Devised cataloging schema, designed interface –Referring to all MM syllabi and curriculum –Inviting learning resources for the CRIM DL, with reviews, reuse certifications F Publish report on MM curriculum through ACM and IEEE, after careful review F CSTC, CRIM will lead to ACM Journal of Educational Resources in Computing (JERiC)

57 Virginia Tech CRIM Related Courses F Art: Digital Art and Design course (Photoshop) F CS: 1604 Introduction to the Internet (1 cr.) F CS: 3604 Professionalism in Computing F CS: 4624 Multimedia, Hypertext and Information Access (3 cr.) F CS: 5604 Information Storage & Retrieval (3 cr.) F CS: 6604 Digital Libraries (3 cr.)

58 SMETE Library -> NSDL (from www.dlib.org to NSF DLI-2) F Context: Global movement toward Digital Libraries (see April 1998 CACM) F NSF effort: Science, Mathematics, Engineering, and Technology Education Digital Library (focussed on undergraduates) –3 workshops, yearly increasing funds / new calls F SMETE Library likely to operate as distributed federation, with separate parts for each key discipline, and to lead to a global effort

59 Open Archives Initiative History F xxx at LANL = Los Alamos National Laboratory (Ginsparg) for high-energy physics - 1991 F CSTR + WATERS = NCSTRL (Lagoze) - 1994 F xxx + NCSTRL = CoRR collaboration - 1998 F UPS (Universal Preprint Service) – 1999 mtg –Herbert Van de Sompel (U. Ghent, SFX) … –Dublin Core (DC), XML –Dienst protocol and software (Lagoze) F Renamed late 1999 as OAI

60 OAI Philosophy F Self-archiving = submission mechanism F Long-term storage system = archive F Open interface = harvesting mechanism F Data provider + service provider F Start with e-prints / pre-prints
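
The split between data providers and service providers can be sketched in memory as follows. This is not the Dienst protocol or any real OAI interface; the class names, method names, archive names, and records are hypothetical and exist only to show the division of roles.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    identifier: str   # unique within its archive
    title: str
    datestamp: str    # ISO 8601 date, so string comparison orders correctly


@dataclass
class DataProvider:
    """An archive: accepts self-archived records and exposes them for harvesting."""
    name: str
    records: list = field(default_factory=list)

    def deposit(self, record: Record) -> None:
        """Self-archiving: the submission mechanism."""
        self.records.append(record)

    def harvest(self, since: str = "") -> list:
        """Open interface: return records dated on or after `since`."""
        return [r for r in self.records if r.datestamp >= since]


@dataclass
class ServiceProvider:
    """Builds a service (here, title search) on harvested metadata."""
    index: dict = field(default_factory=dict)

    def harvest_from(self, provider: DataProvider, since: str = "") -> int:
        harvested = provider.harvest(since)
        for record in harvested:
            # Prefix with the archive name so identifiers stay globally unique.
            self.index[f"{provider.name}:{record.identifier}"] = record
        return len(harvested)

    def search(self, word: str) -> list:
        needle = word.lower()
        return sorted(k for k, r in self.index.items() if needle in r.title.lower())


theses = DataProvider("archive-a")
theses.deposit(Record("0001", "Digital library interoperability", "1999-10-21"))
theses.deposit(Record("0002", "Web caching proxies", "2000-02-01"))
reports = DataProvider("archive-b")
reports.deposit(Record("0001", "Federated digital library search", "2000-01-15"))

service = ServiceProvider()
for archive in (theses, reports):
    service.harvest_from(archive)
hits = service.search("digital library")
```

The service provider never stores the documents themselves, only the harvested metadata, which is what lets many services be layered over the same archives.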

61 Open Archives (protoproto) F ArXiv & Los Alamos National Lab F CogPrints & U. Southampton F NACA & NASA (reports) F NCSTRL & Cornell U. F NDLTD & Virginia Tech F RePEc & U. Surrey F (Washington U. & EconWPA)

62 Open Archives Members Original Participants in the Open Archives Initiative: – Caroline Arms, Library of Congress – Leslie Carr, University of Southampton – Mark Doyle, American Physical Society – Dale Flecker, Harvard University – Edward A. Fox, Virginia Tech – Michael Friedman, HighWire Press, Stanford University – Paul M. Gherman, Vanderbilt University – Paul Ginsparg, Los Alamos National Laboratory & xxx – Stevan Harnad, University of Southampton – Thomas Krichel, University of Surrey & RePEc – Carl Lagoze, Cornell University – Rick Luce, Los Alamos National Laboratory – Clifford Lynch, Coalition for Networked Information – Kurt Maly, Old Dominion University – Michael L. Nelson, NASA Langley Research Center – John Ober, California Digital Library – Bob Parks, Washington University & EconWPA – Herbert Van de Sompel, University of Ghent – Eric F. Van de Velde, California Institute of Technology – Don Waters, The Andrew W. Mellon Foundation – Ken Weiss, California Digital Library Others Joining (selected): – University of Virginia - Jim French, Worthy Martin, Thornton Staples – NEC Research Institute - C. Lee Giles and Steve Lawrence – Internet Archive - Kurt Bollacker, Marlita Kahn – India - University of Mysore - Shalini Urs – Mexico - University of Monterrey - David Garza Salazar

63 VT Open Archives – Initial Set F NDLTD – global (DC – listserv) F NDLTD – VT (MARC, DC) F CSTC (DC format, ACM format) F W3C WCA logs (XML, atomic)

64 Approaches to Open Archives Build By Discipline Build By Institution Author Category Interdisciplinary Year Language Query …

65 Institutions / Disciplines F Universities: part, all, sets of F Disciplines: buy in as in Germany –Physics, Chemistry, Math, Sociology, Educ. F Basis for Federation: –Language – German, Spanish, French, CJK –Politics – OhioLink, National Library of Portugal, ISTEC for Latin America –Economics – Developing Countries (UNESCO)

66 Open Archives Initiative (OAI) www.openarchives.org F Santa Fe meeting, Oct. 21-22, 1999 and protoproto F Next mtg June 3, San Antonio, between HT’00 & DL’00 F LANL, CNI, DLF, Mellon, … F Convention (see Feb. D-Lib Magazine) F Archives -> Open Archives –Support unique archive identifiers –Implement Open Archives Metadata Set (DC-based, using XML) –Implement Dienst harvesting interface –Register the archive F Build tools, layer other services: linking, searching, …
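
The first two requirements for an open archive (a unique archive identifier and DC-based metadata in XML) can be illustrated with a short sketch. The Dublin Core element namespace used below is the standard one, but the `record` wrapper element, the identifier format, and the sample values are assumptions; this is not the actual Open Archives Metadata Set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def build_record(archive_id: str, local_id: str, title: str,
                 creator: str, date: str) -> ET.Element:
    """Build a hypothetical DC-based metadata record.

    The full identifier combines the registered archive identifier
    with the archive's own local identifier.
    """
    record = ET.Element("record")  # wrapper element name is an assumption
    fields = (
        ("identifier", f"{archive_id}:{local_id}"),
        ("title", title),
        ("creator", creator),
        ("date", date),
    )
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


example = build_record(
    archive_id="example-archive",   # hypothetical registered archive name
    local_id="etd-0001",            # hypothetical local identifier
    title="A Sample Electronic Thesis",
    creator="Doe, Jane",
    date="2000-02-24",
)
xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

Keeping the archive identifier as a prefix means two archives can reuse the same local identifiers without colliding once their records are harvested together.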

67 Tiered Model of Interoperability Mediator services Metadata harvesting Document models

68 Repository of Digital Objects Repository Access Protocol handle Digital object terms and conditions

69

70 Interoperability for NDLTD F Naming F Data exchange: share MARC records F Performance, reliability: replication (mirroring) F Federated searching –Query on content, metadata, links/relationships F Dynamic linking / extended services F Browsing, viz., working in concept space F Annotating/reviewing/certifying F Perspective/goals: removing barriers

71 Mechanisms F Sharing –Join federation, run software –Make metadata and archive available F Aggregating –By discipline –By institution –By genre F Automating –Workflow –Harvesting and providing services –Federated searching –Dynamic linking
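
The aggregating mechanism (by discipline, by institution, by genre) amounts to grouping harvested records on one metadata field. A minimal sketch, assuming records are plain dictionaries with hypothetical field names and values:

```python
from collections import defaultdict


def aggregate(records: list, key: str) -> dict:
    """Group record ids by the value of one metadata field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["id"])
    return dict(groups)


# Hypothetical harvested metadata records.
records = [
    {"id": "r1", "discipline": "Physics", "institution": "Univ. A", "genre": "thesis"},
    {"id": "r2", "discipline": "Physics", "institution": "Univ. B", "genre": "preprint"},
    {"id": "r3", "discipline": "Math", "institution": "Univ. A", "genre": "thesis"},
]

by_discipline = aggregate(records, "discipline")
by_institution = aggregate(records, "institution")
by_genre = aggregate(records, "genre")
```

The same records support all three views, which is why a federation can be organized along whichever axis its members prefer.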

72 OAI-Related Proposals F CNPQ – collaboration with PUC Rio F CONACyT – collaboration with UDLA and Monterrey (Mexico) F FIPSE preproposal – GSDI + OAI – with Caltech, U. Cincinnati (OhioLink), U. Kentucky, U. Iowa, USF (FL center for library automation)

73 Remember! F VT (education and technology) F PetaPlex, Envision, MARIAN, NRG F DL, 5S (to understand and build DLs) F CSTC, CRIM (add to, use) -> NSDL F OAI (convention, meetings, proposals)

