1 Vienna University of Technology (Vienna – 21 Sept 2007) “From information retrieval to digital libraries to computer science education” Edward A. Fox.

Presentation transcript:

1 Vienna University of Technology (Vienna – 21 Sept 2007) “From information retrieval to digital libraries to computer science education” Edward A. Fox Dept. of Computer Science, Virginia Tech Blacksburg, VA USA

2 “From information retrieval to digital libraries to computer science education” ABSTRACT: Information is a fundamental human need. The field of information retrieval has helped address this need since the 1960s, with a range of models and systems. A broad view of this field leads to digital libraries, a re-definition of the concepts, systems, and human involvement in sharing information across time and space, supported by digital technologies. We can formalize and better operationalize this through the 5S framework, which addresses information with regard to Societies, Scenarios, Spaces, Structures, and Streams. This approach has supported our work with personalization and computer science syllabi, curriculum development regarding digital libraries, and ensuring that college graduates are prepared not only to live in, but also to help build our future cyberinfrastructure, i.e., for Living In the KnowlEdge Society (LIKES). This talk will summarize our related research and education innovation.

3 Acknowledgements (selected) Colleagues: Lillian Cassel, Debra Dudley, Weiguo Fan, Marcos Gonçalves, Doug Gorton, Rohit Kelapure, Neill Kipp, Aaron Krowne, Ming Luo, Uma Murthy, Manuel Perez, Ananth Raghavan, Rao Shen, Hussein Suleman, Srinivas Vemuri, Layne Watson, … Sponsors: ACM, AOL, CAPES, DFG, Google, IBM, IMLS, INL, Microsoft, NSF (CCF, IIS, and DUE grants), SUN, …

4 Acknowledgements - Mentors J.C.R. Licklider – undergrad advisor –Author in 1965 of “Libraries of the Future” –Before, at ARPA, funded start of Internet Michael Kessler – BS thesis advisor –Project TIP (technical information project) –Defined bibliographic coupling Gerard Salton – graduate advisor –“Father of Information Retrieval”

5 Information Retrieval: Algorithms and Heuristics, 2nd Ed., by David A. Grossman & Ophir Frieder, Kluwer Academic Publishers

6 Document Retrieval (Grossman & Frieder Fig. 1.1)

7 Vector Space Model – 2 terms (Grossman & Frieder Fig. 2.2)
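
As a rough illustration of the two-term vector space model named on this slide, the sketch below ranks three documents against a query by cosine similarity. The term weights and document names are made up for the example and are not taken from Grossman & Frieder.

```python
import math


def cosine_similarity(query, document):
    """Cosine of the angle between two term-weight vectors of equal length."""
    dot = sum(q * d for q, d in zip(query, document))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_d = math.sqrt(sum(d * d for d in document))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_q * norm_d)


# Hypothetical weights for two index terms, e.g. ("digital", "library").
query = [1.0, 1.0]
documents = {"d1": [2.0, 0.0], "d2": [1.0, 1.0], "d3": [1.0, 3.0]}

# Rank documents from most to least similar to the query.
ranking = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
```

Document d2 points in the same direction as the query, so it ranks first regardless of vector length; that length independence is the usual reason for preferring the cosine over a raw dot product.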

8 Language Model (Grossman & Frieder Fig. 2.5)

9 Document-Term-Query Inference Network (Grossman & Frieder Fig. 2.7)

10 Inference Network Layers (Grossman & Frieder Fig. 2.8)

11 Relevance Feedback Process (Grossman & Frieder Fig. 3.1)
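
One classical way to realize a relevance feedback step is Rocchio's formula, sketched below. It is offered as an assumption about what such a process can look like, not as the specific procedure in the cited figure; the weights alpha, beta, and gamma are conventional illustrative values.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward judged-relevant documents and away from the rest."""

    def centroid(vectors):
        # Mean vector; all zeros when no documents were judged.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    rel_centroid = centroid(relevant)
    non_centroid = centroid(nonrelevant)
    # Negative term weights are clipped to zero, a common convention.
    return [
        max(0.0, alpha * q + beta * r - gamma * n)
        for q, r, n in zip(query, rel_centroid, non_centroid)
    ]


# Hypothetical three-term vocabulary and user judgments.
original_query = [1.0, 0.0, 0.0]
judged_relevant = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
judged_nonrelevant = [[0.0, 0.0, 1.0]]

revised_query = rocchio(original_query, judged_relevant, judged_nonrelevant)
```

The revised query gains weight on the second term, which appears only in the relevant documents, and stays at zero on the third term, which appears only in the non-relevant one.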

12 Information Life Cycle Authoring Modifying Organizing Indexing Storing Retrieving Distributing Networking Retention / Mining Accessing Filtering Using Creating

13 Asynchronous, Digital Library Mediated Scholarly Communication Different time and/or place

14 DLs Shorten the Chain to Author Reader Digital Library Editor Reviewer Teacher Learner Librarian

15 DL Definitions - 1 “A digital library is an organized and focused collection of digital objects, including text, images, video, and audio, along with methods of access and retrieval, and for selection, creation, organization, maintenance, and sharing of the collection.” Witten & Bainbridge – “How to Build a Digital Library” – Morgan Kaufmann 2003

16 DL Definitions - 2 “Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities.” Waters, D.J. CLIR Issues, July/August

17 DL Definitions - 3 Issues and Spectra –Collection vs. Institution –Content vs. System –Access vs. Preservation –“Free” vs. Quality –Managed vs. Comprehensive –Centralized vs. Distributed

18 DL Definitions - 4 NOT a “digitized library” NOT a “deconstruction” of existing systems and institutions, moving them to an electronic box in a Library IS a new way to deal with knowledge –Authoring, Self-archiving, Collecting, –Organizing, Preserving, –Accessing, Propagating, Re-using

19

20 Informal 5S & DL Definitions DLs are complex systems that: help satisfy info needs of users (societies); provide info services (scenarios); organize info in usable ways (structures); present info in usable ways (spaces); communicate info with users (streams)

21 Hypotheses A formal theory for DLs can be built based on 5S. The formalization can serve as a basis for modeling and building high-quality DLs.

22 “Streams” - All types of (multimedia) content (as well as communications and flows over networks, or into sensors, or sense perceptions; data stream management systems) “Structures” - Organizational schemes (including data structures, databases, and knowledge representations – taxonomies, ontologies) 5S Framework

23 5S Framework “Spaces” - 2D and 3D interfaces, GIS data, representations of documents and queries “Scenarios” - System states and events, but also can represent situations of use by human users (or machine processes, yielding services or transformations of data) “Societies” - Both software “service managers” and fairly generic “actors” who could be (collaborating) human (users).

24 5Ss
Ss | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata | Specifies organizational aspects of the DL content
Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, that use those services; and relationships among them

25

26 ETANA-DL Archaeological DL Integrated DL –Heterogeneous data handling Applies and extends the OAI-PMH –Open Archives Initiative Protocol for Metadata Harvesting Design considerations –Componentized –Extensible –Portable

27

28 ETANA Societies 1.Historic and pre-historic societies (being studied) 2.Archaeologists (in academic institutes, fieldwork settings, or local and national governmental bodies) 3.Project directors 4.Technical staff (consisting of photographers, technical illustrators, and their assistants) 5.Field staff (responsible for the actual work of excavation) 6.Camp staff (e.g., camp managers, registrars, tool stewards) 7.General public (e.g., educators, learners, citizens)

29 ETANA Societies Social issues 1.Who owns the finds? 2.Where should they be preserved? 3.What nationality and ethnicity do they represent? 4.Who has publication rights? 5.What interactions took place between those at the site studied, and others? What theories are proposed by whom about this?

30 ETANA Scenarios
1. Life in the site in former times
2. Digital recording: the planning stage and the excavation stage
3. Planning stage: remote sensing, fieldwalking, field surveys, building surveys, consulting historical and other documentary sources, and managing the sites and monuments
4. Excavation
   1. Detailed information is recorded, including for each layer of soil, and for features such as pole holes, pits, and ditches.
   2. Data about each artifact is recorded together with information about its exact find spot.
   3. Numerous environmental and other samples are taken for laboratory analysis, and the location and purpose of each is carefully recorded.
   4. Large numbers of photographs are taken, both general views of the progress of excavation and detailed shots showing the contexts of finds.
5. Organization and storage of material
6. Analysis and hypotheses generation and testing
7. Publications, museum displays
8. Information services for the general public

31 ETANA Spaces 1.Geographic distribution of found artifacts 2.Temporal dimension (as inferred by archaeologists) 3.Metric or vector spaces 1.used to support retrieval operations, and to calculate distance (and similarity) 2.used to browse / constrain searches spatially 4.3D models of the past, used to reconstruct and visualize archaeological ruins 5.2D interfaces for human-computer interaction

32 ETANA Structures 1.Site Organization 1.Region, site, partition, sub-partition, locus, … 2.Temporal orderings (ages, periods) 3.Taxonomies 1.for bones, seeds, building materials, … 4.Stratigraphic relationships 1.above, beneath, coexistent
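
The stratigraphic relationships listed above can be treated as a small directed graph. The sketch below is only an illustration under that assumption: the class name, method names, and locus identifiers are hypothetical, and the "coexistent" relation is left out for brevity.

```python
class Stratigraphy:
    """Records which locus lies directly above which, and answers reachability."""

    def __init__(self):
        self._directly_above = {}  # upper locus -> set of loci directly beneath it

    def add_above(self, upper, lower):
        self._directly_above.setdefault(upper, set()).add(lower)

    def is_above(self, upper, lower):
        # Depth-first search: "above" is treated as transitive.
        stack = list(self._directly_above.get(upper, ()))
        seen = set()
        while stack:
            current = stack.pop()
            if current == lower:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self._directly_above.get(current, ()))
        return False

    def is_beneath(self, lower, upper):
        # "beneath" is simply the inverse of "above".
        return self.is_above(upper, lower)


site = Stratigraphy()
site.add_above("locus-3", "locus-2")
site.add_above("locus-2", "locus-1")
```

Because deeper layers are normally older, a query such as `site.is_above("locus-3", "locus-1")` gives a relative ordering of loci even when no absolute dates are recorded.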

33 ETANA Streams 1.successive photos and drawings of excavation sites, loci, unearthed artifacts 2.audio and video recordings of excavation activities and discussions 3.textual reports 4.3D models used to reconstruct and visualize archaeological ruins.

34 5S and DL formal definitions and compositions (April 2004 TOIS)

35 Fox & Gonçalves Book Outline Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” Part 2 – Higher DL Constructs Part 3 – Advanced Topics Appendix

36 Book Parts and Chapters - 1 Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” –Ch. 2: Streams –Ch. 3: Structures –Ch. 4: Spaces –Ch. 5: Scenarios –Ch. 6: Societies

37 Book Parts and Chapters - 2 Part 2 – Higher DL Constructs –Ch. 7: Collections –Ch. 8: Catalogs –Ch. 9: Repositories and Archives –Ch. 10: Services –Ch. 11: Systems –Ch. 12: Case Studies

38 Book Parts and Chapters - 3 Part 3 – Advanced Topics –Ch. 13: Quality –Ch. 14: Integration –Ch. 15: How to build a digital library –Ch. 16: Research Challenges, Future Perspectives Appendix –A: Mathematical preliminaries –B: Formal Definitions: Ss –C: Formal Definitions: DL terms, Minimal DL –D: Formal Definitions: Archeological DL –E: Glossary of terms, mappings

39 Chapter 3: (Degree of) Structure Chaotic, Organized, Structured (Web, DLs, DBs, respectively)

40 Digital Objects (DOs) Born digital Digitized version of “real” object –Is the DO version the same, better, or worse? –Decision for ETDs: structured + rendered Surrogate for “real” object –Not covered explicitly in metamodel for a minimal DL –Crucial in metamodel for archaeology DL

41 Metadata: Complex to Simple MARC ($50) Dublin Core (DC) + thesis

42 Also Important: Epub, SGML, XML 5S perspective: streams, structures, scenarios Authoring Rendering, presenting Tagging, Markup, DOM Semi-structured information Dual-publishing, eBooks Styles (XSL, XSLT) Structured queries

43 Chapter 4 Overview (Spaces) Retrieval models –Boolean, extended Boolean –Vector, LSI –Probabilistic: classical, belief network, inference network, language models User interfaces and visualization – cont’d

44 User interfaces and visualization 2D interfaces 3D interfaces GIS Other paradigms: trees, graphs, bubbles, coordinated views, … Stepping Stones and Pathways –

45 Chapter 6 Overview (Societies) User communities –Authors, editors, teachers, students, readers –Personal(ization), group(ware), community, global –Accessibility, universal access Librarians: reference, acquisition, operations Research community –Associations, conferences, publications, labs, projects Economics –Copyright, intellectual property rights, digital rights management, authorization, authentication, security, privacy, self-archiving (eprints) –Publishers, catalogers, distributors, sustainability –Open source, commercial, hybrid

46 Chapter 9 Archives & Repositories Open Archives Initiative (OAI) Institutional Repositories Persistent storage of digital objects Coupling of metadata with digital objects Use of “handles” as identifiers for digital objects Put, get, harvest

47 OAI - Open Archives Initiative Advocacy for interoperability Standard for transferring metadata among digital libraries –Protocol for Metadata Harvesting (PMH) Simplicity Generality Extensibility Support for PMH => Open Archive (OA)
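
To make the harvesting idea concrete, the sketch below builds an OAI-PMH request URL. The six verbs and the `oai_dc` metadata prefix come from the OAI-PMH specification; the base URL and the helper name `build_request` are hypothetical, and no network call is made.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint; oai_dc is unqualified Dublin Core.
url = build_request(
    "https://repository.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
)
```

A harvester would fetch this URL, parse the XML response, and repeat the request with the returned resumption token until the list is complete.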

48 OAI – Repository Perspective Required: Protocol DO MDO

49 OAI – Black Box Perspective OA 1, OA 2, OA 3, OA 4, OA 5, OA 6, OA 7

50 Tiered Model of Interoperability Mediator services Metadata harvesting Document models

51 Institutional Repositories - 1 “Institutional repositories are digital collections that capture and preserve the intellectual output of a single university or a multiple institution community of colleges and universities.” Crow, R. “Institutional repository checklist and resource guide”, SPARC, Washington, D.C., USA

52 Institutional Repositories - 2 “A university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution.” Lynch, C.A. In ARL Bimonthly Report 226, pp. 1-7, Feb. 2003,

53 What is a Digital Object Repository? Also called: digital rep., digital asset rep., institutional repository. Stores and maintains digital objects (assets). Provides external interface for digital objects: creation, modification, access. Enforces access policies. Provides for content type disseminations. Adapted from slide by V. Chachra, VTLS
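
The responsibilities listed on this slide (creation, modification, access, and access policies) can be sketched as a tiny in-memory repository. Everything here is an illustrative assumption: the class names, the role-based policy, and the sample identifiers are hypothetical and do not describe any particular repository product.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    identifier: str
    content: bytes
    metadata: dict = field(default_factory=dict)
    # Roles allowed to read the object; a stand-in for a real access policy.
    readers: set = field(default_factory=lambda: {"public"})


class Repository:
    def __init__(self):
        self._objects = {}

    def create(self, obj):
        if obj.identifier in self._objects:
            raise KeyError(f"identifier already in use: {obj.identifier}")
        self._objects[obj.identifier] = obj

    def modify(self, identifier, **metadata):
        self._objects[identifier].metadata.update(metadata)

    def access(self, identifier, role="public"):
        obj = self._objects[identifier]
        if role not in obj.readers:
            raise PermissionError(f"{role} may not read {identifier}")
        return obj


repo = Repository()
repo.create(DigitalObject("etd-1", b"thesis text", {"title": "Sample thesis"}))
repo.create(DigitalObject("etd-2", b"embargoed text", readers={"staff"}))
repo.modify("etd-1", title="Sample thesis (revised)")
```

Keeping metadata and content in the same object mirrors the coupling of metadata with digital objects; persistence and content-type disseminations are omitted to keep the sketch short.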

54 Goals of Institutional Repositories (by Stevan Harnad, U. Southampton) Self-archiving of institutional research; Theses and Dissertations (VTLS NDLTD Project); Article preprints and postprints; Internal documents and maps; Management of digital collections; Preservation of materials – decentralized approach; Housing of teaching materials; Electronic publishing of journals, books, posters, maps, audio, video and other multimedia objects. Adapted from slide by V. Chachra, VTLS

55 Chapter 10 Services Taxonomy of services Ontology, composition, reuse Evaluation Key services in-depth: –Crawling, indexing –Clustering, classifying –Recommending, using social networks –Logging

56

57 Ontology: Applications Expand definition of minimal DL by characterizing –typical DL services –in the context of “employs” and “produces” relationships Use characterization to: –Reason about how DL services can be built from other DL components –As well as be composed with other services through extension or reuse

58

59 Ontology: Applications

60

61 5S and Generating DLs 5S Framework 5S definitions, services taxonomy, ontology 5SL (specification language) 5SGraph (to prepare 5SL) 5SGen (for DL development, incl. DSpace) SchemaMapper for development of union DL

62

63 Chapter 11 Systems: Architectural Issues Independent system vs. part of federation Centralized vs. distributed vs. open services Monolithic vs. modular vs. componentized Topologies: bus vs. star vs. hierarchical vs. network Decompositions vary –search engine, browser, DBMS, MM support –repository, handle server, client –information resources + mediators, bus or agent collection + client with workspace/environment

64 Also Important: Agents 5S perspective: societies, streams, spaces, scenarios, structures Protocols: light-weight Knowledge interchange: mediators, wrappers Negotiation, registries Distributed issues Webbots (automatic indexing) Ontologies (standard upper)

65 Fedora ™ Digital Object Architecture Persistent ID (PID) Disseminators SystemMetadata EAD, TEI, DC, MARC, VRA Core, MIX, etc. Datastreams Images, E-books, E-journals, Music, Video, etc. Globally unique persistent id Public view: access methods for obtaining “disseminations” of digital object content Internal view: metadata necessary to manage the object Protected view: content that makes up the “basis” of the object The Mellon Fedora Project Adapted from Slide by V. Chachra, VTLS

66 Example Disseminators Persistent ID (PID) Default Disseminators Simple Image SystemMetadata Datastreams Get Profile List Items Get Item List Methods Get DC Record Get Thumbnail Get Medium Get High Get VeryHigh

67 Fedora™ Repository Web Service Exposure Layer Adapted from Slide by V. Chachra, VTLS

68 5SL: a DL design language Domain specific languages –Address a particular class of problems by offering specific abstractions and notations for the domain at hand –Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping. XML-based realization of 5S –Interoperability –Use of many sub-languages (e.g., MIME types, XML Schemas, UML notations)

69 5SGraph: A DL Modeling Tool Help users model their own instances of a digital library (DL) in the 5S language (5SL). A simple modeling process which enables rapid generation of digital libraries Features –5SGraph loads and displays a metamodel in a structured toolbox. –The structured editor of 5SGraph provides a top-down visual building environment for the DL designer. –5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.

70 Overview of 5SGraph Workspace (instance model) Structured toolbox (metamodel)

71

72

73

74

75 5SGen Version 1 – MARIAN as the target system –Focused on rich structures: semantic networks –Behavior attached to nodes/links Version 2 – Shifted for later work to componentized (ODL) approach –Focused on scenarios/societies –Structures/Spaces encapsulated within components (e.g., relational tables, indexes) –Only textual streams supported Version 3 – Into DSpace (practical DL)

76 5SLGen – Version 2: ODL, Services, Scenarios

77 Tools/Applications

78 5SGraph 5S Archaeology MetaModel ArchDL Expert ArchDL Designer Structure Sub-model ETANA-DL Union Services Descriptions Harvesting Mapping Searching Browsing … Scenario Sub-model VN Metadata Format ETANA-DL Metadata Format HD Metadata Format Mapping Tool Wrapper4VN Wrapper4HD Inverted Files Services DB Index Browse Service Search Service Browse DB Other ETANA-DL Services Web Interface XOAI VN Catalog HD Catalog Union Catalog 5SGen Component Pool Browsing …

79 Ch. 12 Case Studies: CS -> CSTC NSF and ACM Education Committee funded a 2 year project “A Computer Science Teaching Center” - CSTC - College of NJ, U. Ill. Springfield, Virginia Tech Focus initially on labs, visualization, multimedia Multimedia part supported by a 2nd grant to Virginia Tech and The George Washington University (with curricular guidelines)

80 CS Teaching Center (CSTC) Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units. Learners benefit from having well-crafted modules that have been reviewed and tested. Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built. ACM support led to Journal of Educational Resources in Computing (JERIC): completed 2 co-EIC terms

81

82 Browsing (2)

83

84

85 Computing and Information Technology Interactive Digital Educational Library (CITIDEL) Domain: computing / information technology Genre: one-stop-shopping for teachers & learners: courseware (CSTC, JERIC), leading DLs (ACM, IEEE-CS, DB&LP, CiteSeer), PlanetMath.org, NCSTRL (technical reports), … Submission & Collection: sub/partner collections 

86 Overview of CITIDEL architecture

87 Distributed repository structure

88 Digital library architecture for local and interoperable CITIDEL services

89

90

91

92

93

94 CITIDEL -> NSDL A collection project in the National STEM (science, technology, engineering, and mathematics) education Digital Library – NSDL National Science Digital Library (Next slides courtesy Lee Zia, NSF)

95 Connects: Users: students, educators, life-long learners Content: structured learning materials; large real-time or archived datasets; audio, images, animations; primary sources; digital learning objects (e.g. applets); interactive (virtual, remote) laboratories;... Tools: search; refer; validate; integrate; create; customize; publish; share; notify; collaborate;...

96 Enables: Discovery, Stability, Reliability, Reusability, Interoperability, Customizability, ... of Resources AND Environments for Communication, Collaboration, Creation, Validation, Evaluation, Recognition, ...

97 Collections Discovery of content Classification and cataloguing Acquisition and/or linking; referencing Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged Access to massive real-time or archived datasets Software tool suites for analysis, modeling, simulation, or visualization Reviewed commentary on learning materials and pedagogy

98 Services Help services, frequently asked questions, etc. Synchronous/asynchronous collaborative learning environments using shared resources Mechanisms for building personal annotated digital information spaces Reliability testing for applets or other digital learning objects Audio, image, and video search capability Metadata system translation Community feedback mechanisms

99

100

101

102 NSDL Information Architecture (essentially as developed by the Technical Infrastructure Workgroup) Diagram labels: referenced items & collections; Special Databases; NSDL Services; Other NSDL Services; CI Services (annotation, discussion, personalization, authentication, browsing); Core Services (information retrieval, metadata gathering); Core Collection-Building Services (harvesting, protocols); Portals & Clients; Usage Enhancement; Collection Building; User Interfaces; NSDL Collections; Core NSDL “Bus”

103 A Digital Library Case Study Domain: graduate education, research Genre: ETDs = electronic theses & dissertations Submission: ETD-db, DSpace, Proquest, … Collection: local archives, regional collaborations, global union catalog Project: Networked Digital Library of Theses & Dissertations (NDLTD)

104

105 Student Gets Committee Signatures and Submits ETD Signed Grad School

106 What are we doing? Aiding universities to enhance graduate education, publishing and IPR efforts Helping improve the availability and content of theses and dissertations Educating ALL future scholars so they can publish electronically and effectively use digital libraries (i.e., are Information Literate and can be more expressive)

107 Why ETD? Short Answer For Students: –Gain knowledge and skills for the Information Age –Richer communication (digital information, multimedia, …) For Universities: –Easy way to enter the digital library field and benefit thereby For the World: –Global digital library – large, useful, many services General: –Save time and money –Increased visibility for all associated with research results

108 Metamodels in the 5S Framework Modeling archaeological information systems using the 5S theory to better understand the domain and design the system and the supported services Minimal DL Minimal ArchDL …

109 Digital Object Repository Collection Minimal DL Metadata Catalog Descriptive Metadata Specification A Minimal DL in the 5S Framework Structural Metadata Specification StreamsStructuresSpacesScenariosSocieties indexing browsing searching services hypertext Structured Stream

110 A Minimal ArchDL in the 5S Framework [Diagram labels: Streams, Structures, Spaces, Scenarios, Societies; Structured Stream; Descriptive Metadata specification; SpaTemOrg; StraDia; Arch Descriptive Metadata specification; ArchDO; ArchObj; ArchColl; Arch Metadata catalog; ArchDColl; ArchDR; indexing, browsing, searching; services; hypertext; Minimal ArchDL]

111 Moving from a minimal DL towards a DL reference model (1/2) [Diagram labels: Minimal DL; DL reference model; Multimedia; Annotation; Knowledge management; Practical DL systems; PIM; DL quality; Domain-specific DLs]

112 Moving from a minimal DL towards a DL reference model (2/2) Content-based image retrieval services in a DL A superimposed-information-supported DL Practical DL generation

113 Superimposing information Superimposed layer New information/structures Base layer Existing information from heterogeneous sources: text, images, audio/video documents Mark Reference to base information element
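A minimal sketch of the two layers described above, assuming (for illustration only) that base documents are text strings and that a mark addresses a character span. The names `Mark`, `Annotation`, and `resolve` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mark:
    """Reference to a base information element (here: a text span)."""
    source: str   # identifier of the base document
    start: int    # first character of the span
    end: int      # one past the last character of the span


@dataclass
class Annotation:
    """Superimposed layer: new information attached to marks."""
    note: str
    marks: tuple


def resolve(mark, base_layer):
    """Follow a mark back to the base-layer content it refers to."""
    return base_layer[mark.source][mark.start:mark.end]


# Base layer: existing information, left unmodified.
base_layer = {"doc1": "Digital libraries integrate content and services."}

# Superimposed layer: a note that points into the base document.
annotation = Annotation(note="key term", marks=(Mark("doc1", 0, 17),))
print([resolve(m, base_layer) for m in annotation.marks])
```

The base layer is never edited; the superimposed layer holds only references plus the new information, so the same base element can be marked from many annotations.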

114 Preliminary SI-DL metamodel

115 Minimal CBIR DL [Diagram labels: Stream, Structure, Space, Service, Society; Image Stream; Feature Vector; Image Descriptor; Structured Feature Vector; Image Content Description; Image Digital Object; Image Object; User Info Need; Image Collection; Visualization Operation; Content-based Image Searching Service; Image Descriptor Metadata Catalog; Composite Descriptor; KNNQ; RQ]
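The two query labels in this diagram can be illustrated with a short sketch, reading KNNQ as a k-nearest-neighbor query and RQ as a range query (an interpretation, since the slide does not expand the abbreviations). The Euclidean distance, the function names, and the toy feature vectors are all assumptions for the example; a real image descriptor would pair a feature extractor with its own similarity function.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_query(query, catalog, k):
    """Return the handles of the k images closest to the query vector."""
    ranked = sorted(catalog, key=lambda handle: distance(query, catalog[handle]))
    return ranked[:k]


def range_query(query, catalog, radius):
    """Return the handles of all images within `radius` of the query vector."""
    return sorted(h for h, vec in catalog.items() if distance(query, vec) <= radius)


# Toy metadata catalog: image handle -> feature vector.
catalog = {
    "img1": (0.0, 0.0),
    "img2": (1.0, 0.0),
    "img3": (3.0, 4.0),
}

print(knn_query((0.0, 0.0), catalog, 2))
print(range_query((0.0, 0.0), catalog, 1.0))
```

A k-nearest-neighbor query always returns k results (if the collection is large enough), while a range query may return none or all, depending on the radius.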

116 Summary 5S and Generating DLs –5S Framework –5S definitions, services taxonomy, ontology –5SL –5SGraph –5SGen (and DL development) –DL development of union DL –5SGen into DSpace 5S Metamodels –Minimal DL –Archaeology DL –Multimedia (CBIR) DL –Union DL –Practical DL, superimposed information, personal DL, …

117 NSF Workshop on DL Future, Chatham, MA

118 People Digital librarians DL system developers DL system administrators DL managers DL collection development staff DL evaluators DL users

119

120 Living In the KnowlEdge Society (LIKES) Grant: NSF , CPATH Proposal: for VT Pathways (themed version of core curric.) PI: Edward A. Fox

121 Purpose Graduates from colleges & universities should be prepared to live in and contribute to the Knowledge Society emerging in the 21st century. Computing/LIS education can be revitalized: if the LIKES theme spreads in programs (so graduates can help build the Knowledge Society); if faculty collaborate (both in education and research endeavors) with colleagues globally who are interested in LIKES.

122 Living In the KnowlEdge Society (LIKES): Core surrounded by enabling concepts, problem providing disciplines

123 Objectives – 1 of 3 Enhance education in the discipline: –New courses: Living in the Global Knowledge Society, Knowledge Management –Enhanced courses to be more driven by the LIKES theme: Artificial Intelligence, Data Mining, Digital Libraries, Multimedia/Hypertext/Information Access, …

124 Objectives – 2 of 3 Give special attention, inside the discipline and across disciplines: to the areas of data, information, and knowledge; to key concepts and methods, such as: representation/views, search/discovery, inference/decisions, comparison/matching, complexity/heuristics, analysis/mining, integration/mapping, modeling/simulation

125 Objectives – 3 of 3 Engage researchers and teachers and students in the Knowledge Society’s problems, as motivation, orientation, and to help with solutions, e.g., –Shifting toward digital government, including statutes, rules, regulations, and procedures; –Handling attacks, including spam and viruses; –Ensuring quality even with disinformation, through knowledge sourcing, provenance, and sharing of community expertise; –Ensuring changes through education that is cross-disciplinary, globally contextualized, and based on awareness of human development, learning theory, and cognitive psychology

126 Potential Course Areas/Courses Personal Knowledge Management –Computer Science and Information Systems, e.g., multi-media, process design and evaluation, and Human-Computer / Human-Information interaction. –Psychology, e.g., knowledge organization principles, human cognitive processes. –Industrial Systems Engineering, e.g., Ergonomic factors of knowledge environments. –Ethics, e.g., ethical issues of information disclosure. Communication and Collaboration –Communications, e.g., Communication using digital visualizations, using knowledge access in constructing digital messages. –Information Systems and Computer Science, e.g., computer supported cooperative work and group support systems. –Marketing, e.g., influence of knowledge presentation on on-line customer behavior. Organization –Information Systems, e.g., service innovation and development, system design and development. –Management Science, e.g., decision support systems concepts, capabilities, techniques, and tools. –Management, Marketing, Accounting, and Finance, e.g., business in the information age. Society –Sociology, e.g., impact of knowledge differentials across society and countries. –Political Science, e.g., governmental collection and use of knowledge, impact of technology on elections and government.

127 DL Curriculum Project (NSF supporting VT, UNC-CH) Identify, develop and test educational DL modules, guided by - Experts, international collaborators - Computing Curriculum - 5S framework - Analysis of DL course syllabi …

128 CC2001 Information Management Areas
IM1. Information models and systems*
IM2. Database systems*
IM3. Data modeling*
IM4. Relational DBs
IM5. Database query languages
IM6. Relational DB design
IM7. Transaction processing
IM8. Distributed DBs
IM9. Physical DB design
IM10. Data mining
IM11. Information storage and retrieval
IM12. Hypertext and hypermedia
IM13. Multimedia information & systems
IM14. Digital libraries

129 Why Modular Design Flexibility, e.g., for ETD programs: –Self-study by NDLTD trainers –Self-study by ETD authors –Short courses by NDLTD trainers of ETD authors –A course based on a single module –Course sequence (program) from multiple modules –Plug in modules into an existing course (enhancement) Module 1. Overview + Module 10. DL Education & Research

130 Modules 1. Collection Development 2. Digital objects / Composites / Packages 3. Metadata, Cataloging, Author submission 4. Architecture, Interoperability 5. Data visualization 6. Services 7. Intellectual property rights management, Privacy, Protection 8. Social issues / Future of DLs 9. Archiving and Preservation

131 Ascertaining Priority Topics We’ve manually classified and analyzed publications using 9 Modules:
Source: Count
Proceedings, JCDL ’01 – ’05: 354
Proceedings, ACM DL ’96 – ’00: 189
Magazine articles, D-Lib ’95 – ’06: 521
Session titles, JCDL, ACM DL, ECDL: 264

132 Conference papers x modules

133 Analysis Results: Total of 543 proceedings papers (354 JCDL + 189 ACM DL): Most popular topics were architecture (module 4) and services (module 6)

134 Distribution of D-Lib Magazine Articles across Module Topics

135 Analysis Results: Total of 521 articles: Most popular topics were architecture (module 4), services (module 6) and social issues (module 8)

136 Distribution of Session Titles across Module Topics

137 Analysis Results: Total of 264 session titles (JCDL, ECDL, ICADL): Most popular topic was services (module 6) followed by architecture (module 4)

138 Fox & Gonçalves Book Outline Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” –Ch. 2: Streams –Ch. 3: Structures –Ch. 4: Spaces –Ch. 5: Scenarios –Ch. 6: Societies

139 Textbook Outline (2) Part 2 – Higher DL Constructs –Ch. 7: Collections –Ch. 8: Catalogs –Ch. 9: Repositories and Archives –Ch. 10: Services –Ch. 11: Systems –Ch. 12: Case Studies

140 Textbook Outline (3) Part 3 – Advanced Topics –Ch. 13: Quality –Ch. 14: Integration –Ch. 15: How to build a digital library –Ch. 16: Research Challenges, Future Perspectives Appendix –A: Mathematical preliminaries –B: Formal Definitions: Ss –C: Formal Definitions: DL terms, Minimal DL –D: Formal Definitions: Archeological DL –E: Glossary of terms, mappings

141 Pointers and Summary IR -> DL Education: CSTC, CITIDEL, NSDL, NDLTD, LIKES, DLcurric