1 Digging into Digital Libraries: From Archaeology to Formalism Edward A. Fox Virginia Tech, Dept. of CS CSC Spring Colloquium Villanova – February.

Presentation transcript:

1 Digging into Digital Libraries: From Archaeology to Formalism Edward A. Fox Virginia Tech, Dept. of CS CSC Spring Colloquium Villanova – February 20, 2006

2 Acknowledgements (selected) 5S Helpers: Weiguo Fan, Marcos Gonçalves, Doug Gorton, Rohit Kelapure, Neill Kipp, Uma Murthy, Ananth Raghavan, Rao Shen, Hussein Suleman, Srinivas Vemuri, Layne Watson, … Sponsors: ACM, AOL, CAPES, DFG, IBM, Microsoft, NSF (IIS, ITR, and DUE grants), SUN

3 Outline: WWW and Digital Libraries (DLs); Minimal DLs; Powerful DLs; Why; How; Summary and Conclusions

4 WWW and DLs Both emerged in the early 1990s. Convergence began around … Example: Google spun off from Stanford DL. Crawling the WWW is one way to build DLs. The WWW supports many portals to DLs. Parts of the WWW that have catalogs (e.g., Yahoo categories) are close to DLs. Web Services help move the WWW toward DLs, as the Semantic Web emerges.

5 Degree of Structure: Chaotic (Web); Organized (DLs); Structured (DBs)

6 NSDL Information Architecture (essentially as developed by the Technical Infrastructure Workgroup). Figure labels: User Interfaces; Usage Enhancement; Collection Building; portals & clients; CI services (annotation, discussion, personalization, authentication, browsing); NSDL services and other NSDL services; core services (information retrieval, metadata gathering); core collection-building services (harvesting, protocols); NSDL collections; special databases; referenced items & collections; core NSDL “bus”

10 Outline: WWW and Digital Libraries (DLs); Minimal DLs (definitions, ETANA example); Powerful DLs; Why; How; Summary and Conclusions

11 Minimal Digital Libraries: key concepts, core ideas; minimalist perspective; underlying concepts: 5S (ETANA example); higher DL constructs. Bases: literature; informal explanations; formal definitions

12 Informal 5S & DL Definitions. DLs are complex systems that: help satisfy info needs of users (societies); provide info services (scenarios); organize info in usable ways (structures); present info in usable ways (spaces); communicate info with users (streams)

13 5Ss
Ss | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content, such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata | Specifies organizational aspects of the DL content
Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, who use those services; and relationships among them

14 Example of 5Ss: ETANA-DL Archaeological DL (Electronic Tools for Ancient Near Eastern Archaeology Digital Library) Integrated DL –Heterogeneous data handling Applies and extends the OAI-PMH –Open Archives Initiative Protocol for Metadata Harvesting Design considerations –Componentized –Extensible –Portable –Work based on 5S framework

16 ETANA Societies
1. Historic and pre-historic societies (being studied)
2. Archaeologists (in academic institutes, fieldwork settings, or local and national governmental bodies)
3. Project directors
4. Technical staff (consisting of photographers, technical illustrators, and their assistants)
5. Field staff (responsible for the actual work of excavation)
6. Camp staff (e.g., camp managers, registrars, tool stewards)
7. General public (e.g., educators, learners, citizens)

17 ETANA Societies – cont’d
Social issues
1. Who owns the finds?
2. Where should they be preserved?
3. What nationality and ethnicity do they represent?
4. Who has publication rights?
5. What interactions took place between those at the site studied, and others? What theories are proposed by whom about this?

18 ETANA Scenarios
1. Life in the site in former times
2. Digital recording: the planning stage and the excavation stage
3. Planning stage: remote sensing, fieldwalking, field surveys, building surveys, consulting historical and other documentary sources, and managing the sites and monuments
4. Excavation
   1. Detailed information is recorded, including for each layer of soil, and for features such as pole holes, pits, and ditches.
   2. Data about each artifact is recorded together with information about its exact find spot.
   3. Numerous environmental and other samples are taken for laboratory analysis, and the location and purpose of each is carefully recorded.
   4. Large numbers of photographs are taken, both general views of the progress of excavation and detailed shots showing the contexts of finds.
5. Organization and storage of material
6. Analysis and hypotheses generation and testing
7. Publications, museum displays
8. Information services for the general public

19 ETANA Spaces
1. Geographic distribution of found artifacts
2. Temporal dimension (as inferred by archaeologists)
3. Metric or vector spaces
   1. used to support retrieval operations, and to calculate distance (and similarity)
   2. used to browse / constrain searches spatially
4. 3D models of the past, used to reconstruct and visualize archaeological ruins
5. 2D interfaces for human-computer interaction
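As an illustration of item 3 above, here is a minimal sketch of how a vector space can support retrieval by similarity. Cosine similarity is used as one common choice; the slides do not say which measure ETANA-DL uses. The vocabulary, the locus names, and the term counts are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical term-count vectors over the vocabulary ["bone", "seed", "pottery"].
query = [1, 0, 1]
records = {"locus-A": [2, 0, 3], "locus-B": [0, 4, 0]}

# Rank records from most to least similar to the query.
ranked = sorted(records, key=lambda k: cosine_similarity(query, records[k]), reverse=True)
```

Here `ranked` lists the record that shares terms with the query first; a record with no terms in common scores zero.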

20 ETANA Structures
1. Site organization
   1. Region, site, partition, sub-partition, locus, …
2. Temporal orderings (ages, periods)
3. Taxonomies
   1. for bones, seeds, building materials, …
4. Stratigraphic relationships
   1. above, beneath, coexistent
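The stratigraphic relationships in item 4 can be treated as a directed graph. The sketch below is an illustration only, not ETANA-DL code: it assumes that a locus lying beneath another was deposited earlier, ignores the "coexistent" relation, and uses hypothetical locus names. It relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical observations: each pair (upper, lower) says `upper` lies above `lower`.
above = [("L1", "L2"), ("L2", "L4"), ("L3", "L4")]

# TopologicalSorter takes a mapping node -> predecessors.
# Assumption: the lower locus is older, so it is a predecessor of the upper one.
graph = {}
for upper, lower in above:
    graph.setdefault(upper, set()).add(lower)
    graph.setdefault(lower, set())

# One deposition order (oldest first) consistent with all observations.
deposition_order = list(TopologicalSorter(graph).static_order())
```

Loci that are not related by any chain of "above" observations (here L3 versus L1 and L2) may appear in either relative order, which reflects that the relation is only a partial order.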

21 ETANA Streams
1. Successive photos and drawings of excavation sites, loci, unearthed artifacts
2. Audio and video recordings of excavation activities and discussions
3. Textual reports
4. 3D models used to reconstruct and visualize archaeological ruins

22 5S and DL formal definitions and compositions (April 2004 TOIS)

23 A Minimal DL in the 5S Framework (figure). Base concepts: Streams, Structures, Spaces, Scenarios, Societies. Constructs shown: structured stream; structural metadata specification; descriptive metadata specification; digital object; collection; metadata catalog; repository; hypertext; services (indexing, browsing, searching); minimal DL
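To make the constructs on this slide concrete, here is a small sketch of a minimal DL as a collection of digital objects, a metadata catalog, and indexing, searching, and browsing services. It is a loose illustration under simplifying assumptions, not the formal 5S definitions: each object carries a single text stream, and the class names, method names, handles, and metadata fields are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DigitalObject:
    handle: str
    text: str  # a single textual stream, for simplicity

@dataclass
class MinimalDL:
    collection: dict = field(default_factory=dict)  # handle -> DigitalObject
    catalog: dict = field(default_factory=dict)     # handle -> descriptive metadata
    index: dict = field(default_factory=dict)       # term -> set of handles

    def add(self, obj, metadata):
        """Store the object, record its metadata, and index its text."""
        self.collection[obj.handle] = obj
        self.catalog[obj.handle] = metadata
        for term in obj.text.lower().split():       # indexing service
            self.index.setdefault(term, set()).add(obj.handle)

    def search(self, term):
        """Searching service: handles of objects whose text contains the term."""
        return sorted(self.index.get(term.lower(), set()))

    def browse(self, key):
        """Browsing service: group handles by the value of one metadata field."""
        groups = {}
        for handle, metadata in self.catalog.items():
            groups.setdefault(metadata.get(key), []).append(handle)
        return groups

dl = MinimalDL()
dl.add(DigitalObject("obj1", "Iron Age pottery"), {"site": "Megiddo"})
dl.add(DigitalObject("obj2", "Bronze Age seeds"), {"site": "Nimrin"})
```

With these two objects, `dl.search("age")` finds both handles, and `dl.browse("site")` groups them by site.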

25 Outline: WWW and Digital Libraries (DLs); Minimal DLs; Powerful DLs (services, ontology); Why; How; Summary and Conclusions

27 Ontology: Applications

28 Ontology: Applications. Expand the definition of a minimal DL by characterizing typical DL services in the context of “employs” and “produces” relationships. Use the characterization to reason about how DL services can be built from other DL components, and how they can be composed with other services through extension or reuse.

29 Composition of key fundamental / infrastructure services

31 Outline: WWW and Digital Libraries (DLs); Minimal DLs; Powerful DLs; Why (support DL education; practical systems; institutional repositories (DSpace); personal DLs (SenseCam -> Memex); support archaeology); How; Summary and Conclusions

32 DL Curriculum Framework

33 Foundations for Information Systems: Digital Libraries and the 5S Framework Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” Part 2 – Higher DL Constructs Part 3 – Advanced Topics Appendix

34 Book Parts and Chapters - 1 Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” –Ch. 2: Streams –Ch. 3: Structures –Ch. 4: Spaces –Ch. 5: Scenarios –Ch. 6: Societies

35 Book Parts and Chapters - 2 Part 2 – Higher DL Constructs –Ch. 7: Collections –Ch. 8: Catalogs –Ch. 9: Repositories and Archives –Ch. 10: Services –Ch. 11: Systems –Ch. 12: Case Studies

36 Book Parts and Chapters - 3 Part 3 – Advanced Topics –Ch. 13: Quality –Ch. 14: Integration –Ch. 15: How to build a digital library –Ch. 16: Research Challenges, Future Perspectives Appendix –A: Mathematical preliminaries –B: Formal Definitions: Ss –C: Formal Definitions: DL terms, Minimal DL –D: Formal Definitions: Archaeological DL –E: Glossary of terms, mappings

37 Practical Systems. Commercial: IBM, VTLS, … Open source: Greenstone; CWIS (for NSDL); institutional repositories (DSpace, Fedora)

38 Institutional Repositories “A university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution.” Lynch, C.A. In ARL Bimonthly Report 226, pp. 1-7, Feb. 2003

40 ETANA-DL Global Architecture (figure). Sources: DigBase and DigKit; sites Lahav, Nimrin, Umayri, Hisban, Megiddo, Jalul, and new sites. Database wrappers connect the sources to the ETANA-DL union catalog. User interface services: Search, Browse, Recommend, Note, Personalize, Review, Visualizations, Archaeology Specific. Work in progress …

41 Megiddo Opening Screen

42 Locus Screen: Pictures View all

43 Area Screen

44 Global DL: Architecture of a Union DL (figure). Components: DL1 and DL2, with repositories (Repository1, Repository2), catalogs (Catalog1, Catalog2), services (Searching, Browsing), and societies (archaeologists, general public); Union DL, with Union Repository, Union Catalog, Union Service (Harvesting, Mapping, Searching, Browsing, Clustering, Visualization), and Union Society (archaeologists, general public)

45 Outline: WWW and Digital Libraries (DLs); Minimal DLs; Powerful DLs; Why; How (components; metamodels, models; graphical model building aids; DL generators; integration; quality); Summary and Conclusions

46 Componentized digital library (figure). Labels: Program, Document, Image, Video, with question marks between the components

47 The World According to OAI: Open Archives Initiative – Protocol for Metadata Harvesting (figure). Labels: service providers (discovery, current awareness, preservation); data providers; metadata harvesting
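A minimal sketch of the harvesting step, from a service provider's point of view, is given below. It builds an OAI-PMH `ListRecords` request URL and parses a response using only the Python standard library. The base URL, the record identifier, and the embedded sample response are hypothetical, and the sketch makes no network call; a real harvester would also need to handle resumption tokens and error responses.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)

def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs

# Hypothetical response, embedded so the example runs offline.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai")
harvested = parse_records(SAMPLE_RESPONSE)
```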

50 Metamodels. Completed: minimal; archaeological. Planned: practical; system oriented (Doug Gorton’s thesis), so that people can build models for their systems and have them generated to work with a particular DL system

51 A Minimal DL in the 5S Framework (figure). Base concepts: Streams, Structures, Spaces, Scenarios, Societies. Constructs shown: structured stream; structural metadata specification; descriptive metadata specification; digital object; collection; metadata catalog; repository; hypertext; services (indexing, browsing, searching); minimal DL

52 5SL – The Minimal DL Metamodel

53 A Minimal ArchDL in the 5S Framework (figure). Base concepts: Streams, Structures, Spaces, Scenarios, Societies. Constructs shown: structured stream; hypertext; services (indexing, browsing, searching); SpaTemOrg; StraDia; descriptive metadata specification; Arch descriptive metadata specification; ArchObj; ArchDO; ArchColl; ArchDColl; Arch metadata catalog; ArchDR; minimal ArchDL

55 Overview of 5SGraph: workspace (instance model); structured toolbox (metamodel)

56 Tools/Applications

57 5SGen – Version 2: ODL, Services, Scenarios

58 XML-based DL Log Standard. Log analysis is a source of information on how patrons really use DL services and how systems behave while supporting user information-seeking activities. It is used to evaluate and enhance services and to guide allocation of resources. It is common practice in the web setting, supported by web servers and proxy caches. DL logging can be more detailed.

59 The XML Log Format (figure). Element names shown: Log, SessionId, MachineInfo, Statement, Transaction, Timestamp, SessionInfo, RegisterInfo, Event, Action, Search, Browse, Store, SysInfo, Update, SearchBy, QueryString, Catalog, Collection, PresentationInfo, StatusInfo, Timeout
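The sketch below shows how a single search event might be written using element names from this slide. The nesting of the elements is an assumption made for illustration; the transcript does not preserve the hierarchy of the actual log format, so this should not be read as the standard's schema. The function name and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

def search_event(session_id, timestamp, query, collection):
    """Build a Log element recording one search (assumed nesting)."""
    log = ET.Element("Log")
    transaction = ET.SubElement(log, "Transaction")
    ET.SubElement(transaction, "SessionId").text = session_id
    ET.SubElement(transaction, "Timestamp").text = timestamp
    statement = ET.SubElement(transaction, "Statement")
    event = ET.SubElement(statement, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    ET.SubElement(search, "Collection").text = collection
    search_by = ET.SubElement(search, "SearchBy")
    ET.SubElement(search_by, "QueryString").text = query
    return log

entry = search_event("s-001", "2006-02-20T10:15:00", "pottery", "Megiddo")
xml_text = ET.tostring(entry, encoding="unicode")
```

Because each event is structured rather than a flat text line, later analysis can select, for example, all `QueryString` values for one collection without custom parsing.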

60 DL Integration. What is “DL integration”? –Hide distribution –Hide heterogeneity –Enable autonomy of individual components. Why integration? –Island DLs –Inability to seamlessly and transparently access knowledge across DLs. Utilize various autonomous DLs in concert.

61 Formal Definition of DL Integration. $DL_i = (R_i, DM_i, Serv_i, Soc_i)$, $1 \le i \le n$, where –$R_i$ is a network accessible repository –$DM_i$ is a set of metadata catalogs for all collections –$Serv_i$ is a set of services –$Soc_i$ is a society. Union components: UnionRep, UnionCat, UnionServices, UnionSociety

62 Formal Definition of DL Integration (Cont.) DL integration problem definition: Given n individual libraries, integrate the n DLs to create a UnionDL.
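As a rough illustration of the problem statement, the sketch below represents each DL as a 4-tuple of sets and builds a union DL by taking component-wise set unions. This is a deliberate simplification and an assumption of this example: the slides do not give the formal definitions of the union components, and real integration also involves mapping and harvesting. All names and sample values are hypothetical.

```python
from typing import NamedTuple

class DL(NamedTuple):
    repository: frozenset  # stands in for R_i: identifiers of stored objects
    catalogs: frozenset    # stands in for DM_i: metadata catalogs
    services: frozenset    # stands in for Serv_i
    society: frozenset     # stands in for Soc_i

def integrate(dls):
    """Combine n DLs into one by component-wise union (simplifying assumption)."""
    dls = list(dls)
    return DL(
        repository=frozenset().union(*(d.repository for d in dls)),
        catalogs=frozenset().union(*(d.catalogs for d in dls)),
        services=frozenset().union(*(d.services for d in dls)),
        society=frozenset().union(*(d.society for d in dls)),
    )

dl1 = DL(frozenset({"obj-1"}), frozenset({"Catalog1"}),
         frozenset({"searching"}), frozenset({"archaeologists"}))
dl2 = DL(frozenset({"obj-2"}), frozenset({"Catalog2"}),
         frozenset({"browsing"}), frozenset({"general public"}))
union_dl = integrate([dl1, dl2])
```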

63 ETANA-DL Approach. Applying and extending digital library (DL) techniques to solve key problems: making primary data available, data preservation, and interoperability. Modeling archaeological information systems using 5S to better understand the domain and design the system and the supporting services. Rapidly prototyping DLs that handle heterogeneous archaeological data using componentized frameworks: eliciting requirements; refining metamodel and union schema; modeling sites; mapping; harvesting; providing useful services.

64 Example of Union Service: CitiViz

65 Union Catalog Integration (figure). Sites: Virtual Nimrin (VN) and Halif DigMaster (HD), each with its own metadata format and catalog (VN Metadata Format, VN Catalog; HD Metadata Format, HD Catalog). For each site, a mapping tool and a wrapper lead to the Global Metadata Format and the Union Catalog of the Union ArchDL.

66 Local schema and global schema (figure)
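The idea of mapping a local schema to a global schema can be sketched as a field-renaming wrapper, as below. This illustrates the general technique only, not the ETANA-DL mapping tool; all field names and sample records are hypothetical, and fields with no mapping are simply dropped in this sketch.

```python
def make_wrapper(field_map):
    """Return a function that renames local-schema fields to global-schema fields."""
    def wrap(local_record):
        return {
            global_field: local_record[local_field]
            for local_field, global_field in field_map.items()
            if local_field in local_record
        }
    return wrap

# Hypothetical mappings for two sites with different local schemas.
vn_to_global = {"LocusNo": "locus", "Desc": "description"}
hd_to_global = {"locus_id": "locus", "notes": "description"}

# Records from both sites end up in one catalog with a shared schema.
union_catalog = [
    make_wrapper(vn_to_global)({"LocusNo": "N-12", "Desc": "pit"}),
    make_wrapper(hd_to_global)({"locus_id": "H-7", "notes": "wall", "photo": "h7.jpg"}),
]
```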

67 Describing Quality in Digital Libraries. What’s a “good” digital library? –Central concept: quality! –Hypothesis of this work: formal theory can help to define “what’s a good digital library” by: new formalizations of quality indicators for DLs within our 5S framework; contextualizing these measures within the Information Life Cycle

68 Quality Dimensions

69 Quality and the Information Life Cycle

70 Summary and Conclusions WWW and Digital Libraries (DLs) Minimal DLs Powerful DLs Why How -> Theory-based discipline and high quality DL management systems (DLMS)

71 Selected Links: CITIDEL (computing education resources); NCSTRL (computing technical reports); NDLTD (electronic theses and dissertations worldwide) and etdguide.org; NSDL (National Science Digital Library); OAI (Open Archives Initiative); Virginia Tech Digital Library Research Laboratory (DLRL): 5S, AmericanSouth.Org, CSTC, DL-in-a-box, ENVISION, ETANA, MARIAN, NDLTD, NSDL, OAD, ODL, …

72 Questions? Discussion? Thank You!