Published by Felix Dawson. Modified over 9 years ago.
1
DL.Org (Digital Library Interoperability, Best Practices and Modeling Foundations)
Functionality Working Group Meeting, 29-30 June 2009, Athens
“Functionality modeling and functionality interoperability, Session 1”
Functionality and Interoperability with 5S
by Edward A. Fox
fox@vt.edu
http://fox.cs.vt.edu
Dept. of Computer Science, Virginia Tech, Blacksburg, VA 24061 USA
2
Acknowledgements
Mentors (Licklider, Kessler, Salton)
Virginia Tech, CS, Digital Library Research Laboratory
NSF and other sponsors, e.g., grants
–DUE-0840719, CCF-0722259, IIS-0535057, IIS-0325579
Students, colleagues, co-investigators
–Robert France, Marcos André Gonçalves, Doug Gorton, Yi Ma, Uma Murthy, Rao Shen, Hussein Suleman, Ricardo da Silva Torres, ...
–Barbara Wildemuth, Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang
3
Theses and Dissertations
Douglas Gorton, "Practical Digital Library Generation into DSpace with the 5S Framework", April 2007, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-04252007-161736/
Rao Shen, "Applying the 5S Framework To Integrating Digital Libraries", April 2006, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-04212006-135018/
Ananth Raghavan, "Schema Mapper: A Visualization Tool for Incremental Semi-automatic Mapping-based Integration of Heterogeneous Collections into Archaeological Digital Libraries: The ETANA-DL Case Study", May 2005, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-05182005-114155/
Marcos Andre Goncalves, "Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications", Nov. 2004, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-12052004-135923/
Rohit Dilip Kelapure, "Scenario-Based Generation of Digital Library Services", June 2003, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-06182003-055012/
Hussein Suleman, "Open Digital Libraries", Nov. 2002, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-11222002-155624/
Qinwei Zhu, "5SGraph: A Modeling Tool for Digital Libraries", Nov. 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-11272002-210531/
Jun Wang, "VIDI: A Lightweight Protocol Between Visualization Systems and Digital Libraries", May 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-07012002-145841/
4
Other Selected References
Marcos Andre Goncalves, Robert K. France, Edward A. Fox. MARIAN: Flexible Interoperability for Federated Digital Libraries. ECDL 2001, 173-186
Hussein Suleman, Edward A. Fox. The Open Archives Initiative: Realizing Simple and Effective Digital Library Interoperability. J. Library Administration, 35(1/2): 125-145, 2002
Marcos Andre Goncalves, Edward A. Fox. 5SL - A Language for Declarative Specification and Generation of Digital Libraries. JCDL 2002, 263-272
Marcos Andre Goncalves, Ming Luo, Rao Shen, Mir Farooq Ali, Edward A. Fox. An XML Log Standard and Tool for Digital Library Logging Analysis. ECDL 2002, 129-143
Marcos Andre Goncalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox, Filip Jagodzinski, Lillian Cassel. The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment. JCDL 2003, 312-314
Hussein Suleman, Edward A. Fox, Rohit Kelapure, Aaron Krowne, Ming Luo. Building Digital Libraries from Simple Building Blocks. Online Information Review, 27(5): 301-310, 2003
M. Goncalves, E. Fox, L. Watson, N. Kipp. Streams, Structures, Spaces, Scenarios, Societies (5S): A Formal Model for Digital Libraries. TOIS, 22(2): 270-312, 2004
Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Ricardo da S. Torres, Edward A. Fox. Exploring Digital Libraries: Integrating Browsing, Searching, and Visualization. JCDL 2006, 1-10
Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. What is a Successful Digital Library? ECDL 2006, 208-219
5
Other Selected References - 2
Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang, Edward A. Fox, Barbara M. Wildemuth. The Core: Digital Library Education in Library and Information Science Programs. D-Lib Magazine, 12(11), Nov. 2006
Marcos Andre Goncalves, Barbara L. Moreira, Edward A. Fox, Layne T. Watson. "What is a good digital library?" - A quality model for digital libraries. Information Processing and Management, 43(5): 1416-1437, 2007
Uma Murthy, Douglas Gorton, Ricardo Torres, Marcos Goncalves, Edward Fox, Lois Delcambre. Extending the 5S Digital Library (DL) Framework: From a Minimal DL towards a DL Reference Model. JCDL 2007 Workshop on Digital Library Foundations
Barbara L. Moreira, Marcos A. Goncalves, Alberto H. F. Laender, Edward A. Fox. Evaluating Digital Libraries with 5SQual. ECDL 2007, 466-470
Yi Ma, Edward A. Fox, Marcos A. Goncalves. Personal Digital Library: PIM upon 5S Framework. CIKM 2007 Workshop: PIKM07, Lisbon, Nov. 2007, 117-124
Marcos Andre Goncalves, Edward A. Fox, Layne T. Watson. Towards a Digital Library Theory: A Formal Digital Library Ontology. Int. J. Digital Libraries, 8(2): 91-114, 2008
Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. Integration of Complex Archaeology Digital Libraries: An ETANA-DL Experience. Information Systems, 33(7-8): 699-723, 2008
Barbara L. Moreira, Marcos Andre Goncalves, Alberto H. F. Laender, Edward A. Fox. Automatic Evaluation of Digital Libraries with 5SQual. J. Informetrics, 3(2): 102-123, 2009
6
Outline
Contextual Background
–DL Definitions, Scope
–DL Curricula Efforts
–Interoperability Approaches
5S
5S Services Work
International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
Discussion Topics
7
DL Definitions
Issues and Spectra
–Collection vs. Institution
–Content vs. System
–Access vs. Preservation
–“Free” vs. Quality
–Managed vs. Comprehensive
–Centralized vs. Distributed
8
Information Life Cycle
Borgman et al.: Workshop Report on Social Aspects of Digital Libraries: http://www-lis.gseis.ucla.edu/DL/
9
Information Life Cycle (diagram)
Labels shown around the cycle: Creating, Authoring, Modifying, Organizing, Indexing, Storing, Retrieving, Distributing, Networking, Retention / Mining, Accessing, Filtering, Using
10
Digital Libraries Shorten the Chain from
Editor, Publisher, A&I, Consolidator, Library, Reviewer
11
DLs Shorten the Chain to
Author, Reader, Digital Library, Editor, Reviewer, Teacher, Learner, Librarian
12
DL Curric. Project
NSF awards to VT and UNC-CH
CS and LIS
http://curric.dlib.vt.edu/
http://curric.dlib.vt.edu/wiki/index.php/Main_Page
http://curric.dlib.vt.edu/modDev/modDev.html
13
DL Curriculum Framework
14
DL Curric. Modules - 1
Module 1-b: History of digital libraries and library automation
Module 2-c: File Formats, Transformation, and Migration
Module 3-b: Digitization
Module 4-b: Metadata
Module 5-a: Architecture overviews
15
DL Curric. Modules - 2
Module 5-b: Application software
Module 5-d: Protocols
Module 6-a: Information needs/relevance
Module 6-b: Online information seeking behaviors and search strategies
Module 6-d: Interaction design and usability assessment
16
DL Curric. Modules - 3
Module 7-b: Reference Services
Module 7-g: Personalization
Module 8-b: Web Archiving
Module 9-c: Digital library evaluation, user studies
17
Interoperability Approaches
Browsers (Mosaic)
Federation
–Heterogeneous, Homogeneous
Protocols (OAI-PMH)
Repositories
Content Standards (XML), Mapping
Integration (ETANA)
Services (Superimposed Information)
18
Integration: Challenges
“Semantic Web” is a vision, not reality.
How can we integrate without a theory?
How can we interoperate without a common framework?
How can we have a science of DLs if we lack agreement on definitions (so we can reason and discuss) and measures of quality (so we can compare and improve)?
19
Informal 5S & DL Definitions
DLs are complex systems that
–help satisfy info needs of users (societies)
–provide info services (scenarios)
–organize info in usable ways (structures)
–present info in usable ways (spaces)
–communicate info with users (streams)
20
5S Layers
Societies, Scenarios, Spaces, Structures, Streams
21
5Ss
Ss | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata | Specifies organizational aspects of the DL content
Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, that use those services; and relationships among them
22
5S Overview
5S and Generating DLs
–5S Framework
–5S definitions, services taxonomy, ontology
–5SL
–5SGraph
–5SGen (and DL development)
–DL development of union DL, DL integration
–5SGen into DSpace
5S Metamodels
–Minimal DL
–Archaeology DL
–CBIR DL
–Union DL
23
Streams
24
Structure (Degrees, Terminology)
Chaotic: Web
Organized: DLs
Structured: DBs
25
Digital Objects (DOs)
Born digital
Digitized version of “real” object
–Is the DO version the same, better, or worse?
–Decision for ETDs: structured + rendered
Surrogate for “real” object
–Not covered explicitly in metamodel for a minimal DL
–Crucial in metamodel for archaeology DL
26
Databases
5S perspective: structures, streams, scenarios
Extending database technology
Structured and unstructured info
Multimedia databases
Link databases
Performance, transaction processing
Replicated storage, rollback/recovery
27
Spaces
User interfaces and visualization
2D interfaces
3D interfaces
GIS
Other paradigms
28
Scenarios
Services (see later)
Scenario based design, use cases
Functionality
Representation and processing for humans and machines
29
Societies
User communities
–Authors, editors, teachers, students, readers
–Personal(ization), group(ware), community, global
–Accessibility, universal access
Librarians: reference, acquisition, operations
Research community
–Associations, conferences, publications, labs, projects
Economics
–Copyright, intellectual property rights, digital rights management, authorization, authentication, security, privacy, self-archiving (eprints)
–Publishers, catalogers, distributors, sustainability
–Open source, commercial, hybrid
30
Higher DL Constructs
Collections
Catalogs
Repositories and Archives
Services
Systems
Case Studies
31
Collections
Terminology: set, “database”
Distributed: basis, efficiency/effectiveness
Parallelism: federation, harvesting
Scale: object size, compression, replication, stream splitting
Intelligence/processing granularity: object, cluster, collection, repository
32
NSDL Collections
Discovery of content
Classification and cataloguing
Acquisition and/or linking; referencing
Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged
Access to massive real-time or archived datasets
Software tool suites for analysis, modeling, simulation, or visualization
Reviewed commentary on learning materials and pedagogy
33
Catalogs
OPACs
Distributed vs. centralized
Coverage, breadth
Specificity, depth
Management: versioning, works
34
Repositories and Archives
Naming, identifiers
Architectures, interoperability
–OAI: harvesting
–SRU/SRW: federation
Preservation, archives
–LOCKSS, UVC, emulation/migration
Scalability, storage
Institutional repositories, Open Access
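To make the "OAI: harvesting" item concrete, here is a minimal sketch of how a harvester might build OAI-PMH `ListRecords` request URLs. The verb and argument names (`verb`, `metadataPrefix`, `set`, `from`, `resumptionToken`) come from the OAI-PMH protocol; the base URL and the function names are hypothetical and are not taken from the slides.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc",
                         set_spec=None, from_date=None):
    """Build the first ListRecords request for a harvest run."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional selective harvesting by set
    if from_date:
        params["from"] = from_date    # optional incremental harvesting
    return f"{base_url}?{urlencode(params)}"


def oai_resume_url(base_url, token):
    """Build a follow-up request; the token replaces all other arguments."""
    params = {"verb": "ListRecords", "resumptionToken": token}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # http://example.org/oai is a placeholder repository address.
    print(oai_list_records_url("http://example.org/oai", from_date="2009-01-01"))
```

A harvester would fetch the first URL, store the returned records, and keep requesting the resume URL until the repository stops returning a resumption token.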
35
Services
NSDL Services
Taxonomy of services
Ontology, composition, reuse
Evaluation
Key services in-depth:
–Crawling, indexing
–Clustering, classifying
–Recommending, using social networks
–Logging
36
NSDL Services
Help services, frequently asked questions, etc.
Synchronous/asynchronous collaborative learning environments using shared resources
Mechanisms for building personal annotated digital information spaces
Reliability testing for applets or other digital learning objects
Audio, image, and video search capability
Metadata system translation
Community feedback mechanisms
38
Services Ontology: Applications
39
Ontology: Applications
Expand definition of minimal DL by characterizing
–typical DL services
–in the context of “employs” and “produces” relationships
Use characterization to:
–Reason about how DL services can be built from other DL components
–As well as be composed with other services through extension or reuse
41
5S and DL formal definitions and compositions (April 2004 TOIS)
43
XML-based DL Log Standard
Log analysis
–Is a source of information on:
  how patrons really use DL services
  how systems behave while supporting user information seeking activities
–Used to:
  evaluate and enhance services
  guide allocation of resources
Common practice in the web setting
–Supported by web servers, proxy caches
DL logging can be more detailed
44
The XML Log Format (diagram)
Element names shown: Log, SessionId, MachineInfo, Statement, Transaction, Timestamp, SessionInfo, RegisterInfo, Event, Action, Search, Browse, Store, SysInfo, Update, SearchBy, QueryString, Catalog, Collection, PresentationInfo, StatusInfo, Timeout
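As an illustration of what a log entry in such a format might look like, the sketch below builds one search event with Python's standard `xml.etree.ElementTree`. The element names are those shown on the slide; the nesting order is an assumption, since the diagram's hierarchy does not survive in this transcript, and the function name and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def make_search_event(session_id, machine_info, timestamp, query):
    """Return a <Log> element holding a single search event.

    Assumed nesting (not confirmed by the slide):
    Log > Transaction > Statement > Event > Action > Search > SearchBy > QueryString
    """
    log = ET.Element("Log")
    transaction = ET.SubElement(log, "Transaction")
    ET.SubElement(transaction, "SessionId").text = session_id
    ET.SubElement(transaction, "MachineInfo").text = machine_info
    ET.SubElement(transaction, "Timestamp").text = timestamp
    statement = ET.SubElement(transaction, "Statement")
    event = ET.SubElement(statement, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    search_by = ET.SubElement(search, "SearchBy")
    ET.SubElement(search_by, "QueryString").text = query
    return log


if __name__ == "__main__":
    entry = make_search_event("s-001", "192.0.2.10", "2009-06-29T10:00:00",
                              "digital libraries")
    print(ET.tostring(entry, encoding="unicode"))
```

Recording the query string and session identifier in structured elements, rather than in a free-text web server log line, is what allows the more detailed analysis the previous slide mentions.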
45
Systems
Architectures
–Client-server, service-oriented
–P2P, Grid
System descriptions and comparisons
–Personal DLs; Institutional to global
–DSpace, Eprints, Fedora, Greenstone, Kepler
ODL
5S Suite: language, visualization, generation, logging
46
Architectural Issues
Independent system vs. part of federation
Centralized vs. distributed vs. open services
Monolithic vs. modular vs. componentized
Topologies: bus vs. star vs. hierarchical vs. network
Decompositions vary
–search engine, browser, DBMS, MM support
–repository, handle server, client
–information resources + mediators, bus or agent collection + client with workspace/environment
47
NSDL Information Architecture (diagram)
Essentially as developed by the Technical Infrastructure Workgroup
Layer labels: User Interfaces; Usage Enhancement; Core NSDL “Bus”; Collection Building
Component labels: Portals & Clients; CI Services (annotation, discussion, personalization, authentication, browsing); Core Services (information retrieval, metadata gathering); Core Collection-Building Services (harvesting, protocols); NSDL Services; Other NSDL Services; NSDL Collections; Special Databases; referenced items & collections
48
5S Modeling -> Systems
49
Tools/Applications
51
5SL: a DL design language
Domain specific languages
–Address a particular class of problems by offering specific abstractions and notations for the domain at hand
–Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping
XML-based realization of 5S
–Interoperability
–Use of many sub-languages (e.g., MIME types, XML Schemas, UML notations)
52
5SL – The Minimal DL Metamodel
53
5SL examples (markup fragments; most tags are lost in this transcript)
Example of Document declaration in the Structures Model
–<stream value='ETDText'> <stream value='ETDAudio'> ... %XMLSchema%
Example of Actors declaration in the Societies Model
–<Attribute name='name' type='String'/> <Attribute name='ID' type='Integer'/>
–Converting, Reviewing, Cataloguing, ...
Example of Service declaration in the Scenario Model
–Simple scenario for an NDLTD site searching service
–Patron, InterfaceManager: collection, query
–InterfaceManager, SearchManager: collection, query
–SearchManager, InterfaceManager: WtdSet
–...
54
5SGraph: A DL Modeling Tool
Helps users model their own instances of a digital library (DL) in the 5S language (5SL)
A simple modeling process that enables rapid generation of digital libraries
Features
–5SGraph loads and displays a metamodel in a structured toolbox.
–The structured editor of 5SGraph provides a top-down visual building environment for the DL designer.
–5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.
55
Overview of 5SGraph (screenshot)
Panels: Workspace (instance model); Structured toolbox (metamodel)
57
5SGen
Version 1 -- MARIAN as the target system
–Focused on rich structures: semantic networks
–Behavior attached to nodes/links
Version 2 -- Shifted for later work to componentized (ODL) approach
–Focused on scenarios/societies
–Structures/Spaces encapsulated within components (e.g., relational tables, indexes)
–Only textual streams supported
Version 3 -- Practical DL (w. DSpace), Doug Gorton
58
5SLGen – Version 2: ODL, Services, Scenarios
59
5SSuite (diagram)
Phases: Requirements (1), Analysis (2), Design (3), Implementation (4)
Roles: DL Expert, DL Designer, Practitioner, Researcher, Teacher
Artifacts and tools: 5S Meta Model, 5SGraph, 5SL DL Model, 5SLGen, 5SGen, Mapping Tool, Tailored DL Services
Component pool: ODLSearch, ODLBrowse, ODLRate, ODLReview, ...
60
Describing Quality in Digital Libraries
What’s a “good” digital library?
–Central Concept: Quality!
–Hypotheses of this work: formal theory can help to define “what’s a good digital library” by:
  new formalizations of quality indicators for DLs within our 5S framework
  contextualizing these measures within the Information Life Cycle
61
Quality and the Information Life Cycle
62
Quality Dimensions
63
Services: Efficiency / Effectiveness
Effectiveness
–Very common measures: Precision, Recall, F1, 10-precision, R-Precision
–Other services may have different measures: e.g., Recommending, etc.
Efficiency
–Let $t(e)$ be the time of an event $e$.
–Let $e_{ix}$ and $e_{fx}$ be the initial and the final event of service $se_x$.
–For service $se_x$, efficiency is defined as: $\mathrm{Efficiency}(se_x) = t(e_{fx}) - t(e_{ix})$
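The measures on this slide can be sketched in a few lines of Python. Precision, recall, and F1 follow their standard information retrieval definitions, and efficiency follows the slide's formula as the time between a service's initial and final events. The function names and the sample document identifiers are illustrative assumptions, not part of 5SQual.

```python
def precision_recall_f1(retrieved, relevant):
    """Standard set-based effectiveness measures for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def service_efficiency(t_initial, t_final):
    """Efficiency(se_x) = t(e_fx) - t(e_ix): elapsed time of one service run."""
    return t_final - t_initial


if __name__ == "__main__":
    p, r, f1 = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
    print(p, r, f1)
    print(service_efficiency(10.0, 12.5))
```

Under this definition a smaller efficiency value means a faster service, so comparisons between services should treat lower as better.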
64
DL Integration
What is “DL Integration”
–Hide distribution
–Hide heterogeneity
–Enable autonomy of individual components
Why Integration
–island-DLs
–inability to seamlessly and transparently access knowledge across DLs
Utilize various autonomous DLs in concert
65
Integration: Urgency, Longevity
If we collect, capture, acquire, or produce information, will it be usable in 100 years?
NSF Digital Archiving Program
Library of Congress National Digital Information Infrastructure and Preservation Program
66
DL interoperability approach (concept map)
Node labels: intermediary-based, mapping-based, mediator, wrapper, agent, federation, union archiving, hybrid mapper, composite mapper, schema mapping, GA, DL integration formalization
Edge labels: consists of, use, use two architectures, used in, interrelated with, trained by, based on
67
Union DL Definitions A Minimal Union Digital Library integrated from n DLs is given as a four-tuple: MinUnionDL=(Union Repository, Union Catalog, Minimal Union Services, Union Society). DL Integration Problem Definition: Given n individual digital libraries (DL1, DL2, …, DLn), each defined as described above, to integrate the n DLs is to create a Union DL.
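The four-tuple above can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: digital objects are reduced to identifiers, catalogs to dictionaries, and the minimal union services are passed in by the caller rather than derived. The class names, field names, and the `integrate` function are hypothetical and are not the formal definitions from the 5S work.

```python
from dataclasses import dataclass


@dataclass
class DigitalLibrary:
    name: str
    repository: set   # identifiers of digital objects (simplification)
    catalog: dict     # object identifier -> metadata record
    society: set      # actors and managers


@dataclass
class MinUnionDL:
    union_repository: set
    union_catalog: dict
    minimal_union_services: set
    union_society: set


def integrate(dls, union_services):
    """Combine n DLs into one MinUnionDL four-tuple.

    Object identifiers are qualified by DL name so that two DLs may
    reuse the same local identifier (an assumption of this sketch).
    """
    repository, catalog, society = set(), {}, set()
    for dl in dls:
        repository |= {(dl.name, oid) for oid in dl.repository}
        for oid, record in dl.catalog.items():
            catalog[(dl.name, oid)] = record
        society |= dl.society
    return MinUnionDL(repository, catalog, set(union_services), society)


if __name__ == "__main__":
    vn = DigitalLibrary("VN", {"o1", "o2"}, {"o1": {"title": "bowl"}}, {"archaeologist"})
    hd = DigitalLibrary("HD", {"o1"}, {"o1": {"title": "jar"}}, {"student"})
    print(integrate([vn, hd], {"searching", "browsing"}))
```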
68
Union Catalog Quality Measurement
Complete
–All the catalogs to be integrated are complete.
Consistent
–All the catalogs to be integrated are consistent.
–Each descriptive metadata specification in the union catalog describes only one digital object.
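These two criteria can be checked mechanically. In the sketch below a catalog is modeled as a list of (metadata specification id, digital object id) pairs. The consistency check follows the slide: no metadata specification may describe more than one object. The completeness check assumes that "complete" means every object in the repository has at least one metadata specification; that reading, and all names used, are assumptions of this example.

```python
def is_consistent(catalog):
    """True if every metadata specification describes exactly one object."""
    described_by = {}
    for spec_id, object_id in catalog:
        # setdefault stores the first object seen for this specification
        if described_by.setdefault(spec_id, object_id) != object_id:
            return False
    return True


def is_complete(catalog, repository):
    """True if every object in the repository has a metadata specification."""
    described = {object_id for _, object_id in catalog}
    return set(repository) <= described


if __name__ == "__main__":
    catalog = [("m1", "o1"), ("m2", "o2"), ("m3", "o2")]
    print(is_consistent(catalog), is_complete(catalog, {"o1", "o2"}))
```

Note that several specifications may describe the same object without breaking consistency; only the reverse direction is ruled out.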
69
Member DLs of ETANA-DL
70
Architecture of ETANA-DL, with centralized catalog and partially decentralized repository
71
Screenshots: Mapping confirmation; Mapping history
72
Union Catalog Integration (diagram)
Member DLs: Virtual Nimrin (VN); Halif DigMaster (HD)
Per-site components: VN Metadata Format, VN Catalog, Mapping Tool, Wrapper; HD Metadata Format, HD Catalog, Mapping Tool, Wrapper
Union ArchDL: Global Metadata Format, Union Catalog
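A rough sketch of the mapping-based idea in this diagram: each site's records are translated into the global metadata format through a per-site field mapping, then merged into one union catalog. All field names and mappings below are invented for illustration; the actual VN, HD, and ETANA-DL metadata formats are not shown in the slides.

```python
# Hypothetical local-to-global field mappings (not the real schemas).
VN_TO_GLOBAL = {"obj_type": "object_type", "locus": "context"}
HD_TO_GLOBAL = {"artifactClass": "object_type", "field": "context"}


def map_record(record, mapping):
    """Translate one local record; unmapped local fields are dropped."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def build_union_catalog(sources):
    """sources: site name -> (list of local records, field mapping)."""
    union = []
    for site, (records, mapping) in sources.items():
        for record in records:
            mapped = map_record(record, mapping)
            mapped["source"] = site   # keep provenance of each record
            union.append(mapped)
    return union


if __name__ == "__main__":
    sources = {
        "VN": ([{"obj_type": "pottery", "locus": "L12"}], VN_TO_GLOBAL),
        "HD": ([{"artifactClass": "bone", "field": "F3", "note": "x"}], HD_TO_GLOBAL),
    }
    for row in build_union_catalog(sources):
        print(row)
```

Keeping the mappings as data rather than code matches the role of a mapping tool: a new member DL can be added by supplying a new mapping, without changing the merge logic.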
73
Generating ETANA-DL with 5SGraph and 5SGen (diagram)
Modeling: 5S Archaeology MetaModel, 5SGraph, ArchDL Expert, ArchDL Designer
Sub-models: Structure Sub-model (VN Metadata Format, HD Metadata Format, ETANA-DL Metadata Format); Scenario Sub-model (ETANA-DL Union Services Descriptions: Harvesting, Mapping, Searching, Browsing, ...)
Generation: 5SGen, Component Pool, Mapping Tool, Wrapper4VN, Wrapper4HD, XOAI
Catalogs: VN Catalog, HD Catalog, Union Catalog
Services: Inverted Files, Services DB, Index, Browse Service, Search Service, Browse DB, Other ETANA-DL Services, Web Interface
74
5S definitional structure
75
Minimal archaeological DL in the 5S framework (A.i is from minimal DL, j is new)
76
Minimal CBIR DL (diagram)
Column headings: Stream, Structure, Space, Service, Society
Concept labels: Image Stream, Feature Vector, Image Descriptor, Structured Feature Vector, Image Content Description, Image Digital Object, Image Object, User Info Need, Image Collection, Visualization Operation, Content-based Image Searching Service, Image Descriptor Metadata Catalog, Composite Descriptor, KNNQ, RQ
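Reading KNNQ and RQ as k-nearest-neighbor query and range query (an interpretation of the abbreviations, which the slide does not expand), the two query types for content-based image searching can be sketched over feature vectors as follows. Euclidean distance stands in for whatever similarity function an image descriptor would define; the function names and sample vectors are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_query(query, collection, k):
    """Return ids of the k images whose feature vectors are closest to query."""
    ranked = sorted(collection.items(), key=lambda item: distance(query, item[1]))
    return [image_id for image_id, _ in ranked[:k]]


def range_query(query, collection, radius):
    """Return ids of all images within the given distance of query."""
    return sorted(image_id for image_id, vector in collection.items()
                  if distance(query, vector) <= radius)


if __name__ == "__main__":
    images = {"img1": (0.0, 0.0), "img2": (3.0, 4.0), "img3": (1.0, 0.0)}
    print(knn_query((0.0, 0.0), images, 2))
    print(range_query((0.0, 0.0), images, 1.0))
```

The k-nearest-neighbor form always returns a fixed number of results, while the range form returns however many images fall inside the radius, which may be none.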
77
DL Ref. Model Concepts - 5S (see II.4.2)
User -> Societies
–Human and machine actors
–End-users, Designers, Administrators, Application Developers + Librarians (DL curric)
Content -> Streams, Structures
Functionality -> Services -> Scenarios
Quality -> Services (recall 5SQual)
Policy -> Scenarios, Societies
Architecture -> Scenarios, Structures, Spaces (components, protocols, standards, specs)
78
International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
How can we strengthen the infrastructure for repositories? Key solvable problems:
–Citation services: making citation data more easily available from repositories
–Repository handshake: talking to each other, user deposit into several at once
–Interoperable identification infrastructure: unambiguous people, documents (FRBR)
79
International Repository Infrastructure Workshop – and DL.org
How are these 2 related?
Can we learn from the Amsterdam meeting and focus on some important and solvable issues immediately?
80
Discussion Topics
Faced in MARIAN, NCSTRL, CITIDEL, Ensemble, NSDL, ETANA
Already solved: OAI-PMH
Focus
–Superimposed information / annotation
–Citation information
Approaches
–5S: 5SL, 5SGen, 5SQual
–XML representations
–Protocols (VIDI)
81
Summary
Contextual Background
–DL Definitions, Scope
–DL Curricula Efforts
–Interoperability Approaches
5S
5S Services Work
International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
Discussion Topics
82
Questions? Discussion? Thank You!