1 Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications Marcos André Gonçalves Doctoral defense.

Presentation transcript:

1 Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications Marcos André Gonçalves Doctoral defense Virginia Tech, Blacksburg, VA USA

2 Acknowledgments Funding: CAPES, NSF, AOL. Collaborators: Pavel Calado, Lillian Cassel, Marco Cristo, Patrick Fan, Ed Fox, Robert France, Filip Jagodzinski, Rohit Kelapure, Neill Kipp, Aaron Krowne, Alberto Laender, Claudia Medeiros, Naren Ramakrishnan, Berthier Ribeiro-Neto, Rao Shen, Hussein Suleman, Ricardo Torres, Layne Watson, Baoping Zhang, Qinwei Zhu, …

3 Publications and Accomplishments Book chapters: 4 published + 1 in press. Journal/magazine papers: 8 published + 1 under revision + 1 accepted. Conference/workshop papers: 25 published. Other publications (poster and demo papers): 4 published. Awards: 3 (Lewis Trustee Award; AOL-CIT Fellowship, Honorable Mention; JCDL'04 Best Student Paper). Helped supervise three Masters students.

4 Outline Motivation: the problem. Hypotheses and research questions. Part 1: Theory (5S: introduction, formal definitions; the formal ontology). Part 2: Tools/Applications (Language; Visualization; Generation; Logging). Part 3: Quality. Conclusions, Future Work.

5 Motivation Digital Libraries (DLs): what are they? There is no definitional consensus, and views conflict, which makes interoperability a hard problem. DLs are not benefiting from formal theories as other CS fields (DB, IR, PL, etc.) are. DL construction is difficult and ad hoc, with little support for tailoring/customization. Conceptual modeling, requirements analysis, and methodological approaches are rarely supported in DL development. Specific DL models, formalisms, and languages are lacking.

6 Hypotheses A formal theory for DLs can be built based on 5S. The formalization can serve as a basis for modeling and building high-quality DLs.

7 Research Questions 1. Can we formally elaborate 5S? 2. How can we use 5S to formally describe digital libraries? 3. What are the fundamental relationships among the Ss and high-level DL concepts? 4. How can we allow digital librarians to easily express those relationships? 5. Which are the fundamental quality properties of a DL? Can we use the formalized DL framework to characterize those properties? 6. Where in the life cycle of digital libraries can key aspects of quality be measured and how?

9 Informal 5S Definitions: DLs are complex systems that help satisfy info needs of users (societies); provide info services (scenarios); organize info in usable ways (structures); present info in usable ways (spaces); communicate info with users (streams).

10 5Ss (each S with examples and objectives)
Streams. Examples: text; video; audio; image. Objectives: describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data.
Structures. Examples: collection; catalog; hypertext; document; metadata. Objectives: specifies organizational aspects of the DL content.
Spaces. Examples: measure; measurable, topological, vector, probabilistic. Objectives: defines logical and presentational views of several DL components.
Scenarios. Examples: searching, browsing, recommending. Objectives: details the behavior of DL services.
Societies. Examples: service managers, learners, teachers, etc. Objectives: defines service managers, responsible for running DL services, and actors, who use those services.

11 5S and DL formal definitions and compositions (April 2004 TOIS)

12 Glossary: Concepts in the Minimal DL and Representing Symbols

13 5S Static / Passive Dynamic / Active

15 Digital Library Formal Ontology

16 Ontology: Applications Expand definition of minimal DL by characterizing typical DL services in the context of “employs” and “produces” relationships Use characterization to: reason about how DL services can be built from other DL components as well as be composed with other services through extension or reuse

17 Ontology: Applications

18 Ontology: Taxonomy of Services
Infrastructure Services, Repository-Building, Creational: Acquiring; Authoring; Cataloging; Crawling (focused); Describing; Digitizing; Harvesting; Submitting.
Infrastructure Services, Repository-Building, Preservational: Conserving; Converting; Copying/Replicating; Translating (format).
Infrastructure Services, Add Value: Annotating; Classifying; Clustering; Evaluating; Extracting; Indexing; Linking; Logging; Measuring; Rating; Reviewing (peer); Surveying; Training (classifier); Translating; Visualizing.
Information Satisfaction Services: Binding; Browsing; Customizing; Disseminating; Expanding (query); Filtering; Recommending; Requesting; Searching.

19 Composition of key infrastructure services

20 Composition of additional services

22 Approach

23 Part 2: Tools/Applications

25 5SL: a DL Modeling language Domain specific languages Address a particular class of problems by offering specific abstractions and notations for the domain at hand Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping. XML-based realization of 5S Interoperability Use of many standard sub-languages (e.g., MIME types, XML Schemas, UML notations)

26 5SL – The Minimal DL Metamodel

27 <stream value='ETDText'> <stream value='ETDAudio'> ... %XMLSchema%
Example of Document declaration in the Structures Model
<Attribute name='name' type='String'/> <Attribute name='ID' type='Integer'/>
Converting, Reviewing, Cataloguing, …
Example of Actors declaration in the Societies Model
Simple scenario for an NDLTD site searching service: Patron, InterfaceManager, collection, query; InterfaceManager, SearchManager, collection, query; SearchManager, InterfaceManager, WtdSet; …
Example of Service declaration in the Scenario Model

29 5SGraph: A DL Modeling Tool Helps users model their own instances of a digital library (DL) in the 5S language (5SL). A simple modeling process enables rapid generation of digital libraries. Features: 5SGraph loads and displays a metamodel in a structured toolbox. The structured editor of 5SGraph provides a top-down visual building environment for the DL designer. 5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.

30 Overview of 5SGraph Workspace (instance model) Structured toolbox (metamodel)

31 5SGraph: Other Key Features Flexible and extensible architecture Reuse of models Load, save, and change common (sub-)models Synchronization of views Enforcing of semantic constraints

32 5SGraph Evaluation: Usability Study

34 5SGen Version 1 -- MARIAN as the target system Focused on rich structures: semantic networks Behavior attached to nodes/links Version 2 -- Shifted for later work to componentized (ODL) approach Focused on scenarios/societies Structures/Spaces encapsulated within components (e.g., relational tables, indexes)

35 5SGen – Version 2: ODL, Services, Scenarios (architecture diagram)
Societies path: 5SL-Societies Model (1), XPath/JDOM Transform (2), XMI:Class Model (3), Xmi2Java (4), Java Classes Model (5).
Scenarios path: 5SL-Scenario Model (6), XPath/JDOM Transform (7), StateChart Model (8), Scenario Synthesis (9), Deterministic FSM (10), SMC (11), Java Finite State Machine Class Controller (12).
View: JSP User Interface View (13).
Component Pool (Java wrapping, import): ODL Search, ODL Browse, ...
DL Designer binds. Output: Generated DL Services.

36 5SGen Proof of Concept: prototyping CITIDEL VIADUCT NDLTD Union Catalog BDBComp

38 XML-based DL Log Standard Log analysis is a source of information on: How patrons really use DL services How systems behave while supporting user information seeking activities Used to: Evaluate and enhance services Guide allocation of resources Common practice in the web setting Supported by web servers, proxy caches DL Logging can be more detailed.

39 DL Logging Features Captures high level user and system behaviors Organized according to the 5S framework Hierarchical organization (XML-based) Centered on the notions of events Record events related to initial user inputs and final system outputs Help to understand user interactions and the perceived value of responses

40 The XML Log Format (element hierarchy diagram). Elements shown: Log; SessionId, MachineInfo; Statement, Transaction, Timestamp; SessionInfo, RegisterInfo; Event, ErrorInfo; Action; Search, Browse; StoreSysInfo, Update; SearchBy; QueryString; Catalog, Collection; PresentationInfo; StatusInfo; Timeout.
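
As a rough illustration of how a search event might be recorded with the element names on this slide, here is a minimal Python sketch. The nesting order (Log, Transaction, Statement, Event, Action, Search, QueryString) and the function name `search_event` are assumptions made for illustration; the slide lists the elements but the exact hierarchy is not recoverable from the transcript.

```python
import xml.etree.ElementTree as ET


def search_event(session_id, timestamp, query):
    """Build one hypothetical log transaction describing a search action.

    Element names come from the slide; the parent/child nesting is assumed.
    """
    log = ET.Element("Log")
    transaction = ET.SubElement(log, "Transaction")
    ET.SubElement(transaction, "SessionId").text = session_id
    ET.SubElement(transaction, "Timestamp").text = timestamp
    statement = ET.SubElement(transaction, "Statement")
    event = ET.SubElement(statement, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    ET.SubElement(search, "QueryString").text = query
    return log


if __name__ == "__main__":
    entry = search_event("session-1", "2004-11-01T10:00:00", "digital libraries")
    # Serialize to a string for inspection or appending to a log file.
    print(ET.tostring(entry, encoding="unicode"))
```

A hierarchical record like this keeps the user input (the query) and its session context together, which is what makes event-centered analysis possible later.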

42 Describing Quality in Digital Libraries What’s a “good” digital library? Central Concept: Quality! Hypotheses of this work: Formal theory can help to define “what’s a good digital library” by: New formalizations of quality indicators for DLs within our 5S framework Contextualizing these indicators/measures within the Information Life Cycle

43 Quality Dimensions

44 Digital Objects: Accessibility A digital object is accessible by a DL actor or patron if it (1) exists in the DL collections; (2) is retrievable from the repository; and (3) is not restricted from access by metadata on rights for an actor or the actor's society.
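
The three conditions can be read as a simple boolean predicate. The sketch below is one possible Python rendering under stated assumptions: collections are modeled as sets of object identifiers, the repository as a dictionary, and rights metadata as a set of restricted society names. None of these representations (nor the names `DigitalObject` and `is_accessible`) come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    object_id: str
    # Hypothetical rights metadata: societies barred from access.
    restricted_societies: set = field(default_factory=set)


def is_accessible(object_id, collections, repository, actor_societies):
    """Return True only if all three accessibility conditions hold."""
    # 1. The object exists in at least one DL collection.
    in_collection = any(object_id in collection for collection in collections)
    # 2. The object is retrievable from the repository (None means it is not).
    obj = repository.get(object_id)
    if not in_collection or obj is None:
        return False
    # 3. No rights restriction applies to any society the actor belongs to.
    return not (obj.restricted_societies & set(actor_societies))


if __name__ == "__main__":
    repo = {"etd1": DigitalObject("etd1", {"guests"})}
    print(is_accessible("etd1", [{"etd1"}], repo, {"students"}))
    print(is_accessible("etd1", [{"etd1"}], repo, {"guests"}))
```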

45 Digital Objects: Pertinence $Inf(do_i)$ = information carried by a digital object or any of its descriptions. $IN(ac_j)$ = information need of an actor. $Context_{jk}$ = an amalgam of societal factors which can impact the judgment of pertinence by $ac_j$ at time $k$. Factors include time, place, the actor's history of interaction, task, and factors implicit in the interaction and ambient environment.

46 Digital Objects: Pertinence The pertinence of a digital object $do_i$ to a user $ac_j$ is an indicator function $Pertinence(do_i, ac_j)$: $Inf(do_i) \times IN(ac_j) \times Context_{jk}$, defined as: 1, if $Inf(do_i)$ is judged by $ac_j$ to be informative with regard to $IN(ac_j)$ in context $Context_{jk}$; 0, otherwise.

47 Digital Objects: Relevance $Relevance(do_i, q)$ = 1, if $do_i$ is judged by an external judge to be relevant to $q$; 0, otherwise. Relevance estimate: $Rel(do_i, q) = \frac{\vec{do_i} \cdot \vec{q}}{|\vec{do_i}| \times |\vec{q}|}$. Relevance is an objective, public, social notion, established by a general consensus in the field, not a subjective, private judgment by an actor with an information need.
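
The relevance estimate has the form of a cosine similarity between a document vector and a query vector. A minimal Python sketch follows, assuming raw term frequencies as vector weights; the slides do not say which weighting scheme was used, and the function name `rel` is illustrative.

```python
import math
from collections import Counter


def rel(doc_terms, query_terms):
    """Cosine of the angle between term-frequency vectors of document and query.

    Assumption: weights are raw term counts; any other weighting
    (for example tf-idf) could be substituted.
    """
    d, q = Counter(doc_terms), Counter(query_terms)
    dot = sum(d[t] * q[t] for t in d.keys() & q.keys())
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_d == 0 or norm_q == 0:
        return 0.0  # an empty vector has no defined direction
    return dot / (norm_d * norm_q)


if __name__ == "__main__":
    doc = "formal framework for digital libraries".split()
    print(rel(doc, ["digital", "libraries"]))
```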

48 Metadata Specifications and Metadata Format: Completeness Refers to the degree to which values are present in the description, according to a metadata standard. As far as an individual property is concerned, only two situations are possible: either a value is assigned to the property in question, or not. $Completeness(ms_x) = 1 - \frac{\text{no. of missing attributes in } ms_x}{\text{total attributes of the schema to which } ms_x \text{ conforms}}$
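
The completeness indicator can be computed directly from a record and the attribute list of its schema. The Python sketch below assumes a record is a dictionary and treats `None`, empty strings, and empty lists as missing values; those representation choices and the sample attribute names are illustrative, not taken from the slides.

```python
def completeness(record, schema_attributes):
    """1 - (missing attributes / total attributes in the schema)."""
    if not schema_attributes:
        raise ValueError("schema must define at least one attribute")
    # An attribute counts as missing if it is absent or has an empty value.
    missing = sum(
        1 for name in schema_attributes if record.get(name) in (None, "", [])
    )
    return 1 - missing / len(schema_attributes)


if __name__ == "__main__":
    schema = ["title", "creator", "date", "subject"]
    record = {"title": "5S Framework", "creator": "Gonçalves", "date": "2004"}
    print(completeness(record, schema))  # one of four attributes is missing
```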

49 Metadata Specifications and Metadata Format: Completeness OCLC NDLTD Union catalog

50 Metadata Specifications and Metadata Format: Conformance An attribute $att_{xy}$ of a metadata specification $ms_x$ is cardinally conformant to a metadata format/standard if: it appears at least once, if $att_{xy}$ is marked as mandatory; its value is from the domain defined for $att_{xy}$; it does not appear more than once, if it is not marked as repeatable. $Conformance(ms_x) = \frac{\sum_{\forall \text{ attribute } att_{xy} \text{ of } ms_x} \text{degree of conformance of } att_{xy}}{\text{total attributes}}$
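
A small Python sketch of the conformance indicator is given below. It assumes that the degree of conformance of an attribute is 1 when all three rules hold and 0 otherwise, that "total attributes" means the attributes defined by the schema, and that each schema rule is a dictionary with `mandatory`, `repeatable`, and `domain` keys. All of these are illustrative assumptions, as are the sample attributes.

```python
def attribute_conforms(values, rule):
    """Check the three cardinality/domain rules for one attribute."""
    if rule["mandatory"] and len(values) == 0:
        return False  # mandatory attribute must appear at least once
    if not rule["repeatable"] and len(values) > 1:
        return False  # non-repeatable attribute must not appear twice
    return all(rule["domain"](v) for v in values)  # every value in domain


def conformance(record, schema):
    """Average binary conformance over the attributes of the schema."""
    conforming = sum(
        attribute_conforms(record.get(name, []), rule)
        for name, rule in schema.items()
    )
    return conforming / len(schema)


if __name__ == "__main__":
    schema = {
        "title": {"mandatory": True, "repeatable": False,
                  "domain": lambda v: isinstance(v, str)},
        "date": {"mandatory": True, "repeatable": False,
                 "domain": lambda v: v.isdigit() and len(v) == 4},
        "subject": {"mandatory": False, "repeatable": True,
                    "domain": lambda v: isinstance(v, str)},
    }
    record = {"title": ["A thesis"], "date": ["20x4"], "subject": ["DL", "5S"]}
    print(conformance(record, schema))  # the malformed date does not conform
```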

51 Metadata Specifications and Metadata Format: Conformance Based on ETD-MS

52 Services: Efficiency/Effectiveness Effectiveness: very common measures are Precision, Recall, F1, 10-precision, R-Precision. Other services, e.g., Recommending, may have different measures. Efficiency: let $t(e)$ be the time of an event $e$, and let $e_{ix}$ and $e_{fx}$ be the initial and the final events of service $se_x$. For service $se_x$, efficiency is defined as $Efficiency(se_x) = t(e_{fx}) - t(e_{ix})$.
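
Both families of measures are easy to compute once results and event times are available. The Python sketch below uses the standard set-based definitions of precision, recall, and F1, and the elapsed-time definition of efficiency from the slide; function and parameter names are illustrative.

```python
def precision_recall_f1(retrieved, relevant):
    """Standard set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def efficiency(initial_event_time, final_event_time):
    """Elapsed time between the initial and final events of a service run."""
    return final_event_time - initial_event_time


if __name__ == "__main__":
    print(precision_recall_f1([1, 2, 3, 4], [2, 4, 6]))
    print(efficiency(10.0, 12.5))
```

Note that with this definition a smaller efficiency value means a faster service.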

53 Services: Extensibility and Reusability A service Y reuses a service X if the behavior of Y incorporates the behavior of X. A service Y extends a service X if it subsumes the behavior of X and potentially includes additional subflows of events.

54 Services: Extensibility and Reusability (2) Macro-Reusability(Serv) = no. of reused services/ total number of services Micro-Reusability(Serv) = number of lines of code of managers that implement (run) reused services/ total lines of code

55 Services: Extensibility and Reusability Macro-Reusability = 4/16 = 0.25 Micro-Reusability = 3630 / = 0.304
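
The two reusability ratios can be sketched in a few lines of Python. The macro figure below reuses the 4-of-16 count from this slide; the line counts passed to `micro_reusability` are made-up values for illustration only, and both function names are hypothetical.

```python
def macro_reusability(reused_services, total_services):
    """Fraction of services that are reused by other services."""
    return reused_services / total_services


def micro_reusability(reused_manager_loc, total_loc):
    """Fraction of code lines belonging to managers that run reused services."""
    return reused_manager_loc / total_loc


if __name__ == "__main__":
    print(macro_reusability(4, 16))       # counts taken from the slide
    print(micro_reusability(300, 1000))   # illustrative line counts only
```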

56 Quality and the Information Life Cycle

57 Quality Model: Evaluation Focus groups: 3 librarians. Major points: focus on DLs, not traditional libraries; some indicators may have more theoretical than practical use in some contexts; liked the minimalist approach; interesting and potentially useful mainly for education and evaluation.

59 Conclusions We have answered the almost 40-year-old challenge of Licklider to build a unified CS / LIS theory by proposing and formalizing the first comprehensive formal framework for digital libraries. We have shown how to move from theory to practice by: applying the framework to the problems of modeling, generating, and evaluating (by logging and assessing the quality of) digital libraries; materializing these applications into languages, tools, formats, etc.; explaining and evaluating these applications (usability studies, focus groups, prototyping, etc.).

60 Future Work Theory: apply to formally describe other systems; complete formal definitions of all services with further events; load axioms in a knowledge base to automatically assess quality of models (correctness, etc.). Applications/Tools, Language: make different versions uniform; extend with METS, less complex scenarios, society models; new metamodels, domain/application oriented (e.g., archaeology, education) and for traditional libraries.

61 Future Work Applications/Tools. Visualization: integration with other tools through a wizard; new visualizations; applying as an educational tool. Generation: use of Web services; incorporation of native XML repositories; improvement of scenario algorithms. Logging: promote use; consider privacy issues; new actions; deal with scalability issues.

62 Future Work Quality: development of more usage-oriented indicators (current indicators are mostly system-oriented); focus on log format and evaluation; development of a Quality ToolKit (5SQual) for DL managers with the following features: a mapping tool to map a local log format to the standard XML log format; components to implement all indicators; visualization of data and indicators; broken into several logical pieces to be used in the different phases of the Information Life Cycle. Others, e.g., personalization: create theories, tools, languages, and methods for personalization based on 5S.

63 Questions/Discussion? Thanks!