Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications. Marcos André Gonçalves. Doctoral defense.


1 Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications. Marcos André Gonçalves. Doctoral defense, Virginia Tech, Blacksburg, VA 24061 USA

2 Acknowledgments
- Funding: CAPES, NSF, AOL
- Collaborators: Pavel Calado, Lilian Cassell, Marco Cristo, Patrick Fan, Ed Fox, Robert France, Filip Jagodzinski, Rohit Kelapure, Neill Kipp, Aaron Krowne, Alberto Laender, Claudia Medeiros, Naren Ramakrishnan, Berthier Ribeiro-Neto, Rao Shen, Hussein Suleman, Ricardo Torres, Layne Watson, Baoping Zhang, Qinwei Zhu, …

3 Publications and Accomplishments
- Book chapters: 4 published + 1 in press
- Journal/magazine papers: 8 published + 1 under revision + 1 accepted
- Conference/workshop papers: 25 published
- Other publications (poster and demo papers): 4 published
- Awards: 3 (Lewis Trustee Award; AOL-CIT Fellowship, Honorable Mention; JCDL'04 Best Student Paper)
- Helped supervise three Master's students

4 Outline
- Motivation: the problem
- Hypotheses and research questions
- Part 1: Theory
  - 5S: introduction, formal definitions
  - The formal ontology
- Part 2: Tools/Applications
  - Language
  - Visualization
  - Generation
  - Logging
- Part 3: Quality
- Conclusions, Future Work

5 Motivation
- Digital Libraries (DLs): what are they?
  - No definitional consensus
  - Conflicting views
  - Makes interoperability a hard problem
- DLs are not benefiting from formal theories as other CS fields are: DB, IR, PL, etc.
- DL construction: difficult, ad hoc, lacking support for tailoring/customization
  - Conceptual modeling, requirements analysis, and methodological approaches are rarely supported in DL development.
  - Lack of specific DL models, formalisms, languages

6 Hypotheses
- A formal theory for DLs can be built based on 5S.
- The formalization can serve as a basis for modeling and building high-quality DLs.

7 Research Questions
1. Can we formally elaborate 5S?
2. How can we use 5S to formally describe digital libraries?
3. What are the fundamental relationships among the Ss and high-level DL concepts?
4. How can we allow digital librarians to easily express those relationships?
5. Which are the fundamental quality properties of a DL? Can we use the formalized DL framework to characterize those properties?
6. Where in the life cycle of digital libraries can key aspects of quality be measured and how?


9 Informal 5S Definitions: DLs are complex systems that
- help satisfy info needs of users (societies)
- provide info services (scenarios)
- organize info in usable ways (structures)
- present info in usable ways (spaces)
- communicate info with users (streams)

10 5Ss (columns: S, Examples, Objectives)
- Streams. Examples: text, video, audio, image. Objective: describes properties of the DL content, such as encoding and language for textual material or particular forms of multimedia data.
- Structures. Examples: collection, catalog, hypertext, document, metadata. Objective: specifies organizational aspects of the DL content.
- Spaces. Examples: measure, measurable, topological, vector, probabilistic. Objective: defines logical and presentational views of several DL components.
- Scenarios. Examples: searching, browsing, recommending. Objective: details the behavior of DL services.
- Societies. Examples: service managers, learners, teachers, etc. Objective: defines service managers, responsible for running DL services, and actors, who use those services.

11 5S and DL formal definitions and compositions (April 2004 TOIS)

12 Glossary: Concepts in the Minimal DL and Representing Symbols

13 5S (figure labels: Static / Passive, Dynamic / Active)


15 Digital Library Formal Ontology

16 Ontology: Applications
- Expand the definition of the minimal DL by characterizing typical DL services in the context of "employs" and "produces" relationships
- Use the characterization to reason about how DL services can be built from other DL components, as well as composed with other services through extension or reuse

17 Ontology: Applications

18 Ontology: Taxonomy of Services
- Infrastructure Services
  - Repository-Building
    - Creational: Acquiring, Authoring, Cataloging, Crawling (focused), Describing, Digitizing, Harvesting, Submitting
    - Preservational: Conserving, Converting, Copying/Replicating, Translating (format)
  - Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Linking, Logging, Measuring, Rating, Reviewing (peer), Surveying, Training (classifier), Translating, Visualizing
- Information Satisfaction Services: Binding, Browsing, Customizing, Disseminating, Expanding (query), Filtering, Recommending, Requesting, Searching

19 Composition of key infrastructure services

20 Composition of additional services


22 Approach

23 Part 2: Tools/Applications


25 5SL: a DL Modeling Language
- Domain-specific languages
  - Address a particular class of problems by offering specific abstractions and notations for the domain at hand
  - Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping
- XML-based realization of 5S
- Interoperability
  - Use of many standard sub-languages (e.g., MIME types, XML Schemas, UML notations)

26 5SL: The Minimal DL Metamodel

27 Example of Document declaration in the Structures Model: <stream value='ETDText'> <stream value='ETDAudio'> ... %XMLSchema%
Example of Actors declaration in the Societies Model: <Attribute name='name' type='String'/> <Attribute name='ID' type='Integer'/> Converting, Reviewing, Cataloguing, …
Example of Service declaration in the Scenario Model: simple scenario for an NDLTD site searching service
- Patron, InterfaceManager: collection, query
- InterfaceManager, SearchManager: collection, query
- SearchManager, InterfaceManager: WtdSet
- …


29 5SGraph: A DL Modeling Tool
- Helps users model their own instances of a digital library (DL) in the 5S language (5SL)
- A simple modeling process that enables rapid generation of digital libraries
- Features
  - 5SGraph loads and displays a metamodel in a structured toolbox.
  - The structured editor of 5SGraph provides a top-down visual building environment for the DL designer.
  - 5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.

30 Overview of 5SGraph (figure labels: workspace (instance model), structured toolbox (metamodel))

31 5SGraph: Other Key Features
- Flexible and extensible architecture
- Reuse of models
  - Load, save, and change common (sub-)models
- Synchronization of views
- Enforcement of semantic constraints

32 5SGraph Evaluation: Usability Study


34 5SGen
- Version 1: MARIAN as the target system
  - Focused on rich structures: semantic networks
  - Behavior attached to nodes/links
- Version 2: shifted for later work to a componentized (ODL) approach
  - Focused on scenarios/societies
  - Structures/Spaces encapsulated within components (e.g., relational tables, indexes)

35 5SGen Version 2: ODL, Services, Scenarios (diagram labels, in numbered order)
- (1) 5SL-Societies Model
- (2) XPath/JDOM Transform
- (3) XMI:Class Model
- (4) Xmi2Java
- (5) Java Classes Model
- (6) 5SL-Scenario Model
- (7) XPath/JDOM Transform
- (8) StateChart Model
- (9) Scenario Synthesis
- (10) Deterministic FSM
- (11) SMC
- (12) Java Finite State Machine Class Controller
- (13) JSP User Interface View
- Component Pool: ODL Search, ODL Browse, ... (Java wrapping, import)
- Other labels: DL Designer (binds), Generated DL Services, 5SGen
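The diagram names a deterministic FSM (step 10) that becomes a finite state machine controller class (step 12). As a rough illustration only, and not the code 5SGen actually generates, the searching scenario from slide 27 (Patron, InterfaceManager, SearchManager, WtdSet) could be driven by a transition table like the one below. The state and event names are invented for this sketch, and Python is used here although the slide's pipeline targets Java.

```python
# Hypothetical deterministic FSM for a search scenario.
# State and event names are assumptions made for illustration.
TRANSITIONS = {
    ("idle", "patron_sends_query"): "interface_has_query",
    ("interface_has_query", "interface_forwards_query"): "searching",
    ("searching", "search_returns_wtdset"): "results_ready",
}


def run_scenario(events, state="idle"):
    """Feed events to the FSM one by one and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            # Deterministic: each (state, event) pair has at most one target.
            raise ValueError(f"no transition from {state!r} on {event!r}")
        state = TRANSITIONS[key]
    return state


final_state = run_scenario(
    ["patron_sends_query", "interface_forwards_query", "search_returns_wtdset"]
)
```

Keeping the transitions in a plain table makes it easy to check that the machine is deterministic, since a dictionary cannot hold two targets for the same (state, event) pair.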

36 5SGen Proof of Concept: prototyping CITIDEL VIADUCT; NDLTD Union Catalog; BDBComp


38 XML-based DL Log Standard
- Log analysis is a source of information on:
  - How patrons really use DL services
  - How systems behave while supporting user information-seeking activities
- Used to:
  - Evaluate and enhance services
  - Guide allocation of resources
- Common practice in the web setting
  - Supported by web servers, proxy caches
- DL logging can be more detailed.

39 DL Logging Features
- Captures high-level user and system behaviors
- Organized according to the 5S framework
- Hierarchical organization (XML-based)
- Centered on the notion of events
- Records events related to initial user inputs and final system outputs
- Helps to understand user interactions and the perceived value of responses

40 The XML Log Format (element names from the diagram): Log, SessionId, MachineInfo, Statement, Transaction, Timestamp, SessionInfo, RegisterInfo, Event, ErrorInfo, Action, Search, Browse, Store, SysInfo, Update, SearchBy, QueryString, Catalog, Collection, PresentationInfo, StatusInfo, Timeout
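To make the hierarchical, event-centered idea concrete, here is a minimal sketch that builds one search event with Python's standard xml.etree.ElementTree module. The element names come from the slide, but the nesting shown is an assumption, since the diagram's structure did not survive in the transcript. The function name and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_search_entry(session_id, timestamp, collection, query):
    """Build a <Log> element holding one search event.

    The parent/child nesting below is assumed for illustration;
    only the element names are taken from the slide.
    """
    log = ET.Element("Log")
    transaction = ET.SubElement(log, "Transaction")
    ET.SubElement(transaction, "SessionId").text = session_id
    ET.SubElement(transaction, "Timestamp").text = timestamp
    statement = ET.SubElement(transaction, "Statement")
    event = ET.SubElement(statement, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    ET.SubElement(search, "Collection").text = collection
    search_by = ET.SubElement(search, "SearchBy")
    ET.SubElement(search_by, "QueryString").text = query
    return log


entry = build_search_entry("s-001", "2004-11-01T10:15:00", "ETD", "digital libraries")
xml_text = ET.tostring(entry, encoding="unicode")
```

Because the entry is a tree, an analysis script can select all search events with a path query such as `entry.findall(".//Search")` instead of parsing flat log lines.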


42 Describing Quality in Digital Libraries
- What's a "good" digital library?
- Central concept: quality!
- Hypothesis of this work: formal theory can help to define "what's a good digital library" by:
  - Providing new formalizations of quality indicators for DLs within our 5S framework
  - Contextualizing these indicators/measures within the Information Life Cycle

43 Quality Dimensions

44 Digital Objects: Accessibility
A digital object is accessible by a DL actor or patron if it
1. exists in the DL collections
2. is retrievable from the repository
3. is not restricted from access by metadata on rights for an actor or actor's society
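The three conditions combine as a simple conjunction. The sketch below is one possible reading, not the dissertation's formal definition: the DigitalObject class, the restricted_societies field standing in for rights metadata, and all identifiers are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    object_id: str
    # Hypothetical rights metadata: societies barred from this object.
    restricted_societies: frozenset = field(default_factory=frozenset)


def is_accessible(obj, actor_societies, collections, repository_ids):
    """Return True only if all three conditions from the slide hold."""
    exists_in_collection = any(obj.object_id in c for c in collections)
    is_retrievable = obj.object_id in repository_ids
    is_restricted = bool(obj.restricted_societies & frozenset(actor_societies))
    return exists_in_collection and is_retrievable and not is_restricted


thesis = DigitalObject("etd-42", frozenset({"anonymous"}))
collections = [{"etd-42", "etd-43"}]
repository_ids = {"etd-42"}  # etd-43 is catalogued but not retrievable

student_ok = is_accessible(thesis, {"students"}, collections, repository_ids)
anonymous_ok = is_accessible(thesis, {"anonymous"}, collections, repository_ids)
```

A collection-level accessibility figure could then be taken as the fraction of objects for which this predicate is true for a given actor, though that aggregation is not stated on the slide.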

45 Digital Objects: Pertinence
- $\mathrm{Inf}(do_i)$ = information carried by a digital object or any of its descriptions
- $\mathrm{IN}(ac_j)$ = information need of an actor
- $\mathrm{Context}_{jk}$ = an amalgam of societal factors that can impact the judgment of pertinence by $ac_j$ at time $k$. Factors include time, place, the actor's history of interaction, task, and factors implicit in the interaction and ambient environment.

46 Digital Objects: Pertinence
The pertinence of a digital object $do_i$ to a user $ac_j$ is an indicator function $\mathrm{Pertinence}(do_i, ac_j): \mathrm{Inf}(do_i) \times \mathrm{IN}(ac_j) \times \mathrm{Context}_{jk}$, defined as:
- 1, if $\mathrm{Inf}(do_i)$ is judged by $ac_j$ to be informative with regard to $\mathrm{IN}(ac_j)$ in context $\mathrm{Context}_{jk}$;
- 0, otherwise

47 Digital Objects: Relevance
- $\mathrm{Relevance}(do_i, q)$ = 1, if $do_i$ is judged by an external judge to be relevant to $q$; 0, otherwise
- Relevance estimate: $\mathrm{Rel}(do_i, q) = \dfrac{\vec{do_i} \cdot \vec{q}}{|\vec{do_i}| \times |\vec{q}|}$
- Objective, public, social notion
- Established by a general consensus in the field, not a subjective, private judgment by an actor with an information need
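The relevance estimate has the shape of a cosine between a document vector and a query vector. A minimal sketch follows, assuming both are represented as sparse term-weight dictionaries; the slide does not say how the weights are obtained, and the sample weights are invented.

```python
import math


def rel(doc_vec, query_vec):
    """Cosine of the angle between two sparse term-weight vectors."""
    shared_terms = set(doc_vec) & set(query_vec)
    dot = sum(doc_vec[t] * query_vec[t] for t in shared_terms)
    norm_doc = math.sqrt(sum(w * w for w in doc_vec.values()))
    norm_query = math.sqrt(sum(w * w for w in query_vec.values()))
    if norm_doc == 0 or norm_query == 0:
        return 0.0  # an empty vector is treated as not relevant
    return dot / (norm_doc * norm_query)


# Hypothetical weights, for illustration only.
document = {"digital": 3.0, "library": 4.0}
query = {"digital": 1.0}
estimate = rel(document, query)  # 3 / (5 * 1)
```

Only terms present in both vectors contribute to the dot product, so a query that shares no terms with the document scores zero.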

48 Metadata Specifications and Metadata Format: Completeness
- Refers to the degree to which values are present in the description, according to a metadata standard.
- As far as an individual property is concerned, only two situations are possible: either a value is assigned to the property in question, or not.
- $\mathrm{Completeness}(ms_x) = 1 - \dfrac{\text{no. of missing attributes in } ms_x}{\text{total attributes of the schema to which } ms_x \text{ conforms}}$
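The completeness formula can be computed directly once a record and its schema are at hand. In this sketch a record is a plain dictionary and the schema is a list of attribute names; both representations, and the four-field schema used in the example, are assumptions for illustration.

```python
def completeness(record, schema_attributes):
    """1 minus the share of schema attributes with no value in the record."""
    if not schema_attributes:
        raise ValueError("schema must define at least one attribute")
    # None, the empty string, and the empty list all count as missing.
    missing = sum(
        1 for name in schema_attributes if record.get(name) in (None, "", [])
    )
    return 1 - missing / len(schema_attributes)


# Hypothetical four-attribute schema and a record with two values present.
SCHEMA_ATTRIBUTES = ["title", "creator", "date", "description"]
record = {"title": "A Study", "creator": "J. Doe", "date": ""}
score = completeness(record, SCHEMA_ATTRIBUTES)  # 1 - 2/4
```

Here `date` is present as a key but empty, and `description` is absent, so two of the four attributes count as missing.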

49 Metadata Specifications and Metadata Format: Completeness (OCLC NDLTD Union Catalog)

50 Metadata Specifications and Metadata Format: Conformance
An attribute $att_{xy}$ of a metadata specification $ms_x$ is cardinally conformant to a metadata format/standard if:
- it appears at least once, if $att_{xy}$ is marked as mandatory;
- its value is from the domain defined for $att_{xy}$;
- it does not appear more than once, if it is not marked as repeatable.
$\mathrm{Conformance}(ms_x) = \dfrac{\sum_{\forall \text{ attribute } att_{xy} \text{ of } ms_x} \text{degree of conformance of } att_{xy}}{\text{total attributes}}$
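A small sketch of the conformance score follows. It assumes a binary degree of conformance (1 if an attribute satisfies all three rules, 0 otherwise) and takes "total attributes" to mean the attributes of the schema; the slide fixes neither choice. The schema layout, helper names, and sample record are hypothetical.

```python
def attribute_conformance(values, mandatory, repeatable, in_domain):
    """Return 1 if the attribute obeys all three rules, else 0 (assumed binary)."""
    if mandatory and not values:
        return 0  # mandatory attribute never appears
    if not repeatable and len(values) > 1:
        return 0  # non-repeatable attribute appears more than once
    if not all(in_domain(v) for v in values):
        return 0  # some value falls outside the defined domain
    return 1


def conformance(record, schema):
    """Average attribute conformance over the schema's attributes."""
    scores = [
        attribute_conformance(record.get(name, []), **rules)
        for name, rules in schema.items()
    ]
    return sum(scores) / len(schema)


def non_empty_text(value):
    return isinstance(value, str) and value.strip() != ""


def four_digit_year(value):
    return isinstance(value, str) and len(value) == 4 and value.isdigit()


# Hypothetical schema; each record attribute maps to a list of values.
SCHEMA = {
    "title": {"mandatory": True, "repeatable": False, "in_domain": non_empty_text},
    "creator": {"mandatory": True, "repeatable": True, "in_domain": non_empty_text},
    "date": {"mandatory": False, "repeatable": False, "in_domain": four_digit_year},
    "subject": {"mandatory": False, "repeatable": True, "in_domain": non_empty_text},
}
record = {"title": ["A Study"], "creator": ["X", "Y"], "date": ["2004", "2005"]}
score = conformance(record, SCHEMA)  # date is repeated, so 3 of 4 conform
```

An absent optional attribute such as `subject` counts as conformant under this reading, because none of the three rules is violated.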

51 Metadata Specifications and Metadata Format: Conformance (based on ETD-MS)

52 Services: Efficiency/Effectiveness
- Effectiveness
  - Very common measures: Precision, Recall, F1, 10-precision, R-Precision
  - Other services may have different measures: e.g., Recommending, etc.
- Efficiency
  - Let $t(e)$ be the time of an event $e$
  - Let $e_{ix}$ and $e_{fx}$ be the initial and the final events of service $se_x$
  - For service $se_x$, efficiency is defined as: $\mathrm{Efficiency}(se_x) = t(e_{fx}) - t(e_{ix})$
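Both kinds of measure are easy to compute from logged data. The sketch below uses the standard set-based definitions of precision, recall, and F1, plus the elapsed-time definition of efficiency given on the slide. The function names, document identifiers, and timestamps are illustrative assumptions.

```python
def precision_recall_f1(retrieved, relevant):
    """Standard set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def efficiency(t_initial_event, t_final_event):
    """Elapsed time between a service's initial and final events."""
    return t_final_event - t_initial_event


# Hypothetical result set, relevance judgments, and event times in seconds.
p, r, f1 = precision_recall_f1(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6, 8])
elapsed = efficiency(t_initial_event=10.0, t_final_event=12.5)
```

Note that under this definition a smaller efficiency value means a faster service, since the measure is a response time.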

53 Services: Extensibility and Reusability
- A service Y reuses a service X if the behavior of Y incorporates the behavior of X.
- A service Y extends a service X if it subsumes the behavior of X and potentially includes additional subflows of events.

54 Services: Extensibility and Reusability (2)
- Macro-Reusability(Serv) = number of reused services / total number of services
- Micro-Reusability(Serv) = number of lines of code of managers that implement (run) reused services / total lines of code

55 Services: Extensibility and Reusability
- Macro-Reusability = 4 / 16 = 0.25
- Micro-Reusability = 3630 / 11910 ≈ 0.305
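The two ratios above can be reproduced with a few lines of code. The function names are chosen for this sketch; the input counts are the ones reported on the slide.

```python
def macro_reusability(reused_services, total_services):
    """Share of services that are reused by other services."""
    return reused_services / total_services


def micro_reusability(reused_manager_loc, total_loc):
    """Share of code lines in managers that implement reused services."""
    return reused_manager_loc / total_loc


macro = macro_reusability(4, 16)
micro = micro_reusability(3630, 11910)  # about 0.3048
```

The macro figure counts services, while the micro figure weights them by code size, so the two can diverge when reused services are unusually large or small.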

56 Quality and the Information Life Cycle

57 Quality Model: Evaluation
- Focus groups: 3 librarians
- Major points
  - Focus on DLs, not traditional libraries
  - Some indicators may have more theoretical than practical use in some contexts
  - Liked minimalist approach
  - Interesting and potentially useful mainly for education and evaluation


59 Conclusions
- We have answered the almost 40-year-old challenge of Licklider to build a unified CS/LIS theory by proposing and formalizing the first comprehensive formal framework for digital libraries.
- We have shown how to move from theory to practice by:
  - Applying the framework to the problems of modeling, generating, and evaluating (by logging and assessing the quality of) digital libraries
  - Materializing these applications into languages, tools, formats, etc.
  - Explaining and evaluating these applications (usability studies, focus groups, prototyping, etc.)

60 Future Work
- Theory
  - Apply to formally describe other systems
  - Complete formal definitions of all services with further events
  - Load axioms in a knowledge base to automatically assess quality of models (correctness, etc.)
- Applications/Tools
  - Language
    - Make different versions uniform
    - Extend with METS, less complex scenarios, society models
    - New metamodels
      - Domain/application oriented (e.g., archaeology, education)
      - For traditional libraries

61 Future Work
- Applications/Tools
  - Visualization
    - Integration with other tools through Wizard
    - New visualizations
    - Applying as educational tool
  - Generation
    - Use of Web services
    - Incorporation of native XML repositories
    - Improvement of scenario algorithms
  - Logging
    - Promote use
    - Consider privacy issues
    - New actions
    - Deal with scalability issues

62 Future Work
- Quality
  - Development of more usage-oriented indicators
    - Current indicators are mostly system-oriented
    - Focus on log format and evaluation
  - Development of a Quality ToolKit (5SQual) for DL managers with the following features:
    - Mapping tool to map local log format to the standard XML Log format
    - Components to implement all indicators
    - Visualization of data and indicators
    - Broken into several logical pieces to be used in the different phases of the Information Life Cycle
- Others, e.g., personalization
  - Create theories, tools, languages, methods for personalization based on 5S

63 Questions/Discussion? Thanks!

