Published by Klara Arnesen. Modified over 6 years ago.
1
Edward A. Fox fox@vt.edu http://fox.cs.vt.edu
A Formal Model for the Digital Library: Streams, Structures, Spaces, Scenarios, and Societies (Un Modelo Formal para la Biblioteca Digital: Flujos, Estructuras, Espacios, Escenarios, y Sociedades)
Universidad de Buenos Aires, May 19, 2004
This overview of digital libraries covers many aspects of the field, from hardware to software to projects to theory. I’m Ed Fox, a Professor of Computer Science at Virginia Tech. I also direct Virginia Tech’s Digital Library Research Laboratory and its Internet Center.
2
Acknowledgements (Selected)
Sponsors: ACM, Adobe, AOL, IBM, Microsoft, NASA, NLM, NSF, OCLC, SUN, US Dept. of Ed.
VT Faculty/Staff: Debra Dudley, Weiguo Fan, Gail McMillan, Manuel Perez, Naren Ramakrishnan, Layne Watson, …
VT Students: Yuxin Chen, Shahrooz Feizabadi, Marcos Gonçalves, Nithiwat Kampanya, S.H. Kim, Bing Liu, Paul Mather, Fernando Das Neves, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ricardo Torres, Wensi Xi, Baoping Zhang, Qinwei Zhu, …
Our efforts have been made possible through the support of sponsors, faculty, staff, and students. We gratefully acknowledge their assistance and collaboration. IBM has donated a large amount of equipment. The largest grants have come from NSF, FIPSE, and SURA. Content has been shared by ACM in a variety of efforts related to learning about computing. Many companies, like Adobe, Microsoft, and OCLC, have provided software and related assistance. A large number of colleagues have worked on the various projects discussed. Students, serving as research assistants, preparing a thesis or dissertation, or engaged in class projects, have helped develop many of the systems and publications about Virginia Tech digital library initiatives.
3
ACKNOWLEDGEMENTS (NDLTD)
NDLTD Board of Directors, previous Steering Committee + other NDLTD committees; those running Electronic Thesis & Dissertation (ETD) initiatives in universities, regions, countries
Helpful sponsorship by many organizations, especially Adobe (new initiative!), CONACyT, DFG, FIPSE (US Dept. Education), IBM, Microsoft, NSF (IIS and DUE grants), OCLC, SOLINET, SUN, SURA, UNESCO, VTLS, many governments (Australia, Germany, India, …), …
Colleagues at Virginia Tech (faculty, staff, students), and collaborators at many universities
Slides included from: Vinod Chachra, Thom Hickey, Joan Lippincott, Gail McMillan, Axel Plathe, Hussein Suleman, …
4
Other Collaborators (Selected)
Brazil: FUA, UFMG, UNICAMP Case Western Reserve University Emory, Notre Dame, Oregon State Germany: Univ. Oldenburg Mexico: UDLA (Puebla), Monterrey College of NJ, Hofstra, Penn State, Villanova University of Arizona University of Florida, Univ. of Illinois University of Virginia Endowment: VTLS
5
UNESCO: Cláudio Menezes [cmenezes@unesco.org.uy]
Purpose: Reinforce local solutions, commitments
Emphasize: ETD does not need many resources. Open source and free software is available. International cooperation can help. Local training is crucial.
=> Inclusion of ETD in practices, processes
=> Schedule for ETD projects
6
The 5S Model: A Formal Model for the Digital Library
Part 2
7
Motivation
DLs are not benefiting from formal theories as have other CS fields: DB, IR, PL, etc.
DL construction: difficult, ad hoc, lacking support for tailoring/customization
Conceptual modeling, requirements analysis, and methodological approaches are rarely supported in DL development.
Lack of specific DL models, formalisms, languages
8
5S Layers
Societies, Scenarios, Spaces, Structures, Streams
9
Definition: Digital Libraries are complex systems that
help satisfy info needs of users (societies)
provide info services (scenarios)
organize info in usable ways (structures)
present info in usable ways (spaces)
communicate info with users (streams)
10
DL Student Research: Gonçalves
5S as a basis for developing digital libraries Theory Syntax, Semantics; Definitions, Relationships Specification of requirements Generation of systems Quality
11
DL Services/Activities Taxonomy (Gonçalves)
Infrastructure Services / Repository-Building / Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
Infrastructure Services / Repository-Building / Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
Infrastructure Services / Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
Information Satisfaction Services: Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing
12
Defining Quality in Digital Libraries
DL Concept Dimensions of Quality Digital object Accessibility Pertinence (*) Preservability Relevance Similarity Significance Timeliness Metadata specification Accuracy Completeness Conformance Collection Impact Factor Catalog Consistency Repository Structures for Navigation Navigability Services Composability Efficiency Effectiveness Extensibility Reusability Reliability
13
5S Model: Examples, Objectives
Streams. Examples: text; video; audio; image. Objective: Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data.
Structures. Examples: collection; catalog; hypertext; document; metadata; organization tools. Objective: Specifies organizational aspects of the DL content.
Spaces. Examples: measure; measurable, topological, vector, probabilistic. Objective: Defines logical and presentational views of several DL components.
Scenarios. Examples: searching, browsing, recommending. Objective: Details the behavior of DL services.
Societies. Examples: service managers, learners, teachers, etc. Objective: Defines managers, responsible for running DL services; actors, that use those services; and relationships among them.
14
Document Models, Representations, and Accesses
Doc = stream + structure + use-scenario; hybrid (paper/electronic), digital only
Multilingual: content, summary, metadata
Multimedia: structure, quality (QoS), search
Structured: MARC, SGML, by user: MVD
Distributed collection: Kleisli, CIMI, Z39.50
Federated search: collecting, picking site(s), parallel search / fall-back, fusing results
Access: IPR, payment, security, scenarios
15
Architectural Issues Internet middleware
Independent system / part of federation Decompositions vary search engine, browser, DBMS, MM support repository, handle server, client information resources + mediators, bus or agent collection + client with workspace/environment Metrics: e.g., for federated search
16
Standards
Protocols/federation: Z39.50, CIMI; Dienst, NCSTRL; OAI protocol
Metadata:
TEI: inline, detailed (structure in stream)
MARC: two-level, fine-grained
Dublin Core: high-level, 15 elements
RDF: describing resources/collections, annotation
OAMS -> DC and others used in OAI
17
Digital Library Courseware
WWW pages or large PDF copy files Online quizzes based on book by Michael Lesk (Morgan Kaufmann Publishers) Contents based on book, with several other popular topics added (e.g., agents) Separate pages to supplement: Definitions, Resources (People, Projects), and References UNC-CH proposal; book plans for 2005
18
Topical Outline - Foundations
Early visions Definitions Resources References Projects
19
Topical Outline – IR Areas
Search, Retrieval, Resource Discovery Information storage and retrieval Boolean vs. natural language Search engines Indexing, phrases, thesauri, concepts Federated search and harvesting, OAI Integrating links and ratings Crawlers, spiders, metasearch, fusion Details following – Li Wang indep. study
20
What is a Crawler?
A program; an important module for a Web search engine
Crawls on the Web according to its algorithm
Retrieves Web pages
Gets useful information
Stores the Web pages for future refining
21
Jobs for Threads
Get a new URL from the buffer
Contact the server for the file type
Download the file
Parse the Web page
Put new URLs into the buffer
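The thread jobs listed on this slide can be sketched as a small worker loop. This is only an illustration, not the crawler from the independent study: `FAKE_WEB`, `LINK_RE`, and `crawl` are hypothetical names, and an in-memory dictionary stands in for real servers so the sketch runs without network access.

```python
import queue
import re
import threading

# Hypothetical in-memory "web": URL -> (file type, body). Stands in for real servers.
FAKE_WEB = {
    "http://a.example/": ("text/html",
                          '<a href="http://a.example/b">b</a> <a href="http://a.example/c.pdf">c</a>'),
    "http://a.example/b": ("text/html", '<a href="http://a.example/">home</a>'),
    "http://a.example/c.pdf": ("application/pdf", "%PDF"),
}
LINK_RE = re.compile(r'href="([^"]+)"')


def crawl(seed, num_threads=2):
    buffer = queue.Queue()   # shared URL buffer
    buffer.put(seed)
    seen = {seed}            # URLs already placed in the buffer
    stored = {}              # pages kept for future refining
    lock = threading.Lock()

    def worker():
        while True:
            url = buffer.get()                      # get a new URL from the buffer
            if url is None:                         # sentinel: no more work
                buffer.task_done()
                return
            try:
                entry = FAKE_WEB.get(url)           # "contact the server"
                if entry is None or entry[0] != "text/html":
                    continue                        # skip unwanted file types
                body = entry[1]                     # "download the file"
                with lock:
                    stored[url] = body
                for link in LINK_RE.findall(body):  # parse the Web page
                    with lock:
                        if link in seen:
                            continue
                        seen.add(link)
                    buffer.put(link)                # put new URLs into the buffer
            finally:
                buffer.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    buffer.join()            # wait until every queued URL has been processed
    for _ in threads:
        buffer.put(None)
    for t in threads:
        t.join()
    return stored


pages = crawl("http://a.example/")
```

New links are queued before the current URL is marked done, so the buffer never looks empty while work remains.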
22
Advanced Functions Backward Linkage Information Collector A Web Page
23
Topical Outline - Multimedia
Multiple media types, representations Text, audio, image, video, graphics, animation Capture, digitization, standards, interchange Compression, content-based retrieval Playback (Real), SMIL, QoS JPEG, MPEG (and versions)
24
Topical Outline - Architectures
Distributed, centralized Modular, componentized Bus (InfoBus), hierarchical, star Mediators, wrappers (TSIMMIS) Light weight protocols Architecture of OAI and XOAI
25
Topical Outline – Interfaces
Taxonomy of interface components Workflow Visualization Environments Design Usability testing
26
Topical Outline – Metadata
MARC Dublin Core RDF IMS OAI (Open Archives Initiative) Crosswalks, mappings Ontologies Topic maps, concept maps
27
Topical Outline – Epub, SGML, XML
Authoring Rendering, presenting Structure Tagging, Markup, DOM Semi-structured information Dual-publishing, eBooks Styles (XSL, XSLT) Structure queries
28
Topical Outline – Databases
Extending database technology Structured and unstructured info Multimedia databases Link databases Performance Replicated storage, I2-DSI (details following)
29
Topical Outline – Agents
Protocols Knowledge interchange Negotiation, registries Distributed issues Ontologies (standard upper) Webbots (automatic indexing)
30
Topical Outline – Economics
E-commerce Sustainability Preservation and archiving DLF, Besser, Lorie, Gladney Self-archiving Open collections Economic models, business plans
31
Topical Outline – IPR
Intellectual property rights (IPR) Legal issues Terms and conditions Copyright Patents, trademarks Distributed rights management Security
32
Topical Outline – Social Issues
Cooperation, collaboration Annotation, ratings Digital divide Educational applications Cultural heritage Museums (AMICO) Organizational acceptance Personalization Internationalization
33
5S Model: Definitions
Streams: Sequences of elements of an arbitrary type
Structures: Labeled directed graphs
Spaces: Sets and operations on those sets
Scenarios: Sequences of events that modify states of a computation in order to accomplish some functional requirement
Societies: Sets of communities and relationships among them
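The five definitions map naturally onto simple data types. The sketch below is one possible reading, not the formal 5S definitions themselves; every class and function name here (`Structure`, `Space`, `Society`, `run_scenario`) is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field

# Stream: a sequence of elements of an arbitrary type (a plain tuple here).
text_stream = ("d", "i", "g", "i", "t", "a", "l")


@dataclass
class Structure:
    """Labeled directed graph: edges map (source, target) -> label."""
    nodes: set = field(default_factory=set)
    edges: dict = field(default_factory=dict)

    def add_edge(self, source, target, label):
        self.nodes.update((source, target))
        self.edges[(source, target)] = label


@dataclass
class Space:
    """A set together with named operations on that set."""
    elements: frozenset
    operations: dict


def run_scenario(state, events):
    """Scenario: a sequence of events, each mapping a state to a new state."""
    for event in events:
        state = event(state)
    return state


@dataclass
class Society:
    """Communities (name -> members) and relationships among them."""
    communities: dict
    relationships: set  # (community_a, relation, community_b)


# Hypothetical usage
doc = Structure()
doc.add_edge("thesis", "chapter1", "contains")
booleans = Space(frozenset({True, False}), {"and": lambda a, b: a and b})
final = run_scenario({"query": None},
                     [lambda s: {**s, "query": "5S"}, lambda s: {**s, "hits": 3}])
users = Society({"learners": {"ana"}, "teachers": {"ed"}},
                {("teachers", "instruct", "learners")})
```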
34
Overview of 5S and DL formal definitions and compositions (Gonçalves)
35
Semantic relationships among DL concepts: Partial concept map (Gonçalves)
36
5S Framework and DL Development (Gonçalves)
37
5SLGen: Automatic DL Generation
38
MARIAN DL Generation
[Figure labels: 5SL Design; Component Pool; MARIAN Digital Library Generator (XML parsers: DOM, SAX; MARIAN API); MARIAN Digital Library (Resource Manager Configuration and Processing Classes; Indexing Classes; User interfaces; Class managers; Loader)]
39
Challenges with Approach
The designer should know the 5S theory very well and be very familiar with the syntax and semantics of 5SL to be able to write correct 5SL files. It is difficult to get the big picture of a digital library just from a textual 5SL file.
40
5SGraph: A DL Modeling Tool (Qinwei Zhu MS Thesis)
Overall objective of 5SGraph: Help users model their own instances of a digital library (DL) in the 5S language (5SL). A simple modeling process which enables rapid generation of digital libraries is needed. Support non-expert users. Speed-up development process. Increase the quality of final product.
41
Goals of 5SGraph To help digital library designers understand the 5S model quickly and easily To help digital library designers build their own digital libraries without difficulty To help digital library designers transform their models into 5SL files automatically To help digital library designers understand, maintain, and upgrade existing digital library models conveniently
42
5SGraph How does 5SGraph work?
5SGraph loads and displays a metamodel in a structured toolbox. The structured editor of 5SGraph provides a top-down visual environment for the DL designer. 5SGraph produces correct 5SL files according to the visual model built by the designer.
43
Overview of 5SGraph
[Screenshot labels: Workspace (instance model); Structured toolbox (metamodel)]
44
Overview of 5SGraph (cont.)
Structured toolbox: Visualizes the metamodel. Shows the available concepts in the metamodel and the relationships between those concepts. Concepts in the structured toolbox can be added to the workspace.
Workspace: Visualizes the model. The place where the user creates his/her model.
45
Visualization Features
The structured toolbox Visualization of the metamodel Visual components that can be added Truncated display of trees Node-link representation Deep-node problem Icons Type/Instance relationship Cardinality
46
Component Reuse Components can be loaded/saved.
Load and save sub-trees Component reuse saves time and effort. Full reuse from component pool Partial reuse: adapting components
47
Functionalities of 5SGraph
Load/Close a metamodel Load/Save/Close a model Explore the structure of metamodel/model Add concepts from metamodel to model Delete concepts from model Change the properties of concepts Load/Save an existing concept Specify inter-model constraints
48
Open/Close metamodel
49
Load/Save/Close a model
50
Explore the structure of metamodel and model
51
Add a concept to user model
Top-down: Before adding a concept, make sure you have added its parent. You can only add a concept to its parent node, so make sure the parent node is selected before you add a new concept. If the highlight color is blue, the concept has satisfied all the requirements and can be added. If the highlight color is yellow, click the parent node in the workspace and then add the concept.
52
Add a concept to user model (cont.)
Double-click the concept in toolbox or Right-click and choose the item in the pop-up menu
53
Add a concept to user model (cont.)
54
Add a concept to user model (cont.)
55
Add a concept to user model (cont.)
56
Delete a concept
If the concept has no child concepts, click the concept first, then press the “Delete” key. If the concept has child concepts, delete the child concepts first, and then delete this concept.
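The add and delete rules described on these slides (a concept can only be added under its already-present, selected parent; a concept with children cannot be deleted) can be sketched as follows. This is not 5SGraph's code: `InstanceModel`, `ModelError`, and the dictionary-based metamodel are assumptions, and the sketch allows only one instance per concept for brevity.

```python
class ModelError(Exception):
    """Raised when an edit would break the top-down structure."""


class InstanceModel:
    def __init__(self, metamodel, root):
        self.metamodel = metamodel   # concept -> parent concept it must hang under
        self.children = {root: []}   # concept in the model -> its child concepts

    def add(self, concept, selected_parent):
        required_parent = self.metamodel.get(concept)
        if required_parent is None:
            raise ModelError(f"{concept!r} is not in the metamodel")
        if concept in self.children:
            raise ModelError(f"{concept!r} is already in the model")
        if required_parent not in self.children:
            raise ModelError(f"add {required_parent!r} before {concept!r}")
        if selected_parent != required_parent:
            raise ModelError(f"select {required_parent!r} before adding {concept!r}")
        self.children[required_parent].append(concept)
        self.children[concept] = []

    def delete(self, concept):
        if concept not in self.children:
            raise ModelError(f"{concept!r} is not in the model")
        if self.children[concept]:
            raise ModelError(f"delete the children of {concept!r} first")
        for siblings in self.children.values():
            if concept in siblings:
                siblings.remove(concept)
        del self.children[concept]


# Hypothetical metamodel fragment
model = InstanceModel({"Structures": "DL", "Collection": "Structures"}, root="DL")
model.add("Structures", "DL")
model.add("Collection", "Structures")
```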
57
Change the name and properties of concepts
58
Change the name and properties of concepts (cont.)
59
Change the name and properties of concepts (cont.)
60
Load/Save concepts
61
Semantic Constraints
There are inherent semantic constraints in the hierarchical structure of the 5S model. 5SGraph maintains these constraints and enforces them over the instance model to ensure correctness.
62
Example 1 (Constraint Enforcement)
An actor can only participate in the services that have been defined in the Scenario Model.
67
Example 2 (Constraint Enforcement)
A catalog has descriptive metadata for digital objects in a specific collection. Therefore, a catalog must have a 1:1 relationship with an existing collection. Thus, a catalog is not independent.
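Both example constraints can be written as checks over an instance model. The sketch below is a hedged illustration only: `check_constraints` and its argument shapes are assumptions, not 5SGraph's internal representation.

```python
def check_constraints(services, actors, collections, catalogs):
    """Return a list of constraint violations.

    services:    set of service names defined in the Scenario Model
    actors:      dict actor name -> set of services the actor participates in
    collections: set of collection names
    catalogs:    dict catalog name -> name of the collection it describes
    """
    errors = []
    # Example 1: an actor may only participate in defined services.
    for actor, used in sorted(actors.items()):
        for service in sorted(used - services):
            errors.append(f"actor {actor!r} uses undefined service {service!r}")
    # Example 2: each catalog pairs 1:1 with an existing collection.
    described_by = {}
    for catalog, collection in sorted(catalogs.items()):
        if collection not in collections:
            errors.append(f"catalog {catalog!r} refers to missing collection {collection!r}")
        elif collection in described_by:
            errors.append(f"collection {collection!r} already has catalog {described_by[collection]!r}")
        else:
            described_by[collection] = catalog
    return errors


# Hypothetical instance model with two violations
violations = check_constraints(
    services={"Searching"},
    actors={"Learner": {"Searching", "Rating"}},
    collections={"ETDs"},
    catalogs={"ETDCatalog": "ETDs", "Orphan": "Missing"},
)
```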
70
The Preliminary Test of 5SGraph
Research Questions Does the tool help users understand and use the 5S model to build their own digital libraries? Does the tool help users efficiently describe digital library models in the 5SL language? Are users satisfied with the tool?
71
The Preliminary Test of 5SGraph: Experimental Design
Three tasks: Build a simple digital library using existing components. Complete the partial model for CITIDEL. Build a model for NDLTD from scratch.
Three measures: Effectiveness, Efficiency, User satisfaction
17 subjects
72
Measures
Effectiveness: Completion rate; Goal achievement
Efficiency: Task completion time; Closeness to expertise (minimum task time divided by task time)
Satisfaction: Subjective rating
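As a worked illustration of the closeness-to-expertise measure, the sketch below divides the minimum task time by each subject's task time. The times are made up for the example, and taking the minimum over the subjects themselves (rather than, say, a separate expert's time) is an assumption.

```python
def closeness_to_expertise(task_times):
    """Minimum task time divided by each task time (1.0 means as fast as the fastest)."""
    fastest = min(task_times)
    return [fastest / t for t in task_times]


# Hypothetical task times in minutes for three subjects
scores = closeness_to_expertise([5.0, 10.0, 20.0])
mean_score = sum(scores) / len(scores)
```

With these illustrative times the scores are 1.0, 0.5, and 0.25; values nearer 1.0 indicate performance closer to the fastest subject.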
73
Test Results
Measure: Task 1, Task 2, Task 3
Completion Rate (%): 100
Mean Task Time (min): 11.3, 11.4, 15.1
Mean Closeness to Expertise: 0.483, 0.752, 0.712
Mean Goal Achievement (%): 97.4, 98.2
74
Satisfaction and Usefulness
The average rating of user satisfaction is 91%. The average rating of usefulness of the tool is 92%. Statistical analysis shows that the mean value of post-understanding of the 5S model is significantly greater than that of pre-understanding.
75
Educational Use Say a little about other problems.
76
Learnability
77
Semantic Modeling of Digital Library with Concept Maps
Customized “plugin” tool to model scenarios and societies Tools with common principles, abstractions, graphical notations, and operations Solution: Concept Maps Conceptual tools for organizing knowledge and representation
78
Conclusions
Presented a domain-specific visual modeling tool for DLs.
Evaluated the tool and proved efficiency, effectiveness, and learnability.
Built new tools based on concept maps for scenario and societies modeling.
79
Future work on 5SGraph
Integration of tools
Further usability studies with “digital librarians”
Using the tools as educational aids for teaching about digital libraries
80
Motivating Problems – Toward 5SLGen (MS Thesis of Rohit Kelapure)
Lack of general models for Digital Libraries (DLs)
Little focus on simplifying the process of modeling and building DLs
Divergent DL architectures
Monolithic: Tightly integrated and generally inflexible
Componentized: A network of interoperable components aggregated without a design methodology
Digital libraries are complex information systems that integrate research and findings from disciplines such as hypertext, information retrieval, multimedia, database management, and human-computer interaction. The broad and deep requirements of DLs demand models in order to better understand the interaction among their components. DLs are not benefiting from formal theories as other CS fields (DB, IR, PL, etc.) have. While much attention has been paid to making better digital libraries, little focus has been put on simplifying the process of modeling and building DLs. Most DLs are built by technical staff with little or no formal training in software engineering, or by computer scientists who have little background in information science. This has resulted in DLs of arbitrary complexity that are difficult to maintain and extend. DLs are built either as monolithic, tightly integrated, and generally inflexible systems or by assembling components.
81
Problems (contd.)
Lack of DL-specific modeling languages, software toolkits, prototyping and CASE tools
Lack of a scenario-based requirements analysis and design approach to DLs
Implication: Problems with interoperability and customizability
Domain-specific languages are explicitly designed to address a particular class of problems by offering specific abstractions and notations for the domain at hand. Current prototyping tools in this interdisciplinary domain are limited to database and web applications. They do not capture the specific abstractions and notations for the domain at hand, and they do not support specific DL patterns, models, and methodologies. A scenario is a sequence of events that occur during one particular execution of a system. Scenarios provide the means for capturing requirements specifications as well as a means of communication between users and software developers. Scenarios are key artifacts in systems engineering, but their management is poorly understood. A scenario-based design approach to DL development is lacking in current DL toolkits and prototyping tools.
Interoperability: Lack of definitional consensus, coupled with divergent approaches to building DLs, leads to problems in interoperability.
Customizability: The lack of a scenario-based approach results in an inability to tailor DL components and behaviors to particular user communities.
82
Approach
Based on the formal 5S theory: Streams, Structures, Spaces, Scenarios, and Societies
Use of domain-specific declarative languages (5SL)
Scenario-based requirements analysis and design
Componentized architectures
Automatic transformations/mappings from models to code
Special attention paid to issues of flexibility, reusability, and extensibility
Our approach is based on the formal theory of 5S, which provides a foundation for the DL generator. The 5S theory provides the abstractions of Streams, Structures, Spaces, Scenarios, and Societies to define DLs.
Domain-specific language: We use a domain-specific language based on 5S, called 5SL, for declarative specification and automatic generation of DLs. Domain-specific languages enable applications to be programmed with domain abstractions.
Scenario-based design: We use scenarios to describe the behavior of DL services and societal interactions.
Componentized architecture: The generated DL makes use of well-defined components that each carry out key DL functions, interacting with one another using lightweight protocols. We draw heavily upon work with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and Open Digital Libraries.
Automatic transformations: The DL generator implements automatic transformations from 5SLSocieties and 5SLScenarios models to code. We bridge the model-system gap by successively lowering the layer of abstraction.
Flexibility: 5SLGen exports models of 5SLSocieties to other modeling languages such as the Unified Modeling Language (UML) through XMI. This open interchange of 5SL DL models provides flexibility in modeling DLs.
Reusability: The set of 5SLSocieties and 5SLScenarios models generated during the course of this study can be reused as-is while implementing services for new DLs.
Extensibility: 5SL models for DL services can be extended or customized according to specific community needs.
83
Approach: 5SLGen
5SLGen is a new generic digital library generator.
It has been developed, implemented, and deployed in several applications.
5SLGen yields implementations of digital library services from models of DL “societies” and “scenarios” (and from the other “Ss”).
To address the issues raised above, we have developed, implemented, and deployed a new generic digital library generator yielding implementations of digital library services from models of DL “societies” and “scenarios”. What exactly it yields will be elaborated upon later; for now, “implementations of DL services” should suffice. A DL service exposes a specific functionality to its users to fulfill the users’ information needs. Basic DL services include services for indexing, searching, browsing, and cataloging digital resources. The title of this thesis is scenario-based generation of DL services rather than scenario-based generation of DLs, as we do not model and generate all components of a DL. We are only concerned with services that operate on the data rather than with storage, creation, archiving, and preservation. In the context of this work, the generation of DL services means generation of a DL that exposes the services without generating the content (data and metadata) stored in the DL.
84
5S Model / 5SL
Streams. Objective: Describes properties of the DL content. Primitives in 5SL: text, audio, video, pictures, …
Structures. Objective: Specifies organizational aspects of the DL content. Primitives in 5SL: digital object, metadata schema, collection, …
Spaces. Objective: Defines logical properties and presentational views of a DL. Primitives in 5SL: vector, probabilistic, boolean, …
Scenarios. Objective: Details the behavior of DL services. Primitives in 5SL: service, event, message, condition, action, state, …
Societies. Objective: Defines managers (responsible for running DL services), actors (those who use services), and their relationships. Primitives in 5SL: service managers, actors (e.g., learners, teachers, naïve users)
The 5S theory provides a justification for the modeling concepts of the 5S metamodel. This table describes the 5S metamodel for DLs. The 5S metamodel provides streams, structures, spaces, scenarios, and societies as fundamental modeling elements.
85
5SLGen: Model
The model defines composite services recursively as an aggregation of other services, composite or elementary. The application logic of a composite service is described by a workflow, i.e., a combination of control and data flows that mirror the behavior defined in the scenarios for the services. We chose Statecharts to represent the workflow of a service. Statecharts are a compact way of describing the dynamic aspects of the system; they extend finite state automata with hierarchical decomposition, concurrency, and structured transitions.
The distinct aspects of this model are:
1. The combination of an explicit workflow and service aggregation to support composite services.
2. The emphasis on scenario-based modeling of services and the automatic synthesis of Statecharts from them.
3. The role of the SM (a societal member) as the binding point for societal relationships, scenario interactions, and spatial visualizations.
From an architectural and implementation point of view, point 1 becomes significant, since combining a small set of basic DL services (like searching and browsing) from a pool of DL components allows a designer to model and generate most digital libraries (at least from the behavioral point of view) with a minimum amount of coding. Coding is unavoidable only when, for example, a specific behavior of a composite service (e.g., Multiclassificationbrowsing) is not defined by any component in the core pool or cannot be reused.
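The recursive definition of composite services, together with a workflow, can be illustrated with a short sketch. The names `Service`, `run_workflow`, and the example services are assumptions; the workflow here is a flat state machine, which is simpler than a real Statechart (no hierarchy or concurrency).

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A service is elementary (no parts) or a composite of other services."""
    name: str
    parts: list = field(default_factory=list)

    def is_composite(self):
        return bool(self.parts)

    def elementary_services(self):
        """Flatten the aggregation recursively down to elementary services."""
        if not self.parts:
            return [self.name]
        names = []
        for part in self.parts:
            names.extend(part.elementary_services())
        return names


def run_workflow(transitions, start, events):
    """Follow (state, event) -> next-state transitions; a flat stand-in for a Statechart."""
    state = start
    for event in events:
        state = transitions[(state, event)]
    return state


# Hypothetical services built from a small pool of elementary ones
search = Service("Search")
browse = Service("Browse")
rate = Service("Rate")
explore = Service("Explore", [search, browse])
explore_and_rate = Service("ExploreAndRate", [explore, rate])

workflow = {
    ("idle", "query"): "searching",
    ("searching", "results"): "browsing",
    ("browsing", "rate"): "rating",
}
end_state = run_workflow(workflow, "idle", ["query", "results", "rate"])
```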
86
5SLSocieties Model
Service Manager characteristics: Name, attributes, operations, type, visibility
Service Manager relationships: Associations, generalizations (extends), dependencies
We now explain the 5SLSocieties model. Large and complex systems involve a large number of SMs collaborating in different ways. When modeling 5SLSocieties we model both the SMs and the relationships in which they participate. Each SM is described by its name, attributes, operations, type, visibility, and its relationships. We model three kinds of relationships among SMs: associations, dependencies, and generalizations. The figure shows the five fundamental modeling constructs of the 5SLSocieties model, viz., attributes, operations, associations, generalizations, and dependencies, in a tree structure.
Name: The name of the SM serves to distinguish it from other SMs.
Attributes: An attribute represents a property of the SM that is shared by all instances of the SM.
Operations: An operation is an abstraction of something that you can do to an instance of the SM.
Dependencies: A dependency is a relationship that states that a change in specification of one thing may affect another thing that is dependent on it. It is also referred to as a “using” relationship.
Associations: An association is a structural relationship that specifies how SMs are connected to one another.
Generalization: A generalization represents a relationship between a general thing (parent SM) and a more specific kind of thing (child SM). It is also referred to as an “is-a” relationship.
Note: explain name, attribute, and operation with this and the next slide; explain association, dependency, and generalization with this and the next two slides.
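A minimal data-structure reading of the 5SLSocieties constructs is sketched below. It is not the 5SL schema: the class names, the `sm_type` field, and the tuple form of relationships are assumptions made for illustration.

```python
from dataclasses import dataclass, field

RELATIONSHIP_KINDS = ("association", "generalization", "dependency")


@dataclass
class ServiceManager:
    name: str
    sm_type: str = "class"        # hypothetical default
    visibility: str = "public"    # hypothetical default
    attributes: list = field(default_factory=list)
    operations: list = field(default_factory=list)


@dataclass
class SocietiesModel:
    managers: dict = field(default_factory=dict)        # name -> ServiceManager
    relationships: list = field(default_factory=list)   # (kind, source, target)

    def add(self, sm):
        if sm.name in self.managers:
            raise ValueError(f"duplicate SM name: {sm.name}")
        self.managers[sm.name] = sm

    def relate(self, kind, source, target):
        if kind not in RELATIONSHIP_KINDS:
            raise ValueError(f"unknown relationship kind: {kind}")
        for name in (source, target):
            if name not in self.managers:
                raise KeyError(name)
        self.relationships.append((kind, source, target))

    def parents(self, name):
        """Parent SMs reached through generalization ("is-a") relationships."""
        return [t for k, s, t in self.relationships
                if k == "generalization" and s == name]


# Hypothetical SMs
societies = SocietiesModel()
societies.add(ServiceManager("SearchManager", operations=["search"]))
societies.add(ServiceManager("FederatedSearchManager", operations=["search", "fuse"]))
societies.add(ServiceManager("IndexManager", attributes=["index"]))
societies.relate("generalization", "FederatedSearchManager", "SearchManager")
societies.relate("dependency", "SearchManager", "IndexManager")
```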
87
5SLScenarios Model
We now present the 5SLScenarios model for a service and illustrate it with an example. Each service has a name and is composed of one-to-many scenarios. Each scenario consists of a note, an interface object, a start message, and a list of events (see Figure). A note serves the purpose of documenting the scenario. The interface object specifies the SM responsible for receiving user input events and presenting output to the user. The start message specifies the state of the interface SM before the scenario begins. Each event is described by a sender, a receiver, the message between them, and the list of actions taken on receipt of the message. The sender sends the message to the receiver, which on receipt of the message takes the appropriate actions. (Explain event w.r.t. the next two slides.) A message has a name, a method, and a list of arguments. The method corresponds to the name of the operation defined in the 5SLSocieties model for the receiver SM. The name signifies the state of the SM after sending the message. An action comprises a name, a list of arguments, and any exceptions thrown. An action is taken by a sender on receiving an event. The difference between an action and a message is that an action is always taken passively in response to an event, whereas a message is sent actively by an SM.
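The scenario constructs described above can be sketched as plain records, with a deliberately naive merge of scenarios into a transition table. The merge is an assumption for illustration only and is not the scenario-synthesis algorithm of 5SLGen; it simply treats the start message and each message name as states and each message method as the triggering event.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    name: str        # state of the SM after sending the message
    method: str      # operation name on the receiver SM
    arguments: list = field(default_factory=list)


@dataclass
class Action:
    name: str
    arguments: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)


@dataclass
class Event:
    sender: str
    receiver: str
    message: Message
    actions: list = field(default_factory=list)


@dataclass
class Scenario:
    note: str
    interface_object: str
    start_message: str
    events: list = field(default_factory=list)


def synthesize(scenarios):
    """Naive merge: state -> {method: next state}, unioned over all scenarios."""
    transitions = {}
    for scenario in scenarios:
        state = scenario.start_message
        for event in scenario.events:
            transitions.setdefault(state, {})[event.message.method] = event.message.name
            state = event.message.name
    return transitions


# Hypothetical scenarios for a search service
simple = Scenario("user runs a query", "SearchUI", "idle", [
    Event("User", "SearchUI", Message("query_entered", "enterQuery", ["q"])),
    Event("SearchUI", "SearchManager", Message("searching", "search", ["q"]),
          [Action("lookup", ["q"])]),
])
cancel = Scenario("user cancels", "SearchUI", "idle", [
    Event("User", "SearchUI", Message("query_entered", "enterQuery", ["q"])),
    Event("User", "SearchUI", Message("idle", "cancel")),
])
table = synthesize([simple, cancel])
```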
88
Overview Architecture for DL Modeling and Generation
[Figure labels: DL Expert; 5S Meta Model; 5SGraph; DL Designer; 5SL Models; 5SLGen; component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …); Tailored DL Services; Practitioner; Researcher; Teacher]
The process of building a DL: The role of the DL expert is to design a metamodel for DLs, which will be used for modeling the DL. We have already defined the 5S metamodel for DLs. The DL expert may create a new metamodel, use the 5S metamodel, or extend the 5S metamodel for digital libraries. The 5SGraph modeling tool processes the metamodel, allowing the DL designer to visualize the components of the metamodel. The DL designer must be aware of the functional requirements. The designer uses the visualized graphical components of the metamodel to put together the final model of his or her own digital library. The DL requirements acquired with 5SGraph are captured in user-level 5SL models. The 5SL DL models are input to the 5SLGen DL generator, along with a pool of reusable DL components (a component implements a specific service such as search, browse, rate, etc.), to generate classes for the 5SFramework. The 5SFramework is a reusable design, expressed as a set of classes, that implements the modeled DL and supports reuse at a larger granularity than classes. This transformation from 5SL models to object-oriented classes involves scenario analysis, scenario synthesis, component pool utilization, and mapping of 5SL modeling elements to programming language concepts. The DL designer completes the modeling and generation process by modifying the generated 5SFramework classes and coupling them with the user interface provided by the web developer.
89
DLServices Implementation
Figure: 5SLGen architecture. Labels: DL designer; 5SLSocieties and 5SLScenarios models; societies-converter and scenarios-converter; Java classes; XMI-serialized model; synthesized Statechart; controller class; component pool (ODL Search and ODL Browse, imported through wrapping); JSP user-interface view; web implementation of the DL services.
This slide covers, in order, the input, the architectural components, and the output of 5SLGen.
5SLGen: Input
The 5SLGen DL generator takes three inputs: the 5SLSocieties model of the services being modeled, the 5SLScenarios model of the services being modeled, and the reusable set of classes that compose the component pool. The focus here is on the components, since the models were explained in the previous slides.
The functionality of each elementary service, such as searching or browsing, is provided by a component. A collection of such components is referred to as the component pool. The ODL project has helped constitute such a pool of components [11]. A component can be implemented in any language. To use components on a particular platform or with a specific language, we therefore need to define translators or wrappers that translate any foreign operation on a component into its native operation. In the context of 5SLGen, the ODL components, originally implemented in Perl, have been encapsulated through a Java interface, allowing them to be imported by the Java classes for the SMs. Any component that follows the ODL protocols can be included in the component pool by defining a wrapper that exposes its functionality in Java.
5SLGen: Architectural Components
5SLGen is composed primarily of two architectural components, the scenarios-converter and the societies-converter. They are responsible for transforming the 5SLSocieties and 5SLScenarios models into the 5SFramework classes that implement the DL services. We focus on the functionality of the individual 5SLGen components, that is, on who is doing what rather than how it is done.
The societies-converter operates on the 5SLSocieties model. "Model" here means the user-level model, i.e., an instance of the 5SLSocieties and 5SLScenarios models. The societies-converter transforms the 5SLSocieties model into a programming-language-specific code skeleton. The generated Java classes, together with the classes from the component pool, constitute the application logic (the model) of the DL. The societies-converter also serializes the 5SLSocieties model to XMI, which entails deriving a mapping of the UML metamodel to the 5S metamodel.
The scenarios-converter operates on the 5SLScenarios model. It applies the scenario-synthesis algorithm to the 5SLScenarios models and generates a Statechart for the DL services. This transformation from multiple service scenarios to a single Statechart is explained later. The generated Statechart can be input to any valid state machine compiler to generate a controller for the DL services. The controller is responsible for interpreting the user's request, producing the appropriate request for action, examining the result of each action, and deciding what to do next. The controller class is based on the State design pattern of the Gang of Four (GoF). The State pattern localizes state-specific behavior in an individual class for each state. In the context of the 5SFramework, when the controller class receives an event, it delegates the request to its state object, which provides the appropriate state-specific behavior.
5SLGen: Output
The architecture of the generated 5SFramework classes can be divided into three parts:
a. The logic of the application (the model). The model consists of the SMs from the component pool and other SMs that are implemented or extended by the DL designer.
b. The control of the interaction triggered by the user's actions (the controller). The controller class is generated by synthesizing all the scenarios of the composite service.
c. The interface presented to the user (the view). The view embodies the presentation logic for assembling the user interface.
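As a rough illustration of the State pattern described in the notes for this slide, the sketch below shows a controller that delegates each event to its current state object. It is written in Python for brevity, whereas the generated 5SFramework classes are Java. The SearchResults state and the back event are hypothetical names chosen for the example, not taken from 5SLGen.

```python
class State:
    """Base class: each concrete state supplies its own event handling."""
    name = "State"

    def handle(self, controller, event):
        raise NotImplementedError


class MainMenu(State):
    name = "MainMenu"

    def handle(self, controller, event):
        # Only search_query is meaningful here; other events are ignored.
        if event == "search_query":
            controller.log.append("run search")
            controller.state = SearchResults()


class SearchResults(State):  # hypothetical state, for illustration only
    name = "SearchResults"

    def handle(self, controller, event):
        if event == "back":  # hypothetical event
            controller.log.append("show menu")
            controller.state = MainMenu()


class Controller:
    """Holds the current state and delegates every event to it."""

    def __init__(self):
        self.state = MainMenu()
        self.log = []

    def on_event(self, event):
        self.state.handle(self, event)


controller = Controller()
controller.on_event("search_query")
controller.on_event("back")
```

The controller itself contains no per-state branching: adding a state means adding a class, which is the property the notes attribute to the State pattern.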
90
Societies-converter: Workflow
Figure: workflow of the societies-converter. Labels: DL designer; 5SLSocieties model; JDOM Transform; Java representation of the model; Java Mapper; Java classes; XMISerializer; XMI:Class; XMI2Java.
JDOM Transform: The 5SLSocieties model is parsed by an XML parser to generate an in-memory representation of the input model. Constructing the parse tree is essential, since it captures the document in a data structure on which all further processing operates. After the parse tree is constructed, two kinds of processing occur: mapping to Java code and serialization to XMI. These correspond to the two workflows of the societies-converter component.
Java Mapper: The Java Mapper maps the constructs of the 5SLSocieties model to Java.
XMISerializer: The XMISerializer applies a set of transformation rules to create an XMI representation of the 5SLSocieties model. It operates much like the Java Mapper, but instead of mapping to Java classes it maps from the 5SLSocieties model to the UML DTD for XMI. For a 5SLSocieties model containing 139 lines of XML, the corresponding XMI document is 1080 lines long, which gives an idea of the verbosity of the XMI standard and the complexity of the task.
XMI:Class: The XMI generated by the XMISerializer is referred to as XMI:Class because only the 5SLSocieties model is mapped to XMI. A complete mapping to XMI would also require mapping the 5SLScenarios model. The XMI:Class model can be imported into other CASE tools.
XMI2Java: Once the XMI:Class model has been generated by the XMISerializer, one can use open-source XMI2Java tools or CASE tools for XMI-to-Java conversion. The dotted arrow in Figure 5.1 from the XMI2Java package to the Java classes indicates that the workflow involving the XMISerializer, XMI:Class, and XMI2Java is optional. With XMI, the DL designer has the flexibility to move away from 5SLGen’s Java Mapper to other code generation tools and CASE tools.
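The parse-then-map workflow above can be sketched in a few lines. This is only an illustration under stated assumptions: the XML element and attribute names (societies, manager, operation) are invented for the example and are not the actual 5SLSocieties schema, and Python's standard xml.etree.ElementTree stands in for JDOM. The mapper emits a Java class skeleton as text, mirroring the role the notes give to the Java Mapper.

```python
import xml.etree.ElementTree as ET

# Hypothetical model; the real 5SLSocieties schema is not shown on the slide.
MODEL_XML = """
<societies>
  <manager name="SearchManager">
    <operation name="search"/>
    <operation name="rank"/>
  </manager>
  <manager name="BrowseManager">
    <operation name="browse"/>
  </manager>
</societies>
"""


def parse_model(xml_text):
    """Build an in-memory representation: manager name -> operation names."""
    root = ET.fromstring(xml_text)
    return {
        manager.get("name"): [op.get("name") for op in manager.findall("operation")]
        for manager in root.findall("manager")
    }


def to_skeleton(class_name, operations):
    """Map one manager to the text of a Java class skeleton."""
    lines = [f"public class {class_name} {{"]
    for op in operations:
        lines.append(f"    public void {op}() {{ /* TODO */ }}")
    lines.append("}")
    return "\n".join(lines)


model = parse_model(MODEL_XML)
skeletons = {name: to_skeleton(name, ops) for name, ops in model.items()}
```

An XMI serializer would walk the same in-memory model and emit XMI elements instead of Java text, which is why the notes describe the two workflows as parallel.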
91
Scenarios-Converter: Workflow
Figure: workflow of the scenarios-converter. Labels: DL designer; 5SLScenarios model; JDOM Transform; Scenario Synthesizer; synthesized Statechart; State Machine Compiler; Java controller class (State design pattern).
As before, the first step is parsing the 5SLScenarios model to create a parse tree. This is done by the JDOM Transform package.
92
Relevance Feedback Search Service UML Sequence Diagram
Sequence diagram showing the interactions modeled with the 5SLScenarios instance (callout: event seq. no. = 3).
93
Scenarios-converter: Scenario-Synthesis
We illustrate the implementation of the scenario-synthesis algorithm using the scenarios for the Union Catalog DL. The Union Catalog DL exposes a composite service comprising three elementary services: search, browse, and search-similar (search for all items similar to a particular item). For clarity, we explain scenario synthesis using only the primary scenario of each service. Scenario synthesis is always carried out with respect to a particular object; here it is carried out for the controller SM (UIMFSM) of the composite service. The algorithm consists of two transformations that are applied to the scenarios of a service.
Consider the sequence diagram and the corresponding Statechart for the search scenario of the Union Catalog DL. doGet, search_query, and return_search_results are incoming events in the sequence diagram. They are mapped to the triggers of the Statechart. On receiving the doGet, search_query, and return_search_results events, the controller carries out the UIMFSM.parseRequest, ODLSearch.odlsearch, and ODLUtilties.transformResults actions, respectively. These actions are mapped to their corresponding triggers. This mapping of triggers to actions completely specifies all the transitions of the Statechart. The outgoing event names, search and display, together with the initial state MainMenu, constitute the states of the Statechart. The initial state of the controller is specified in the 5SLScenarios model for the service.
Rule 1:
Incoming events trigger transitions.
Outgoing events become actions of the transitions leading to states.
On receiving an event, the receiver carries out the list of actions associated with that event.
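A simplified reading of Rule 1 can be sketched as follows. The event, action, and state names come from the search scenario described above, but the pairing of each incoming event with an outgoing event is an assumption of this sketch (the slide does not spell it out), as is the choice to stay in the current state when a step has no outgoing event.

```python
from collections import namedtuple

# One step of a scenario as seen by the controller object.
Step = namedtuple("Step", "incoming action outgoing")
# One Statechart transition: trigger and action between two states.
Transition = namedtuple("Transition", "source trigger action target")


def scenario_to_transitions(initial_state, steps):
    """Turn an ordered scenario into transitions, one per incoming event."""
    transitions = []
    current = initial_state
    for step in steps:
        # Assumption: the outgoing event names the state that is reached;
        # with no outgoing event, the controller stays where it is.
        target = step.outgoing if step.outgoing is not None else current
        transitions.append(Transition(current, step.incoming, step.action, target))
        current = target
    return transitions


# Assumed pairing of incoming and outgoing events for the search scenario.
SEARCH_SCENARIO = [
    Step("doGet", "UIMFSM.parseRequest", "search"),
    Step("search_query", "ODLSearch.odlsearch", "display"),
    Step("return_search_results", "ODLUtilties.transformResults", None),
]

search_chart = scenario_to_transitions("MainMenu", SEARCH_SCENARIO)
states = {"MainMenu"} | {t.target for t in search_chart}
```

Under these assumptions the resulting states are MainMenu, search, and display, which matches the set of states named in the notes.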
94
Scenarios-converter: Scenario-Synthesis (contd.)
The two figures show the Statecharts generated by applying Rule 1 to the search, browse, and search-similar scenarios.
Rule 2: Merge all the Statecharts generated in step 1 according to the following rules:
2.1 In a succession of two scenarios, the resulting Statechart merges the two corresponding basic Statecharts in temporal order.
2.2 If a transition is common to the two Statecharts, it is taken only once in the final Statechart.
2.3 If, at a given moment, either one scenario or another is executed, the Statecharts are combined as OR-type substates within a composite state.
2.4 If two scenarios are executed at the same time, they are combined as AND-type substates within a composite state.
95
Synthesized-Statechart
The figure shows the synthesized Statechart that results from applying rules 2.1, 2.2, and 2.3 (callout in the figure: component Statecharts). The search-similar scenario follows the search and browse scenarios in temporal order (rule 2.1). The doGet transition in the MainMenu state is common to both the search and the browse scenarios and appears only once in the synthesized Statechart (rule 2.2). The search and browse states are joined in an OR relationship (rule 2.3). Rule 2.4 is not applied when merging the generated Statecharts, because two scenarios are never executed simultaneously.
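The effect of rule 2.2, and the OR choice of rule 2.3, can be illustrated with a small sketch. It is not the synthesis algorithm used by the scenarios-converter: Statecharts are reduced to flat lists of transitions, the ServiceChoice state is a hypothetical stand-in for the composite state, and the browse_request event and browse_action action are invented names.

```python
def merge_statecharts(*charts):
    """Union of transitions in first-seen order; a transition shared by
    several charts is kept only once (rule 2.2)."""
    merged, seen = [], set()
    for chart in charts:
        for transition in chart:
            if transition not in seen:
                seen.add(transition)
                merged.append(transition)
    return merged


def or_targets(chart, state):
    """States reachable in one step from `state`; more than one target
    means the alternatives are OR-ed (rule 2.3)."""
    return [target for (source, _trigger, _action, target) in chart if source == state]


# Each transition is (source, trigger, action, target).
search = [
    ("MainMenu", "doGet", "UIMFSM.parseRequest", "ServiceChoice"),
    ("ServiceChoice", "search_query", "ODLSearch.odlsearch", "search"),
]
browse = [
    ("MainMenu", "doGet", "UIMFSM.parseRequest", "ServiceChoice"),
    ("ServiceChoice", "browse_request", "browse_action", "browse"),  # hypothetical names
]

merged = merge_statecharts(search, browse)
```

Here the shared doGet transition survives once, and ServiceChoice leads to either search or browse, which is the OR relationship described in the notes.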
96
Generated DLs
Union Catalog: a simple DL with maximum reuse; two components used (Search and Browse).
CITIDEL, including VIADUCT: aggregates all the 5SLSocieties and 5SLScenarios models for its elementary services.
Thus, the approach we followed was to model and generate the services first, and then to model and generate the DL. VIADUCT completes the implementation of CITIDEL and its sister DLs.
97
Generated DL Services
CITIDEL: Relevance Feedback Search Service. Demonstrates extensibility with the ODL Search component.
CITIDEL: Profile Based Filtering Service. Demonstrates reusability with the ODL Browse component.
CITIDEL: Multi-Classification Browsing Service. Generates complex services without any component reuse.
CITIDEL: Binding Service. Completes the set of CITIDEL services.
98
Profile Based Filtering (PBF) Service 5SFramework
Figure: 5SFramework classes for the PBF service. Labels: ODL-Browse component; Controller; Model; View.
The figure shows three sets of SMs: the controller (UIMFSM), the view SM (ProfileFilteringImpl), and the model SMs (ProfileBasedFilteringImpl, SAXPrintHandler, ODLBrowseImpl). Together, these SMs constitute the 5SFramework classes. This demonstrates the architecture of the 5SFramework classes, which is based on the MVC design pattern.
99
Conclusion
Introduced a scenario-based approach to the generation of componentized DLs.
Applied the 5SFramework to the generation of DLs.
Partially validated the theory of 5S.
Demonstrated that complex DLs can be built on the basis of a formal theory.
Adherence to open standards (OAI-PMH, ODL, XMI, UML) and established design patterns (MVC, GoF’s State) ensures the relevance and extensibility of our work.
100
Future Work
Integration of 5SLGen with 5SGraph
Improvements to the 5SFramework architecture
Scalability of the generated DLs and DL services
Automated construction of user interfaces with statecharts
Support for transaction scoping and error handling
Web services support
Incorporating the uPortal framework
Model validation
Personalization of the 5S approach using PIPE
Integration of 5SLGen with 5SGraph: 5SGraph is a visual tool for modeling DLs. In its current state, 5SGraph does not support the modeling of 5SLScenarios and 5SLSocieties. It needs to support the creation and reuse of the 5SLScenarios and 5SLSocieties models before the two tools can be integrated. Integrating 5SGraph and 5SLGen will result in a complete CASE tool based on the 5S theory that supports the complete development lifecycle of DLs.
Incorporating the uPortal framework: uPortal is a distributed, multi-tiered Internet application framework for developing a web portal and developing content for display within that portal. The uPortal framework consists of a number of uPortal components, or portlets, that communicate via XML. uPortal has a number of components that provide services such as LDAP authentication and single sign-on. The 5SFramework could incorporate the uPortal framework for the presentation of DL content and include its components in the component pool, thereby allowing the 5S approach to build on the advantages of this open-source technology. Because uPortal components talk to one another through XML, the integration with 5SLGen should be smooth.
Scalability of the generated DLs and DL services: Each DL or DL service implemented using the 5SFramework has a single controller. Thus, if two or more users use the service simultaneously, the single controller cannot maintain a consistent state. The solution is to create an instance of the controller for every user.
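The proposed fix for the scalability problem, one controller instance per user, might look like the sketch below. It is an assumption-laden illustration, not part of 5SLGen: the ControllerRegistry class, the user identifiers, and the one-transition Controller are all invented for the example.

```python
class Controller:
    """Minimal stand-in for a generated controller: it tracks one state name."""

    def __init__(self):
        self.state = "MainMenu"

    def on_event(self, event):
        if self.state == "MainMenu" and event == "search_query":
            self.state = "search"


class ControllerRegistry:
    """Hands out one controller per user so that states never mix."""

    def __init__(self):
        self._by_user = {}

    def for_user(self, user_id):
        # Create the user's controller on first use, then reuse it.
        if user_id not in self._by_user:
            self._by_user[user_id] = Controller()
        return self._by_user[user_id]


registry = ControllerRegistry()
registry.for_user("alice").on_event("search_query")
alice_state = registry.for_user("alice").state
bob_state = registry.for_user("bob").state
```

With a single shared controller, the second user would have inherited the first user's state; with the registry, each user starts from MainMenu independently.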
Model validation: We need to validate 5SL models to ensure the correctness and consistency of the generated DLs. This will involve comparing the real systems and the modeled systems on a number of metrics, including implementation time, complexity of the generated code, functionality, maintainability, and reliability. An alternative way to perform model validation would be to carry out Wizard of Oz tests.
Personalization of the 5S approach: PIPE is an approach to personalizing information-seeking interactions by transforming programmatic representations. PIPE models can be mapped onto 5SL models. The implementation of 5SLGen needs to be adapted to accept personalized 5SL DL models as input, so as to generate personalized 5SFramework classes that can be customized to implement a personalizable DL.
101
DL Student Research: Torres
Search in collections of fish images using combination of image properties (CBIR) and textual descriptions
102
Textual information retrieval
Query on Google using Sunset and Rio de Janeiro Query result
103
Content Based Information Retrieval
104
Torres: Visualizations
Concentric Rings Pattern Spiral Pattern
105
DL Student Research: Shen
5S and component architecture to allow handling of very complex DL applications: archaeology
Information visualization, clustering
Mappings across streams, structure, spaces
106
Case Study (Archaeology): ETANA
NSF ITR with CWRU (and Vanderbilt …)
Faster DL development for complex application domains, with suitable tailoring
Approach:
ODL – pool of components
5S – theory-based generation of systems
107
ETANA Website
108
Lahav Website
109
Megiddo Opening Screen
110
Locus Screen: Pictures
View all
111
Area Screen: Distribution of Artifacts
113
ETANA-DL Website
114
Archaeology DL – Approach
Solve the following DL problems: interoperability, making primary data available, data preservation
Modeling archaeological information systems using 5S theory to design system and services
Rapidly prototyping DLs that handle heterogeneous archaeological data using componentized frameworks
115
ETANA-DL Schema Design
Figure: ETANA-DL schema. Labels: ETANA-DL Object; Collection; Container; Locus; Partition; Subpartition; ID; Owner; Name; Description; Dimensions; Count; Species; Animal; Bone; Seed; Figurine; ……
116
Data Mapping
117
ETANA-DL Architecture
Figure: ETANA-DL architecture. Labels: DigKit; DigBase; ETANA-DL; Union; Users; Services; Data.
118
ETANA-DL Architecture DigBase and DigKit
Figure: ETANA-DL architecture with DigBase and DigKit. Labels: User Interface; services (Search, Browse, Recommend, Note, Personalize, Review, Visualizations, Archaeology Specific, …); ETANA-DL Union Catalog; Database Wrapper; sites (Lahav, Nimrin, Umayri, Hisban, Megiddo, Jalul, New Sites). Work in progress.
119
Architecture DigKit DigBase Inverted Files Search XOAI
Figure: architecture diagram. Labels: Web Interface; Search; Browse; Other ETANA-DL Services; Services DB; Union Catalog; Inverted Files; Index; Browse Component; Browse DB; XOAI; DigBase DB; Data Mapping; OAI Data Provider; OAI; Archaeological Site; DigKit; Configure.
120
Searching – Search Results
121
Searching – Advanced Search
122
Searching – Advanced Search Results
123
Review of Gonçalves Achievements in Past Year
Book chapters
Fox, E. A., Gonçalves, M. A., Luo, M., Chen, Y., Krowne, A., Zhang, B., McDevitt, K., Pérez-Quiñones, M., Cassel, L. N. Harvesting: Broadening the Field of Distributed Information Retrieval. In Multimedia Distributed Information Retrieval, eds. Fabio Crestani, Mark Sanderson, and Jamie Callan, 2003.
Fox, E., McMillan, G., Suleman, H., Gonçalves, M. Networked Digital Library of Theses and Dissertations. Invited chapter for “Digital Libraries: Policy, Planning, and Practice”, eds. Judith Andrews and Derek Law, Ashgate Publishing, 2003.
Journal papers
5S TOIS paper (April 2004 issue).
S. Perugini, M. A. Gonçalves, and E. A. Fox. A Connection-Centric Survey of Recommender Systems Research. Journal of Intelligent Information Systems, June 2004.
Zhu, Q., Gonçalves, M. A., Fox, E. A. 5SGraph: A Domain-Specific Visual Modeling Tool for Digital Libraries. Journal of the American Society for Information Science and Technology, submitted 2003, in revision.
Baoping Zhang, Marcos Andre Goncalves, Yuxin Chen, Edward A. Fox, and Pavel Calado. Combining Support Vector Machines and Structural Rules for Effective Filtering of OAI-Based Repositories. Submitted to Journal of Digital Libraries (Springer Verlag), Special Issue on Asian Digital Libraries, 2004.
124
Conference papers
Pável P. Calado, Marcos André Gonçalves, Edward A. Fox, Berthier Ribeiro-Neto, Alberto H. F. Laender, Altigran S. da Silva, Davi C. Reis, Pablo A. Roberto, Monique V. Vieira, and Juliano P. Lage. The Web-DL Environment for Building Digital Libraries from the Web. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.
Marcos André Gonçalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox, Filip Jagodzinski, and Lillian Cassel. The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment. Proc. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.
Qinwei Zhu, Marcos André Gonçalves, Rao Shen, Lillian Cassel, Edward A. Fox. Visual Semantic Modeling of Digital Libraries. ECDL'2003, 7th European Conference on Research and Advanced Technology for Digital Libraries, August 2003, Trondheim, Norway.
Rohit Kelapure, Marcos André Gonçalves, Edward A. Fox. Scenario-Based Generation of Digital Library Services. ECDL'2003, 7th European Conference on Research and Advanced Technology for Digital Libraries, August 2003, Trondheim, Norway.
Marco Cristo, Pavel Calado, Edleno Moura, Nivio Ziviani, Berthier Ribeiro-Neto, and Marcos André Gonçalves. Combining Link-Based and Content-Based Methods for Web Document Classification. CIKM 2003, 3-8 November 2003, New Orleans, Louisiana, USA.
Baoping Zhang, Marcos Andre Goncalves, and Edward A. Fox. An OAI-based Filtering Service for CITIDEL from NDLTD. ICADL 2003, 6th International Conference of Asian Digital Libraries, 8-11 December 2003, Kuala Lumpur, Malaysia.
U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. ETANA-DL: A Digital Library for Integrated Handling of Heterogeneous Archaeological Data. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.
125
Conference papers Other publications
Conference papers (continued)
M. A. Goncalves, E. A. Fox, A. Krowne, P. Calado, A. H. F. Laender, A. S. da Silva, and B. Ribeiro-Neto. The Effectiveness of Automatically Structured Queries in Digital Libraries. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.
Alberto H. F. Laender, M. A. Goncalves, Pablo A. Roberto. BDBComp: Building a Digital Library for the Brazilian Computer Science Community. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.
U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. Prototyping Digital Libraries Handling Heterogeneous Data Sources - The ETANA-DL Case Study. European Conference on Digital Libraries (ECDL 2004), Bath, UK, September 12-17 (submitted).
Other publications
R. da S. Torres, C. B. Medeiros, M. A. Goncalves, and E. A. Fox. An OAI-based Digital Library Framework for Biodiversity Information Systems. Department of Computer Science, Virginia Tech, Technical Report No. TR-04-01, 2004.
R. da S. Torres, C. B. Medeiros, M. A. Goncalves, and E. A. Fox. An OAI Compliant Content-Based Image Search Component. Demo to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.
R. da S. Torres, C. B. Medeiros, Renata Q. Dividino, Mauricio A. Figueiredo, M. A. Goncalves, E. A. Fox, and R. Richardson. Using Digital Library Components for Biodiversity Systems. Poster to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.
U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. ETANA-DL: Managing Complex Information Applications – An Archaeology Digital Library. Demo to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.
Qinwei Zhu, Marcos André Gonçalves, E. Fox. 5SGraph Demo: A Graphical Modeling Tool for Digital Libraries. Proc. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.
126
Proposed Outline of Dissertation (Marcos André Gonçalves)
Chapter 1 – Introduction and Motivation
Chapter 2 – Background and Related Work
Chapter 3 – Streams, Structures, Spaces, Scenarios and Societies: the 5S Formal Model for Digital Libraries
Chapter 4 – Towards a Digital Library Theory: A Formal Digital Library Ontology based on 5S
Chapter 5 – Applications of the 5S Model/Ontology
5.1 Declarative Specification of DLs: the 5S Language
5.2 Semantic Visual Modeling of DLs: the 5SGraph Tool
5.3 (Semi-) Automatic Generation of Componentized DLs: The 5SGen Tool
5.4 Evaluating DLs: The XML Log Standard for DLs
5.5 Formally comparing Architectures: Fedora and Buckets (time permitting)
Chapter 6 – Defining Quality in Digital Libraries
Chapter 7 – Conclusions and Future Work
Appendix 1 – Mathematical Preliminaries
127
Questions/Discussion?