Nnadi & Bieber, NJIT ©
Lightweight Integration of Documents and Services
(Digital Library Integration Infrastructure)
Nkechi Nnadi and Michael Bieber
Information Systems Department
College of Computing Sciences
New Jersey Institute of Technology
November 2004

Integration through Linking
- automatically generate link anchors on elements we recognize, based on:
  - structural relationships
  - lexical relationships
- automatically generate links:
  - to related information
  - to relevant services
==> lightweight integration of
  - documents containing links and
  - documents/services the links point to

Prototype
Services for a launch-date element:
- search by launch date
- search by month and year
- search by year

Prototype
Services for a document element:
- open
- summarize in 3 sentences

Mock-up for a library database
Services from multiple systems (customized to user tasks/preferences)

Two Types of Links:
(1) structural: based on element type
    * title, author, source
(2) lexical: found in a glossary
[Figure callouts: structural elements and links; lexical elements and links]

Structural Relationships
Links are generated based on application structure, not search or lexical analysis
- You cannot search on the display text “$127,322.12” to find related information…
- But you can find relationships for the element Sales[2002]
[Figure: table of Expenses and Sales for 2002; values shown include $85,101.99]

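The point above can be sketched in Python: links are looked up from the element's structural identity, never from its display text. This is a minimal illustration, not the authors' implementation. The `Element` class, the `RELATIONSHIPS` table and its labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    element_type: str  # structural type from the application schema, e.g. "Sales"
    key: str           # identifies the instance, e.g. the year "2002"
    display_text: str  # what the user sees; never used for the lookup


# Hypothetical relationships, registered per element type (not per display text).
RELATIONSHIPS = {
    "Sales": ["Expenses for the same year", "Sales for the previous year"],
    "Expenses": ["Sales for the same year"],
}


def links_for(element: Element) -> list[str]:
    """Return link labels for an element based only on its structural type."""
    labels = RELATIONSHIPS.get(element.element_type, [])
    return [f"{label} ({element.element_type}[{element.key}])" for label in labels]


sales_2002 = Element("Sales", "2002", "$127,322.12")
for label in links_for(sales_2002):
    print(label)
```

Two elements with the same display text but different types would receive different links, which a text search could not achieve.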
Three Types of Integration:
(1) for documents to receive anchors and links
(2) to provide services (which become links)
(3) to provide glossaries for content analysis
For (1): requires a document schema mapper to recognize structural elements:
- wrapper
- fixed template
- XML markup
- etc.

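For the XML-markup case, a schema mapper could look roughly like the Python sketch below, which uses the standard library's `xml.etree.ElementTree` and its limited XPath support. The `SCHEMA_MAP` paths, the tag names and the sample record are invented for illustration. A real mapper would be written against the actual document format of each system.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from XPath-style paths to structural element types.
SCHEMA_MAP = {
    ".//title": "title",
    ".//author": "author",
    ".//launchDate": "launch-date",
}


def map_elements(xml_text: str) -> list[tuple[str, str]]:
    """Return (element type, element instance) pairs recognized in the document."""
    root = ET.fromstring(xml_text)
    found = []
    for path, element_type in SCHEMA_MAP.items():
        for node in root.findall(path):
            found.append((element_type, (node.text or "").strip()))
    return found


# Made-up sample record, not real collection data.
SAMPLE = """<record>
  <title>Example Mission</title>
  <author>A. Author</author>
  <launchDate>1997-10-15</launchDate>
</record>"""

for element_type, value in map_elements(SAMPLE):
    print(element_type, "->", value)
```

Each recognized pair is what later receives an anchor. The wrapper and fixed-template cases would produce the same kind of output from screen layouts instead of markup.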
Three Types of Integration:
(1) for documents to receive anchors and links
(2) to provide services (which become links)
(3) to provide glossaries for content analysis
For (2): linking rules represent every service that a system can provide for each kind of element.

Example Linking Rule from the AskNSDL system
- a) element type (“concept”)
- b) link display label (“Ask an expert about this concept”)
- c) relationship metadata
- d) destination collection or service (“Ask NSDL”)
- e) the exact command to send to the destination system (logs the user into AskNSDL, opens the question template, fills in the element instance (e.g., “Plant Pathology”) as the subject, and places the cursor in the question area)
- f) any relevant conditions for including this relationship

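The six parts (a) to (f) of a linking rule map naturally onto a small record type. The Python sketch below is one possible encoding under stated assumptions: the `LinkingRule` class, the metadata keys and the `asknsdl.example` URL template are hypothetical, and the real command sent to AskNSDL is not given in the slides.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import quote


@dataclass
class LinkingRule:
    element_type: str            # (a) kind of element the rule applies to
    label: str                   # (b) link display label
    relationship_metadata: dict  # (c) metadata describing the relationship
    destination: str             # (d) destination collection or service
    command_template: str        # (e) command for the destination system
    condition: Callable[[str], bool] = lambda instance: True  # (f) inclusion test

    def link_for(self, instance: str) -> Optional[tuple[str, str]]:
        """Return (label, command) for an element instance, or None if excluded."""
        if not self.condition(instance):
            return None
        return self.label, self.command_template.format(subject=quote(instance))


rule = LinkingRule(
    element_type="concept",
    label="Ask an expert about this concept",
    relationship_metadata={"semantic_type": "ask-expert"},  # invented key and value
    destination="Ask NSDL",
    command_template="https://asknsdl.example/ask?subject={subject}",  # placeholder URL
)

label, command = rule.link_for("Plant Pathology")
print(label, "->", command)
```

Because the rule carries everything needed to build the link, a new service can be added by registering one more rule, without touching the documents it will appear in.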
Three Types of Integration:
(1) for documents to receive anchors and links
(2) to provide services (which become links)
(3) to provide glossaries for content analysis
For (3): lexical analysis by:
- NJIT Noun Phrase Extractor
- NJIT Ontology Developer

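The NJIT tools are not described in enough detail here to reproduce, so the Python sketch below swaps in a much simpler technique: plain glossary lookup with regular expressions. It only illustrates how glossary terms found in a text could become lexical link anchors. The function name and the sample glossary are hypothetical.

```python
import re


def find_lexical_anchors(text: str, glossary: set[str]) -> list[tuple[int, int, str]]:
    """Return (start, end, term) for glossary terms in text, preferring longer terms."""
    anchors = []
    taken = []  # character spans already claimed by a longer term
    for term in sorted(glossary, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            overlaps = any(match.start() < end and match.end() > start for start, end in taken)
            if not overlaps:
                taken.append((match.start(), match.end()))
                anchors.append((match.start(), match.end(), term))
    return sorted(anchors)


glossary = {"plant pathology", "plant"}
text = "Plant pathology studies plant disease."
for start, end, term in find_lexical_anchors(text, glossary):
    print(start, end, term)
```

Matching longer terms first keeps “plant pathology” as one anchor instead of splitting off “plant”. A noun phrase extractor would go further by proposing candidate phrases that are not yet in any glossary.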
Each system is integrated independently:
(1) Schema mappers for individual systems
(2) Linking rules are “plugged in” independently for each service
(3) Glossaries and thesauri can be independent of other systems

Contributions
- straightforward, sustainable approach for integrating documents and services
  - lightweight integration through linking
- developing filtering mechanisms for customizing large sets of links
- combining structural links with content-based links using advanced lexical analysis tools (NJIT Noun Phrase Extractor and Ontology Developer)

Looking for Collaboration
- Additional document systems, digital library collections, services and glossaries to integrate
- Libraries to integrate
- Web services to integrate
- Other suggestions welcome!

Ph.D. Research Opportunities
- Integration through linking
- Customizing services to work in many domains
- Service chaining: creating new services by chaining existing ones
  - e.g., translate and summarize documents
- Collaborative filtering for digital libraries
- Lexical analysis
  - automatic summarization of returned documents
  - including maintaining multiple glossaries/thesauri
- Full virtual community support
  - analysis, tools, processes, evaluation

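Service chaining, as listed above, amounts to composing services so that the output of one becomes the input of the next. The Python sketch below is a minimal illustration under loud assumptions: `translate` and `summarize` are trivial stand-ins invented here, not real translation or summarization services.

```python
from functools import reduce
from typing import Callable

Service = Callable[[str], str]


def chain(*services: Service) -> Service:
    """Create a new service that applies the given services left to right."""
    return lambda document: reduce(lambda doc, service: service(doc), services, document)


def translate(document: str) -> str:
    # Stand-in for a real translation service: only tags the text.
    return "[translated] " + document


def summarize(document: str, max_sentences: int = 3) -> str:
    # Stand-in for a real summarizer: keeps the first few sentences.
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    if not sentences:
        return ""
    return ". ".join(sentences[:max_sentences]) + "."


translate_and_summarize = chain(translate, summarize)
print(translate_and_summarize("One. Two. Three. Four."))
```

The chained service has the same signature as its parts, so it could be exposed through a linking rule like any other service.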
To Integrate:
(1) Schema mapper: parses screens to identify elements
(2) provide metadata/structural relationship rules
(3) identify glossaries for content relationships
[Architecture diagram. Components shown: User’s Web Browser; Metainformation Engine (ME Desktop, ME Broker, ME Relationship Engine, ME Lexical Analysis); wrappers paired with systems: AskNSDL Wrapper and AskNSDL, AVC Wrapper and AVC, NSSDC Wrapper and National Space Science Data Center, CI Search Service Wrapper and NSDL CI Search Service, Service Wrapper (i) and Service (i). Legend: existing system or Web service; uses Java, XML, XPath, etc.]

Benefits of Integration for a system (collection/service)
- Users: direct access to related systems
  - enlarges a system’s feature set
- Links lead users to a system
  - systems gain wider use
- Users become aware of other systems
  - systems gain wider awareness
- Direct access to a system’s features
  - streamlined access (bypassing menus)