Digital Library Service Integration (DLSI)
--> Looking for Collections and Services to be DLSI Testbeds <--

Michael Bieber, Il Im, Yi-Fang Wu
Xin Chen, Dong-ho Kim, Nkechi Nnadi, Prateek Shrivastava
Information Systems Department
College of Computing Sciences
New Jersey Institute of Technology

Lexical Analysis

Two Kinds of Links
(1) Structural links based on object type
(2) Links based on lexical analysis

Why Integrate with DLSI?
- Users gain direct access to related systems: enlarges your system’s feature set
- DLSI leads users to your system: your system gains wider use
- Users become aware of other systems: your system gains wider awareness
- Direct access to your system’s features: adds streamlined access

Structural Links: Based on Object Type
Example: document
- link to author information
- link to all locations for this document
- link to peer reviewing for this document
Example: concept
- link to definition
- link to related concepts
Example: every object
- link to discussions about this object
- link to comments about this object
- link to service for starting a discussion, comment, etc.

Issue: Generalizing Services
Looking for services to generalize and share among collections!
Example: peer review
- originally designed for 3 reviewers and anonymous
- how to generalize for another collection wanting 5 reviewers and not anonymous
Collaboration Opportunity

DLSI Integration Architecture
Services and collections integrate with minimal or no changes. They also continue to operate independently of DLSI.

To Integrate a Collection or Service with DLSI:
- Write a wrapper for the collection/service
- Initiate communications between collection/service & the wrapper
- Define relationship rules for generating links
Collaboration Opportunity: We’ll help you do this!

Dashed paths indicate that once integrated, collections and services can share features through DLSI links automatically.
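The structural links described above depend only on an object's type, so the relationship rules could be pictured as a lookup table. The following is a minimal sketch under that assumption; `RULES`, the `"*"` wildcard for "every object", and `generate_links` are hypothetical names for illustration, not part of DLSI itself.

```python
# Hypothetical relationship rules: object type -> link targets.
# The "*" entry stands for rules that apply to every object.
RULES = {
    "document": [
        "author information",
        "all locations for this document",
        "peer reviewing for this document",
    ],
    "concept": ["definition", "related concepts"],
    "*": [
        "discussions about this object",
        "comments about this object",
        "service for starting a discussion, comment, etc.",
    ],
}


def generate_links(object_type, object_id):
    """Return link descriptions for one object, based only on its type."""
    targets = RULES.get(object_type, []) + RULES["*"]
    return [f"{object_id}: link to {target}" for target in targets]


if __name__ == "__main__":
    for link in generate_links("concept", "Plant Pathology"):
        print(link)
```

In this sketch a collection would only add entries to the table for its own object types; the "every object" rules then apply without further work.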
[Figure labels: DLSI Core; Search & Discovery Service]

DLSI Integration: What Users See
DLSI automatically generates links to related collections and services.
[Screenshot captions: Links generated to the concept “Plant Pathology”; Links generated to the document as a whole]

Our Concept Hierarchy Developer uses indexed noun phrases and their co-occurrences in the text to develop document-set-dependent concept hierarchies for faster browsing and navigation.

Purposes:
1. To identify concepts that are not recognized by structural analysis
2. To organize concepts and link them to relevant text for passage retrieval

Lexical Analysis and Concept Extraction
Noun Phrase Extractor parses documents to find noun phrases by using syntactic rules and the WordNet lexical database.
[Figure label: Returned Documents]

Concept Organization and Linking
Upon selecting a term of interest, a user first sees relevant paragraphs. This saves the user’s time by filtering out irrelevant parts of a long document. For more information, the full text is also available.
[Screenshot captions: Relevant Paragraphs display mode; Fulltext display mode]

Collaborative Filtering: Customizing the Set of Generated Links

Purposes:
1. To present links most relevant to the current user’s task
2. To reduce information overload by reducing the number of links presented

Collaborative Filtering
[Figure labels: Evaluation Acquisition Engine; Collaborative Filtering Engine; Evaluation Database]

“Computerized Word-of-Mouth”: Finds people with similar tastes/interests and utilizes their evaluations to estimate the likelihood that the current user would like an item.
1. Calculate the degree of similarity between the current user and other users.
2. Identify a group of people (the reference group) who share common interests with the current user.
3. Calculate estimated evaluations for items that the current user has not seen (or evaluated). An estimated evaluation predicts the current user’s evaluation of an item.
4. Rank the items according to the estimated evaluations and select the top n items to recommend to the current user.
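The four collaborative filtering steps above can be sketched as follows. The poster does not say which similarity measure or estimation formula is used, so this sketch assumes Pearson correlation over commonly evaluated items and a similarity-weighted average; the function names, `group_size`, and `top_n` are illustrative assumptions only.

```python
from math import sqrt


def similarity(a, b):
    """Pearson correlation over items both users evaluated (assumed measure)."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def recommend(current, evaluations, group_size=2, top_n=3):
    """Return up to top_n (item, estimated evaluation) pairs for `current`."""
    mine = evaluations[current]
    # Step 1: similarity between the current user and every other user.
    sims = {u: similarity(mine, ev)
            for u, ev in evaluations.items() if u != current}
    # Step 2: reference group = most similar users with positive similarity.
    group = sorted((u for u in sims if sims[u] > 0),
                   key=lambda u: sims[u], reverse=True)[:group_size]
    # Step 3: estimate evaluations for items the current user has not seen.
    totals, weights = {}, {}
    for u in group:
        for item, score in evaluations[u].items():
            if item not in mine:
                totals[item] = totals.get(item, 0.0) + sims[u] * score
                weights[item] = weights.get(item, 0.0) + sims[u]
    estimates = {item: totals[item] / weights[item] for item in totals}
    # Step 4: rank by estimated evaluation and keep the top n.
    ranked = sorted(estimates.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]
```

Here `evaluations` is assumed to be a dictionary mapping each user to a dictionary of item scores, for example `{"ann": {"link1": 5, "link2": 1}}`.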
Collaborative Filtering in DL Service Integration
Three types of data to be used as users’ evaluations:
- Direct evaluation
- Clickstream (a sequence of mouse clicks)
- Time spent on each link
Multiple needs (multiple contexts) will also be supported.

Collaborative Filtering Architecture in DL
[Figure labels: DLSI Integration Manager; Sorted list; Unsorted list; Recommendation request; Clickstream; Evaluations; Time information; Inferred evaluation; Digital Libraries]
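One way to picture how the three kinds of evaluation data might be combined into a single inferred evaluation is sketched below. The poster does not describe the actual inference method, so everything here is an assumption for illustration: the function name, the 0 to 5 scale, the equal weighting of clicks and time, and the saturation thresholds are all hypothetical.

```python
def infer_evaluation(direct=None, clicks=0, seconds=0.0,
                     max_score=5.0, full_interest_seconds=60.0,
                     full_interest_clicks=3):
    """Return an evaluation for one link, or None if there is no evidence.

    A direct evaluation, when present, is used as-is. Otherwise the score
    is inferred from click count and time spent (hypothetical weighting).
    """
    if direct is not None:
        return float(direct)
    if clicks == 0:
        return None  # the user never followed this link
    click_signal = min(clicks / full_interest_clicks, 1.0)
    time_signal = min(seconds / full_interest_seconds, 1.0)
    return round(max_score * (0.5 * click_signal + 0.5 * time_signal), 2)


if __name__ == "__main__":
    print(infer_evaluation(direct=4))
    print(infer_evaluation(clicks=1, seconds=30))
```

Giving direct evaluations precedence is a design choice in this sketch: an explicit rating is treated as more reliable than behavior inferred from clicks and time.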