AgentMatcher Search in Weighted, Tree-Structured Learning Object Metadata H. Boley, V.C. Bhavsar, D. Hirtle, A. Singh, Z. Sun and L. Yang National Research.

Presentation transcript:

AgentMatcher Search in Weighted, Tree-Structured Learning Object Metadata
H. Boley, V.C. Bhavsar, D. Hirtle, A. Singh, Z. Sun and L. Yang
National Research Council and University of New Brunswick
Learning Objects Summit, Fredericton, NB, Canada, March 29-30, 2004

1 Outline
- AgentMatcher Overview
- User Interface
- Translator
- Similarity Engine
- LOM Generator
- Conclusion

2 AgentMatcher Overview
- AgentMatcher was developed for match-making between buyer and seller agents
- Here it is instantiated to learners searching for procurable learning objects (LOs) described by Learning Object Metadata (LOM)

3 Architecture
[Figure: architecture diagram, divided into Indexing Components and Retrieval Components.]
- Components: UI (Java), Prefilter (SQL), Translator (XSLT), Similarity Engine (Java), LOMGen (Java), CANLOM (XML), LOR (HTML), DATABASE (Access) with Keyword Table
- Actors: End user, Administrator
- Dataflow labels: user input, Administrator input, prefilter parameters (Query URI), HTML files, partial CanCore files, CanCore files, prefiltered CanCore files, WOO RuleML file(s), Search Results, Recommended results
- Step labels: (1) to (8) and (a) to (c)
- Legend: components developed by UNB/NRC; components developed by TeleEd or a third party; dataflow realized by UNB; dataflow realized by TeleEd; dataflow between TeleEd and UNB/NRC

4 AgentMatcher Components
- User Interface
  - Generates prefilter parameters (Query URIs)
  - Generates query trees
  - Communicates between internal and external components
- Translator
  - Transforms LOM from Prefilter to WOO RuleML

5 AgentMatcher Components (Cont'd)
- Similarity Engine
  - Computes the similarity between query and LOM trees
  - Presents ranked search results as HTML table
- LOMGen (Learning Object Metadata Generator)
  - Semi-automatically extracts information from LOs (in HTML)
  - Constructs keyword/key phrase database
  - Posts to CANLOM Metadata Repository

6 User Interface (ZhongWei Sun): Overview
- Interacts between user and the advanced search components
[Figure: architecture diagram repeated from slide 3.]

7 User Interface: Actions
- Retrieves keywords from LOMGen database
- Allows convenient input of tree-structured features as well as corresponding weights for them
- Generates a Weighted Object-Oriented RuleML document
- Sends prefiltering query to, and retrieves LOM response stream from, KnowledgeAgora
- Splits the returned response LOM document
- Invokes Translator and Similarity Engine

8 Overview of User Interface
- Features within the same CanCore schema category are grouped within one box
- "Don't care" is shown in green to tell the user it will match arbitrary values
- Default values other than "Don't care" are shown in red to remind the user that they may need to be changed
- The relative importance of each feature is entered via sliders from 0.0 to 1.0; the weights within a group sum to 1.0

9 User Interaction
- User inputs tree-structured features and assigns a weight to each of them
- This weight indicates the feature's importance to the user relative to other features within the same category
- Weights can be entered with sliders; when one slider moves, all other weights of that group self-adjust to keep their sum at 1.0
- User inputs a threshold as a cut-off line for the final result recommendation
- Learning objects with similarities above the threshold are recommended
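The self-adjusting sliders described above can be sketched as follows. This is a minimal illustration, not the UI's actual Java code: the function name `rebalance` is hypothetical, and the choice to rescale the other weights proportionally (rather than, say, equally) is an assumption, since the slide only states that the sum stays at 1.0.

```python
def rebalance(weights, changed, new_value):
    """Set weights[changed] to new_value and rescale the other weights
    of the group so that the whole group still sums to 1.0.

    Assumption: the other sliders keep their proportions to one another.
    """
    if not 0.0 <= new_value <= 1.0:
        raise ValueError("slider value must lie in [0.0, 1.0]")
    others = [name for name in weights if name != changed]
    remaining = 1.0 - new_value
    old_total = sum(weights[name] for name in others)
    result = {changed: new_value}
    for name in others:
        if old_total > 0:
            # Preserve the relative importance of the untouched features.
            result[name] = remaining * weights[name] / old_total
        else:
            # All other sliders were at 0.0: share the remainder equally.
            result[name] = remaining / len(others)
    return result


group = {"title": 0.5, "language": 0.25, "keyword": 0.25}
print(rebalance(group, "keyword", 0.5))
```

Moving the `keyword` slider to 0.5 leaves 0.5 to be shared by `title` and `language` in their previous 2:1 ratio.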

10 User Interface (Screenshot 1)
- Callout: cannot be changed, hence grayed out
- Callout: "database" (not visible here) and "Oracle" are chosen as keywords

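11 Corresponding Query Tree (1)
[Figure: query tree rooted at lom with branches general and classification. The general-set subtree carries title ("Introduction to Oracle"), language ("en"), and keyword; the keyword-set holds "database" and "oracle" among other entries. The classification-set subtree carries taxonpath, whose taxon-set holds taxon with value "Applied Sciences, Technology". Weights visible in the transcript: 1.0 on the taxonpath and taxon arcs, 0.5 on keyword, database, and oracle. Annotation: lexicographically ordered subtrees.]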

12 User Interface (Screenshot 2)
- Callout: kept empty since not standardized and not important to students
- Callout: non-CanCore element, hence not filled out

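13 Corresponding Query Tree (2)
[Figure: query tree rooted at lom with branches lifeCycle and educational. The lifeCycle-set subtree carries contribute, whose contribute-set holds entity. The educational-set subtree carries language, learningResourceType, context, intendedEndUserRole, and typicalAgeRange, with leaf values "Course" and "Learner". Weights visible in the transcript: learningResourceType 0.0, context 0.1, intendedEndUserRole 0.3, typicalAgeRange 0.0, plus 0.6 and 1.0 on other arcs.]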

14 User Interface (Screenshot 3) Threshold

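15 Corresponding Query Tree (3)
[Figure: query tree rooted at lom with branches technical and rights. The technical-set subtree carries format ("HTML") and otherPlatformRequirements. The rights-set subtree carries copyrightAndOtherRestrictions ("Yes") and descriptions. Weights visible in the transcript: 0.8 and 0.2.]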

16 Overall Query Tree (from User Interface)
[Figure: the three partial query trees combined under the lom root, with branches classification, general, lifeCycle, educational, technical, and rights. Leaf values include "Applied Sciences, Technology", "en", "Introduction To Oracle", "database", "oracle", "Yes", and "HTML". Weights visible in the transcript include 0.5 on the keyword entries, 0.0, 0.1, and 0.3 in the educational-set, and 0.6 and 0.4 near the technical-set and rights-set. Annotation: different weights.]

17 User Interface Outputs
- Query URI sent to Prefilter:
  uestTimeOut=300&query=database+oracle&selectedLanguageID=1&selectedgranID=1&maxPrice=&UpperCategory=184&publisherQuery=&selectedEduContext=&selectedCopyRightValue=Yes&selectedSoftwareRequirements=&selectedTechFormat=HTML
  - Callouts on the URI: encodes "en"; encodes "course"; encodes "Applied Science, Technology"
- XML document sent to Similarity Engine
  - The XML document contains both tree structure and branch weights
- Threshold sent to Similarity Engine (0.7 is sent in this example)
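As a rough sketch of how such a Query URI could be assembled from the user's selections, the snippet below uses Python's standard `urllib.parse.urlencode`. The parameter names are taken from the slide; the base URL `http://prefilter.example/search` and the function `build_query_uri` are hypothetical placeholders, and the parameter whose name is truncated in the transcript is deliberately left out.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real Prefilter address is not given on the slide.
BASE_URL = "http://prefilter.example/search"


def build_query_uri(keywords, params):
    """Join the chosen keywords into one 'query' parameter and append the
    remaining prefilter parameters. urlencode turns spaces into '+', which
    yields 'query=database+oracle' as on the slide."""
    query = {"query": " ".join(keywords)}
    query.update(params)  # empty strings stay in the URI as 'name='
    return BASE_URL + "?" + urlencode(query)


uri = build_query_uri(
    ["database", "oracle"],
    {
        "selectedLanguageID": 1,
        "selectedgranID": 1,
        "maxPrice": "",
        "UpperCategory": 184,
        "selectedCopyRightValue": "Yes",
        "selectedTechFormat": "HTML",
    },
)
print(uri)
```

Unset features are sent as empty values, which matches the `maxPrice=` and `publisherQuery=` entries visible in the slide's URI.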

18 User Interface Summary
- Enables convenient interaction between user and the internal components
- Communicates between the internal components and the external KnowledgeAgora website
- Invokes both internal and external components

19 Translator (David Hirtle): Overview
[Figure: architecture diagram repeated from slide 3.]

20 Translator Overview
- Input: CANLOM XML (from Prefilter)
- Output: Weighted Object-Oriented RuleML (to Similarity Engine)
- Tool: Extensible Stylesheet Language Transformations (XSLT)
  - Language for transforming XML to XML (W3C Rec)
[Figure: pipeline from Prefilter to Similarity Engine. The input file (CANLOM XML) and the XSLT stylesheet feed the XSLT processor, which produces the output file (WOO RuleML).]

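21 Sample Translation
[Figure: side-by-side markup for the same record in CANLOM XML and in WOO RuleML. Only fragments survive in the transcript: the lom and general_set elements and the title text "Introduction to Databases" in both versions.]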

22 Translated Weighted LOM Tree
[Figure: LOM tree rooted at lom with branches classification, general, lifeCycle, educational, technical, and rights. Leaf values include "Applied Sciences, Technology", "en", "Introduction to Databases", "database", "database administration", "Yes", and "HTML". Sibling arcs carry equal weights (for example 0.33 in the keyword-set, 0.2 in the educational-set, 0.5 in the technical-set and rights-set). Annotations: same weights; different weights possible in LOM Tree, like in Query Tree.]

23 Translator Summary
- Uses XSLT engine embedded in main program
- Transforms CANLOM XML returned from Prefilter into WOO RuleML
- Resulting WOO RuleML then used by Similarity Engine
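The Translator itself is an XSLT stylesheet, which the transcript does not reproduce. Purely to illustrate the shape of the result (a tree whose sibling arcs are in lexicographic order and carry equal weights, as the "same weights" note on slide 22 suggests), here is a small Python sketch using `xml.etree.ElementTree`. The input element names and the tuple representation are assumptions for this example; it does not emit real WOO RuleML.

```python
import xml.etree.ElementTree as ET


def to_weighted_tree(element):
    """Turn an XML element into (label, arcs), where arcs is a list of
    (arc_label, weight, subtree) sorted by arc label.

    Assumption: every sibling arc gets the same weight 1/n, so the
    weights at each level add up to 1.0.
    """
    children = sorted(element, key=lambda child: child.tag)
    if not children:
        # A leaf node is labelled with the element's text value.
        return ((element.text or "").strip(), [])
    weight = 1.0 / len(children)
    arcs = [(child.tag, weight, to_weighted_tree(child)) for child in children]
    return (element.tag, arcs)


# Hypothetical, heavily simplified stand-in for a CANLOM record.
SAMPLE = """
<lom>
  <general>
    <title>Introduction to Databases</title>
    <language>en</language>
  </general>
  <technical>
    <format>HTML</format>
  </technical>
</lom>
"""

print(to_weighted_tree(ET.fromstring(SAMPLE)))
```

A real translation would also have to handle repeated elements such as several keywords, namespaces, and the RuleML serialization, all of which this sketch leaves out.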

24 Similarity Engine (Lu Yang): Overview
- Computes similarity between query and LOM trees
- Displays ranked search results in a browser
[Figure: architecture diagram repeated from slide 3.]

25 Similarity Engine Inputs
- Query from User Interface: Query
- Prefiltered LOMs: LOM 1, LOM 2, ..., LOM n

26 Similarity Engine: Tree Similarity Algorithm
- Tree representations
  - Arcs at the same level of a (sub)tree are in lexicographic order, for linear left-to-right traversal
  - Arc weights at the same level of a (sub)tree add up to 1.0
  - No limit on tree breadth and depth
  - "Don't care" nodes allowed
- Tree similarity algorithm [1,2]
  - Similarity values fall into [0.0,1.0]
  - Top-down tree traversal
  - Bottom-up value propagation
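To make the properties above concrete, here is a simplified recursive sketch of a weighted-tree similarity. It is not the algorithm of [1]: the slide gives no formulas, so the combination rule (child similarity multiplied by the average of the two arc weights), the zero score for arcs missing on one side, and the helper names `tree` and `similarity` are all assumptions made for illustration.

```python
DONT_CARE = "Don't care"


def tree(label, **arcs):
    """Node with a label and named arcs; each arc maps to (weight, subtree)."""
    return {"label": label, "arcs": arcs}


def similarity(query, lom):
    """Top-down traversal with bottom-up propagation of values in [0.0, 1.0].

    Assumptions: a "Don't care" node matches anything; differing node labels
    score 0.0; an arc present in only one tree contributes nothing; matched
    arcs contribute child_similarity * average of the two arc weights.
    """
    if DONT_CARE in (query["label"], lom["label"]):
        return 1.0
    if query["label"] != lom["label"]:
        return 0.0
    q_arcs, l_arcs = query["arcs"], lom["arcs"]
    if not q_arcs and not l_arcs:
        return 1.0  # two identical leaves
    total = 0.0
    # Visit the shared arcs in lexicographic order, left to right.
    for name in sorted(set(q_arcs) & set(l_arcs)):
        q_weight, q_sub = q_arcs[name]
        l_weight, l_sub = l_arcs[name]
        total += similarity(q_sub, l_sub) * (q_weight + l_weight) / 2.0
    return total


query = tree("lom",
             classification=(0.7, tree("classification-set")),
             general=(0.3, tree("general-set",
                                language=(0.5, tree("en")),
                                title=(0.5, tree("Introduction to Oracle")))))
lom = tree("lom",
           classification=(0.5, tree("classification-set")),
           general=(0.5, tree("general-set",
                              language=(0.5, tree("en")),
                              title=(0.5, tree("Introduction to Databases")))))
print(similarity(query, lom))
```

Because the weights at each level add up to 1.0 in both trees, the averaged weights also add up to 1.0, which keeps every intermediate value inside [0.0, 1.0].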

27 Similarity Engine: Similarity Computation (example)
[Figure: two trees compared side by side, a query tree (user.xml) and a LOM tree (WOORuleML2.xml). Both are rooted at lom with branches general and classification, title "Introduction to Oracle", taxon "Applied Sciences, Technology", and keyword "oracle"; one of the trees also shows language "en". Weights visible in the transcript include 0.3, 0.1, 0.5, and 1.0.]

28 Similarity Engine Results (low threshold)

29 Similarity Engine Results (high threshold)

30 Similarity Engine Summary
- The core of Similarity Engine:
  - Tree similarity algorithm
  - Threshold for cut-off
  - Ranking and HTML output of LOMs and LOs
- Functionality of Similarity Engine:
  - Similarity computation
  - Ranked result display
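The threshold cut-off and ranking step can be sketched in a few lines. The function name `recommend` and the sample scores are hypothetical; the strict comparison follows the earlier statement that learning objects with similarities above the threshold are recommended.

```python
def recommend(scores, threshold):
    """Rank learning objects by similarity (highest first) and keep
    only those whose similarity is strictly above the threshold."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked if score > threshold]


# Made-up similarity values for four prefiltered LOMs.
scores = {"LOM 1": 0.92, "LOM 2": 0.55, "LOM 3": 0.7, "LOM 4": 0.81}
for name, score in recommend(scores, threshold=0.7):
    print(f"{name}: {score:.2f}")
```

With the threshold of 0.7 used in the slide deck's example, a learning object scoring exactly 0.7 is not recommended under this reading.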

31 Similarity Engine Reference [1] Bhavsar, V.C., H. Boley, L. Yang, “A Weighted-Tree Similarity Algorithm for Multi-Agent Systems in E-Business Environments”, In Proceedings of 2003 Workshop on Business Agents and the Semantic Web, Halifax, June 14, 2003, National Research Council of Canada, Institute for Information Technology, Fredericton, pp , To appear in Computational Intelligence, November 2004.

32 LOM Generation (Anurag Singh): LOMGen Overview
[Figure: architecture diagram repeated from slide 3.]

33 LOMGen Functionality
- Extracts keywords and keyphrases from an HTML document
- Derives a dictionary of "surface terms" and "deep terms" from Free Online Dictionary of Computing (FOLDOC)
- Provides a GUI for metadata administrator
- Makes up-to-date list of keyphrases and keywords available to the AgentMatcher UI
- Makes similarity computation more accurate
- Automates metadata (XML) posting to LOM repository

34 LOMGen Administrator GUI
- The keywords and keyphrases extracted from the HTML file are presented to the metadata administrator
- The administrator can select terms and add any further terms they consider important
- The administrator is also prompted to select a category for the LO

35 LOMGen Summary
- LOs analyzed for LOM content:
  - Meta tags title, description, keywords of HTML head turned out to be directly usable for dictionary
  - Shallow parsing of natural language HTML body further processed interactively with the user
- Dual use of LOMGen:
  - AgentMatcher UI keywords
  - KnowledgeAgora metadata repository
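As an illustration of the first analysis step, reading title, description, and keywords from the HTML head, the sketch below uses Python's standard `html.parser`. LOMGen itself is a Java component whose code is not shown in the slides, so the class name `HeadMetadataParser`, the function `extract_head_metadata`, and the comma-separated keyword format are assumptions.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collects the <title> text and the description/keywords meta tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {"title": "", "description": "", "keywords": []}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            content = attr.get("content") or ""
            if name == "description":
                self.metadata["description"] = content.strip()
            elif name == "keywords":
                # Assumption: keywords are separated by commas.
                self.metadata["keywords"] = [
                    k.strip() for k in content.split(",") if k.strip()
                ]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] += data


def extract_head_metadata(html):
    parser = HeadMetadataParser()
    parser.feed(html)
    parser.close()
    parser.metadata["title"] = parser.metadata["title"].strip()
    return parser.metadata


page = """<html><head><title>Introduction to Oracle</title>
<meta name="keywords" content="database, oracle">
<meta name="description" content="A short course."></head>
<body><p>Body text.</p></body></html>"""
print(extract_head_metadata(page))
```

The extracted terms would then be candidates for the keyword table, with the administrator confirming or extending them as described on the Administrator GUI slide.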

36 AgentMatcher Conclusion
- UI to input LO search features structured as LOM-like trees with relative branch weights
- LOs prefiltered from CANLOM repository, then translated for use by Similarity Engine
- Similarity Engine computes similarity between query and prefiltered results using weighted tree similarity algorithm
- LOs ranked and displayed to user in HTML
- LOMGen generates metadata from a Learning Object Repository and posts them to the CANLOM database
- Demos next, as time permits