Published by Gwenda Allen. Modified over 9 years ago.
1
1 Weighted Partonomy-Taxonomy Trees with Local Similarity Measures for Semantic Buyer-Seller Match-Making
Lu Yang, Marcel Ball, Virendra C. Bhavsar and Harold Boley
BASeWEB, May 8, 2005
2
2 Outline
Introduction
Motivation
Partonomy Similarity Algorithm
– Tree representation
– Tree simplicity
– Partonomy similarity
– Experimental Results
Node Label Similarity
– Inner-node similarity
– Leaf-node similarity
Conclusion
3
3 Introduction
Buyer-Seller matching in e-business, e-learning
[Figure: A multi-agent system. Components shown: User, Web Browser, Main Server, User Info, User Profiles, User Agents, Agents, Cafe-1 … Cafe-n, Matcher 1 … Matcher n, and links to other sites (network).]
4
4 Introduction
An e-Learning scenario
[Figure: a Cafe with a Matcher, connecting Learner 1, Learner 2, … Learner n with Course Provider 1, Course Provider 2, … Course Provider m.]
H. Boley, V. C. Bhavsar, D. Hirtle, A. Singh, Z. Sun and L. Yang, A match-making system for learners and learning objects. Learning & Leading with Technology, International Society for Technology in Education, Eugene, OR, 2005 (to appear).
5
5 Motivation
Metadata for buyers and sellers
– Keywords/keyphrases
– Trees
Tree similarity
6
6 Tree representation
Characteristics of our trees
– Node-labeled, arc-labeled and arc-weighted
– Arcs are labeled in lexicographical order
– Weights sum to 1
[Figure: tree rooted at Car with arcs Make (Ford), Model (Explorer) and Year (2002); arc weights shown: 0.3, 0.2, 0.5.]
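The three characteristics listed on this slide can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the names `Tree`, `sorted_arcs` and `weights_sum_to_one` are hypothetical, and the pairing of the weights 0.3, 0.2 and 0.5 with the arcs Make, Model and Year is an assumption, since the slide text does not preserve which weight sits on which arc.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Tree:
    """Node-labeled tree whose outgoing arcs carry a label and a weight."""
    label: str
    arcs: dict = field(default_factory=dict)  # arc label -> (weight, subtree)

    def sorted_arcs(self):
        """Arcs as (arc_label, weight, subtree), in lexicographic label order."""
        return [(name, *self.arcs[name]) for name in sorted(self.arcs)]


def weights_sum_to_one(tree, tol=1e-9):
    """True if, at every inner node, the arc weights add up to 1."""
    if not tree.arcs:
        return True  # a leaf has no arcs to check
    total = sum(weight for _, weight, _ in tree.sorted_arcs())
    return math.isclose(total, 1.0, abs_tol=tol) and all(
        weights_sum_to_one(sub, tol) for _, _, sub in tree.sorted_arcs()
    )


# Car example; the weight-to-arc pairing below is assumed, not taken from the slide.
car = Tree("Car", {
    "Year": (0.5, Tree("2002")),
    "Make": (0.3, Tree("Ford")),
    "Model": (0.2, Tree("Explorer")),
})
```

Storing arcs in a dictionary and sorting on access keeps insertion order irrelevant, so two trees built in different orders still present their arcs in the same lexicographic sequence.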
7
7 Tree representation – Serialization of trees
– XML attributes for arc weights and subelements for arc labels
– Weighted Object-Oriented RuleML
[Figure: Tree serialization in WOO RuleML of the Car tree (Make: Ford, Model: Explorer, Year: 2002).]
8
8 Tree simplicity
– The deeper the leaf node, the less its contribution to the tree simplicity
  Depth degradation index (0.9)
  Depth degradation factor (0.5)
– Reciprocal of tree breadth
[Figure: example tree with node labels A to G, arc labels a to f and arc weights 0.1, 0.9, 0.8, 0.2, 0.7, 0.3; annotated values (0.9), (0.45), (0.225); tree simplicity: 0.0563.]
L. Yang, B. Sarker, V.C. Bhavsar and H. Boley, A weighted-tree simplicity algorithm for similarity matching of partial product descriptions (submitted for publication).
9
9 Tree simplicity – Computation
$$
\check{S}(T) =
\begin{cases}
D_I \cdot (D_F)^{d} & \text{if } T \text{ is a leaf node,} \\
\dfrac{1}{m} \sum_{j=1}^{m} w_j \, \check{S}(T_j) & \text{otherwise.}
\end{cases}
$$
Š(T): the simplicity value of a single tree T
D_I and D_F: depth degradation index and depth degradation factor
d: depth of a leaf node
m: root node degree of tree T that is not a leaf
w_j: arc weight of the j-th arc below the root node of tree T
T_j: subtree below the j-th arc with arc weight w_j
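A short sketch may help make the recursion concrete. It assumes the definition reads: a leaf at depth d contributes D_I · (D_F)^d, and an inner node with m arcs contributes (1/m) · Σ w_j · Š(T_j). The names `Node` and `simplicity` are hypothetical, and the example tree (a full binary tree of depth 2 built from the node labels and weight pairs of the previous slide) is an assumed shape, because the figure's structure is not preserved in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    arcs: list = field(default_factory=list)  # list of (arc_label, weight, Node)


def simplicity(tree, d_i=0.9, d_f=0.5, depth=0):
    """Simplicity of `tree`, with depth degradation index d_i and factor d_f."""
    if not tree.arcs:
        # Leaf: the deeper the leaf, the smaller its contribution.
        return d_i * d_f ** depth
    m = len(tree.arcs)  # breadth at this node; its reciprocal scales the sum
    weighted = sum(w * simplicity(sub, d_i, d_f, depth + 1)
                   for _, w, sub in tree.arcs)
    return weighted / m


# Assumed shape: root A with two inner children, each with two leaves.
example = Node("A", [
    ("a", 0.1, Node("B", [("c", 0.8, Node("D")), ("d", 0.2, Node("E"))])),
    ("b", 0.9, Node("C", [("e", 0.7, Node("F")), ("f", 0.3, Node("G"))])),
])

print(simplicity(Node("leaf")))  # a single-node tree is as simple as it gets
print(simplicity(example))
```

Under these assumptions every leaf sits at depth 2 and contributes 0.9 · 0.5² = 0.225, so the result is about 0.05625 whatever the individual weights are, which is consistent with the 0.0563 reported on the previous slide.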
10
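10 Partonomy similarity – Simple trees
[Figure: trees t and t′, both rooted at Car (alternatively House) with arcs Make and Model and arc weights 0.3 and 0.7; both have Make Ford, while the Model leaves differ (Mustang vs. Escape). Scales from 0 to 1 are shown for inner nodes and for leaf nodes.]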
11
11 Partonomy similarity – Complex trees
$\sum_i s_i \,(w_i + w'_i)/2$
$\sum_i A(s_i)\,(w_i + w'_i)/2$, with $A(s_i) \ge s_i$
[Figure: two lom trees t and t′. Tree t has arcs educational (edu-set), general (gen-set) and technical (tec-set), with leaves language en, title Introduction to Oracle, format HTML, platform WinXP; arc weights shown: 0.5, 0.5, 0.3334, 0.3333. Tree t′ has arcs general (gen-set) and technical (tec-set), with leaves language en, title Basic Oracle, format *, platform WinXP; arc weights shown: 0.1, 0.9, 0.8, 0.2, 0.7, 0.3.]
* : Don’t Care
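The weighted sum on this slide can be illustrated with a deliberately simplified recursion. This sketch is not the published algorithm: it assumes the sum runs over arcs whose labels occur in both trees, uses the square root as one possible adjustment function A (it satisfies A(s) ≥ s on [0, 1]), compares node labels by exact match, and ignores the parameters N and i as well as any contribution from arcs present in only one tree. The names `Tree` and `similarity` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Tree:
    label: str
    arcs: dict = field(default_factory=dict)  # arc label -> (weight, subtree)


def similarity(t, u, adjust=math.sqrt):
    """Simplified sketch: sum of A(s_i) * (w_i + w'_i) / 2 over shared arcs."""
    if t.label != u.label:
        return 0.0  # exact label matching only (assumption)
    if not t.arcs and not u.arcs:
        return 1.0  # two identical leaves
    total = 0.0
    for arc in sorted(t.arcs.keys() & u.arcs.keys()):
        w, sub_t = t.arcs[arc]
        w_prime, sub_u = u.arcs[arc]
        s = similarity(sub_t, sub_u, adjust)  # descend into same-labeled arcs
        total += adjust(s) * (w + w_prime) / 2
    return total


# Two Car trees that agree on Make and differ on Model.
# The pairing of 0.3 with Make and 0.7 with Model is assumed.
t = Tree("Car", {"Make": (0.3, Tree("Ford")), "Model": (0.7, Tree("Mustang"))})
t_prime = Tree("Car", {"Make": (0.3, Tree("Ford")), "Model": (0.7, Tree("Escape"))})

print(similarity(t, t_prime))  # only the Make arc contributes
print(similarity(t, t))        # identical trees
```

Averaging the two weights per arc lets each side express how much it cares about that arc, so a mismatch on a heavily weighted arc costs more than one on a lightly weighted arc.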
12
12 Partonomy similarity – Main functions
Three main functions (Relfun)
– Treesim(t,t′): Recursively compares any (unordered) pair of trees; parameters N and i
– Treemap(l,l′): Recursively maps two lists, l and l′, of labeled and weighted arcs: descends into identically labeled subtrees
– Treeplicity(i,t): Decreases the similarity with decreasing simplicity
V. C. Bhavsar, H. Boley and L. Yang, A weighted-tree similarity algorithm for multi-agent systems in e-business environments. Computational Intelligence, 2004, 20(4):584-602.
13
13 Similarity of simple trees
14
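14 Similarity of simple trees (Cont’d)
Experiment 3: trees and results
– t1: auto with arcs make (ford), model (mustang), year (2000); arc weights shown: 0.1, 0.45, 0.45. t2: auto with arc model (explorer); arc weight shown: 1.0. Result: 0.2823
– t3: auto with arcs make (ford), model (mustang), year (2000); arc weights shown: 0.9, 0.05, 0.05. t4: auto with arc model (explorer); arc weight shown: 1.0. Result: 0.1203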
15
15 Similarity of identical tree structures
Experiment 4: trees and results
– t1: auto with arcs make (ford), model (explorer), year (2002). t2: auto with arcs make (ford), model (explorer), year (1999). Arc weights shown: 0.2, 0.3, 0.5. Result: 0.55
– t3: auto with arcs make (ford), model (explorer), year (2002). t4: auto with arcs make (ford), model (explorer), year (1999). Arc weights shown: 0.3333, 0.3334. Result: 0.7000
16
16 Similarity of complex trees
[Figure: two complex trees t and t′ with node labels A, B, C, D, E, F, B1 to B4, C1, C3, C4, D1 and arc labels b, c, d, b1 to b4, c1 to c4, d1; arc weights shown include 0.25, 0.3333, 0.3334, 0.5 and 1.0. Similarity values shown: 0.8160, 0.9316, 0.8996, 0.9230, 0.9647, 0.9793.]
17
17 Similarity of complex trees (Cont’d)
[Figure: the same pair of complex trees t and t′, with the label E appearing twice. Similarity values shown: 0.8555, 0.9626, 0.9314, 0.9499, 0.9824, 0.9902.]
18
18 Similarity of complex trees (Cont’d)
[Figure: the same pair of complex trees t and t′, with one node label replaced by *. Similarity values shown: 0.9134, 0.9697, 0.9530, 0.9641, 0.9844, 0.9910.]
19
19 Node label similarity
For inner nodes and leaf nodes
– Exact string matching: binary result, 0.0 or 1.0
– Permutation of strings: “Java Programming” vs. “Programming in Java”
  similarity = (number of identical words) / (maximum length of the two strings, in words)
Example: for two node labels “a b c” and “a b d e”, their similarity is 2/4 = 0.5
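The string-permutation measure can be sketched in a few lines. The function name `permutation_similarity` is hypothetical; lowercasing the labels, splitting on whitespace, counting repeated words with multiplicity, and returning 0.0 for an empty label are assumptions that the slide does not specify.

```python
from collections import Counter


def permutation_similarity(label_a, label_b):
    """Number of identical words divided by the longer label's word count."""
    words_a = label_a.lower().split()
    words_b = label_b.lower().split()
    if not words_a or not words_b:
        return 0.0  # assumed convention for empty labels
    # Multiset intersection counts each shared word at most as often
    # as it occurs in both labels.
    shared = sum((Counter(words_a) & Counter(words_b)).values())
    return shared / max(len(words_a), len(words_b))


print(permutation_similarity("a b c", "a b d e"))
print(permutation_similarity("Java Programming", "Programming in Java"))
```

For the slide's example the two labels share the words a and b, and the longer label has four words, giving 2/4 = 0.5. Word order plays no role, which is what makes this a permutation measure rather than an exact match.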
20
20 Node label similarity (Cont’d)
Example: node labels “electric chair” and “committee chair”: 1/2 = 0.5. Meaningful?
Semantic similarity
21
21 Node label similarity – Inner nodes vs. leaf nodes
Inner nodes: class-oriented
– Inner node labels can be classes
– Classes are located in a taxonomy tree
– Taxonomic class similarity measures
Leaf nodes: type-oriented
– Address, currency, date, price and so on
– Type similarity measures (local similarity measures)
22
22 Node label similarity
Non-Semantic Matching
– Exact String Matching (both inner and leaf nodes)
– String Permutation (both inner and leaf nodes)
Semantic Matching
– Taxonomic Class Similarity (inner nodes)
– Type Similarity (leaf nodes)
23
23 Inner node similarity – Partonomy trees
[Figure: two course partonomy trees. t1 is rooted at Distributed Programming with arcs Credit (3), Duration (2 months), Textbook (“Introduction to Distributed Programming”) and Tuition ($800); arc weights shown: 0.2, 0.1, 0.3, 0.4. t2 is rooted at Object-Oriented Programming with arcs Credit (3), Duration (3 months), Textbook (“Object-Oriented Programming Essentials”) and Tuition ($1000); arc weights shown: 0.1, 0.5, 0.2.]
24
24 Inner node similarity – Taxonomy tree
[Figure: taxonomy tree rooted at Programming Techniques, with classes General, Applicative Programming, Automatic Programming, Concurrent Programming, Sequential Programming, Object-Oriented Programming, Distributed Programming and Parallel Programming; arc weights shown: 0.5, 0.2, 0.7, 0.4, 0.5, 0.3, 0.9.]
Arc weights
– at the same level of a subtree: do not need to add up to 1
– assigned by human experts or extracted from documents
A. Singh, Weighted tree metadata extraction. MCS Thesis (in preparation), University of New Brunswick, Fredericton, Canada, 2005.
25
25 Inner node similarity – Taxonomic class similarity
[Figure: the taxonomy tree rooted at Programming Techniques, with red arrows drawn from two classes up the tree; arc weights shown: 0.5, 0.2, 0.7, 0.4, 0.5, 0.3, 0.9.]
– red arrows stop at the nearest common ancestor
– the product of subsumption factors on the two paths = 0.018
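The rule on this slide (multiply the subsumption factors along both paths up to the nearest common ancestor) can be sketched as follows. The parent links and factors in `PARENT` are illustrative assumptions only: the slide text does not preserve which class hangs under which, nor which weight belongs to which arc, so this sketch does not attempt to reproduce the slide's 0.018. The names `PARENT`, `path_to_root` and `taxonomic_similarity` are hypothetical.

```python
from math import prod

# class -> (parent class, subsumption factor on the arc to the parent).
# Hypothetical structure and factors, chosen for illustration only.
PARENT = {
    "Concurrent Programming": ("Programming Techniques", 0.5),
    "Sequential Programming": ("Programming Techniques", 0.5),
    "Distributed Programming": ("Concurrent Programming", 0.4),
    "Parallel Programming": ("Concurrent Programming", 0.6),
    "Object-Oriented Programming": ("Sequential Programming", 0.9),
}


def path_to_root(node):
    """The node followed by all of its ancestors, ending at the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node][0]
        path.append(node)
    return path


def taxonomic_similarity(a, b):
    """Product of the factors on both paths to the nearest common ancestor."""
    ancestors_of_b = set(path_to_root(b))
    common = next(n for n in path_to_root(a) if n in ancestors_of_b)

    def factors_up_to(node, stop):
        factors = []
        while node != stop:
            node, factor = PARENT[node]
            factors.append(factor)
        return factors

    return prod(factors_up_to(a, common) + factors_up_to(b, common))


print(taxonomic_similarity("Distributed Programming", "Parallel Programming"))
print(taxonomic_similarity("Distributed Programming", "Object-Oriented Programming"))
```

Because every factor is at most 1, classes whose nearest common ancestor lies far up the tree accumulate more factors and end up less similar than siblings, and a class compared with itself yields the empty product 1.0.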
26
26 Inner node similarity – Integration of taxonomy tree into partonomy trees
Taxonomy tree
– extra taxonomic class similarity measures
Semantic similarity without
– changing our partonomy similarity algorithm
– losing taxonomic semantic similarity
Encode (subsections of) the taxonomy tree into partonomy trees
www.teclantic.ca
27
27 Inner node similarity – Encoding taxonomy tree into partonomy tree
[Figure: encoded taxonomy tree rooted at Programming Techniques, with classes General, Applicative Programming, Automatic Programming, Concurrent Programming, Sequential Programming, Object-Oriented Programming, Distributed Programming and Parallel Programming, and * leaves; arc weights shown: 0.1, 0.3, 0.15, 0.4, 0.6, 0.2, 0.15.]
28
28 Inner node similarity – Encoding taxonomy tree into partonomy tree (Cont’d)
[Figure: encoded partonomy trees t1 and t2. Each is rooted at course, with arcs Classification, Credit, Duration, Title and Tuition; the Classification arc (weight 0.65) leads to a taxonomy subtree rooted at Programming Techniques that includes Concurrent Programming, Distributed Programming, Parallel Programming, Sequential Programming and Object-Oriented Programming, with * leaves. Leaf values shown: $800, 2 months, 3 (t1) and $1000, 3 months, 3 (t2). Other arc weights shown: 0.05, 0.1, 0.15, 0.05, 0.2, 0.05, 1.0, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2.]
29
29 Leaf node similarity (local similarity)
Different leaf node types call for different type similarity measures
Various leaf node types
– “Price”-typed leaf nodes, e.g. for buyer ≤$800 [0, Max], for seller ≥$1000 [Min, ∞]
30
30 Leaf node similarity (local similarity)
Example: “Date”-typed leaf nodes
$$
DS(d_1, d_2) =
\begin{cases}
0.0 & \text{if } |d_1 - d_2| \ge 365, \\
1 - \dfrac{|d_1 - d_2|}{365} & \text{otherwise.}
\end{cases}
$$
[Figure: two Project trees. t1 has arcs start_date (May 3, 2004) and end_date (Nov 3, 2004); t2 has arcs start_date (Jan 20, 2004) and end_date (Feb 18, 2005); all arc weights are 0.5. Value shown: 0.74.]
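The date measure is simple enough to try directly on the slide's two Project trees. This is a minimal sketch in which the function name `date_similarity` and the `window` parameter are hypothetical, and |d1 – d2| is assumed to mean the difference in calendar days.

```python
from datetime import date


def date_similarity(d1, d2, window=365):
    """DS(d1, d2): 0.0 if the dates are `window` or more days apart,
    otherwise 1 - |d1 - d2| / window."""
    gap = abs((d1 - d2).days)
    return 0.0 if gap >= window else 1.0 - gap / window


# Leaf pairs taken from the two Project trees on the slide.
start_sim = date_similarity(date(2004, 5, 3), date(2004, 1, 20))   # 104 days apart
end_sim = date_similarity(date(2004, 11, 3), date(2005, 2, 18))    # 107 days apart

print(round(start_sim, 4), round(end_sim, 4))
```

The two leaf-level values come out at roughly 0.715 and 0.707. The 0.74 shown on the slide is presumably the similarity of the whole Project trees rather than a single DS value; how the leaf values are combined is not stated on this slide.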
31
31 Conclusion
Arc-labeled and arc-weighted trees
Partonomy similarity algorithm
– Traverses trees top-down
– Computes similarity bottom-up
Node label similarity
– Exact string matching (inner and leaf nodes)
– String permutation (inner and leaf nodes)
– Taxonomic class similarity (inner nodes): taxonomy tree; encoding taxonomy tree into partonomy tree
– Type similarity (leaf nodes): date-typed similarity measures
32
32 Questions?