
Lu Yang, Marcel Ball, Virendra C. Bhavsar and Harold Boley. BASeWEB, May 8, 2005






Presentation Transcript


  1. Weighted Partonomy-Taxonomy Trees with Local Similarity Measures for Semantic Buyer-Seller Match-Making. Lu Yang, Marcel Ball, Virendra C. Bhavsar and Harold Boley. BASeWEB, May 8, 2005

  2. Outline
  • Introduction
  • Motivation
  • Partonomy Similarity Algorithm
    • Tree representation
    • Tree simplicity
    • Partonomy similarity
  • Experimental Results
  • Node Label Similarity
    • Inner-node similarity
    • Leaf-node similarity
  • Conclusion

  3. Introduction
  • Buyer-seller matching in e-business and e-learning
  [Figure: a multi-agent system: users with web browsers connect through user agents and user profiles to a main server, which routes requests to cafes (Cafe-1 … Cafe-n) and their matchers (Matcher 1 … Matcher n), linked to other sites over the network]

  4. Introduction
  • An e-learning scenario
  [Figure: learners (Learner 1 … Learner n) and course providers (Course Provider 1 … Course Provider m) connected through a cafe and a matcher]
  H. Boley, V. C. Bhavsar, D. Hirtle, A. Singh, Z. Sun and L. Yang, A match-making system for learners and learning objects. Learning & Leading with Technology, International Society for Technology in Education, Eugene, OR, 2005 (to appear).

  5. Motivation
  • Metadata for buyers and sellers
    • Keywords/keyphrases
    • Trees
  • Tree similarity

  6. Tree representation
  • Characteristics of our trees
    • Node-labeled, arc-labeled and arc-weighted
    • Arcs are labeled in lexicographical order
    • Weights sum to 1
  [Figure: example tree with root Car and arcs Make (0.3) → Ford, Model (0.2) → Explorer, Year (0.5) → 2002]
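A minimal sketch of the node-labeled, arc-labeled, arc-weighted trees characterized above; the class names `Arc` and `Tree` are illustrative, not from the paper. The constructor enforces the two invariants from the slide: lexicographical arc order and weights summing to 1.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Arc:
    label: str      # arc label, e.g. "Make"
    weight: float   # arc weight; sibling weights sum to 1
    target: "Tree"  # subtree (or leaf) below the arc

@dataclass
class Tree:
    label: str      # node label, e.g. "Car"
    arcs: List[Arc] = field(default_factory=list)

    def __post_init__(self):
        # keep arcs in lexicographical order of their labels
        self.arcs.sort(key=lambda a: a.label)
        if self.arcs:  # sibling arc weights must sum to 1
            assert abs(sum(a.weight for a in self.arcs) - 1.0) < 1e-9

# The Car example from the slide
car = Tree("Car", [
    Arc("Year", 0.5, Tree("2002")),
    Arc("Make", 0.3, Tree("Ford")),
    Arc("Model", 0.2, Tree("Explorer")),
])
```

Even though the arcs were supplied out of order, the tree stores them as Make, Model, Year.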

  7. Tree representation – Serialization of trees
  • Weighted Object-Oriented RuleML
  • XML attributes for arc weights and subelements for arc labels

  <Cterm>
    <Ctor>Car</Ctor>
    <slot weight="0.3"><Ind>Make</Ind><Ind>Ford</Ind></slot>
    <slot weight="0.2"><Ind>Model</Ind><Ind>Explorer</Ind></slot>
    <slot weight="0.5"><Ind>Year</Ind><Ind>2002</Ind></slot>
  </Cterm>

  Tree serialization in WOO RuleML
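A sketch of producing the WOO RuleML fragment above with the Python standard library; the element names (`Cterm`, `Ctor`, `slot`, `Ind`) follow the slide, while the helper name `serialize` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

def serialize(label, slots):
    """Build a WOO RuleML Cterm.
    slots: list of (arc_label, weight, leaf_label) in lexicographic order."""
    cterm = ET.Element("Cterm")
    ET.SubElement(cterm, "Ctor").text = label
    for arc_label, weight, leaf in slots:
        # arc weight as an XML attribute, arc label as a subelement
        slot = ET.SubElement(cterm, "slot", weight=str(weight))
        ET.SubElement(slot, "Ind").text = arc_label
        ET.SubElement(slot, "Ind").text = leaf
    return ET.tostring(cterm, encoding="unicode")

xml = serialize("Car", [("Make", 0.3, "Ford"),
                        ("Model", 0.2, "Explorer"),
                        ("Year", 0.5, "2002")])
```

The resulting string matches the Car serialization shown on the slide, modulo whitespace.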

  8. Tree simplicity
  [Figure: example tree rooted at A, with arcs a and b (weights 0.7 and 0.3) leading to C and B; B has arcs c (0.9) → D and d (0.1) → E, and C has arcs e (0.8) → F and f (0.2) → G; the per-level degradation values are 0.9, 0.45 and 0.225, and the tree simplicity is 0.0563]
  • The deeper the leaf node, the less it contributes to the tree simplicity
  • Depth degradation index (0.9)
  • Depth degradation factor (0.5)
  • Reciprocal of tree breadth
  L. Yang, B. Sarker, V.C. Bhavsar and H. Boley, A weighted-tree simplicity algorithm for similarity matching of partial product descriptions (submitted for publication).

  9. Tree simplicity – Computation
  Š(T) = DI · DF^d, if T is a leaf node
  Š(T) = (1/m) · Σ_{j=1…m} wj · Š(Tj), otherwise
  Š(T): the simplicity value of a single tree T
  DI and DF: depth degradation index and depth degradation factor
  d: depth of a leaf node
  m: root node degree of a tree T that is not a leaf
  wj: arc weight of the jth arc below the root node of tree T
  Tj: subtree below the jth arc with arc weight wj
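A runnable sketch of this computation, under the reading that a leaf at depth d contributes DI · DF^d and an inner node of degree m scales the weighted sum of its subtree values by 1/m (the reciprocal of tree breadth). With DI = 0.9 and DF = 0.5 this reproduces the 0.0563 of the example tree; the function and variable names are illustrative.

```python
DI, DF = 0.9, 0.5  # depth degradation index and factor from the slides

def simplicity(tree, d=0):
    """tree: (label, [(arc_label, weight, subtree), ...]); d: current depth."""
    label, arcs = tree
    if not arcs:                      # leaf node at depth d
        return DI * DF ** d
    m = len(arcs)                     # root degree of this (sub)tree
    return sum(w * simplicity(sub, d + 1) for _, w, sub in arcs) / m

# The example tree from the slide: A with arcs to B and C, each with two leaves
leaf = lambda name: (name, [])
B = ("B", [("c", 0.9, leaf("D")), ("d", 0.1, leaf("E"))])
C = ("C", [("e", 0.8, leaf("F")), ("f", 0.2, leaf("G"))])
A = ("A", [("a", 0.7, C), ("b", 0.3, B)])

s = simplicity(A)  # 0.05625, i.e. the 0.0563 shown on the slide
```

The leaves contribute 0.9 · 0.5² = 0.225 each, B and C each average to 0.1125, and the root halves the weighted sum again, giving 0.05625.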

  10. Partonomy similarity – Simple trees
  [Figure: two simple trees t and t′, rooted at Car (the root of t′ alternatively labeled House), each with arcs Make and Model weighted 0.7 and 0.3; the leaves are Ford and Mustang in t, Ford and Escape in t′. Inner-node similarities are 1 or 0, and leaf-node similarities are 1 or 0]

  11. Partonomy similarity – Complex trees
  [Figure: two LOM trees t and t′ with arcs general and technical (weights 0.7/0.3) vs. educational, general and technical (weights ≈0.3333 each); the subtrees tec-set, gen-set and edu-set carry arcs such as format, platform, language and title with leaves like "Basic Oracle", "Introduction to Oracle", HTML, WinXP, en and the don't-care label *]
  The similarity is Σ_i A(si) · (wi + w′i)/2, where si is the similarity of the subtrees below the ith matching arc, wi and w′i are that arc's weights in the two trees, and A is an adjustment function with A(si) ≥ si (rather than the plain Σ_i si · (wi + w′i)/2).
  *: Don't Care

  12. Partonomy similarity – Main functions
  • Three main functions (Relfun)
    • Treesim(t,t′): recursively compares any (unordered) pair of trees; parameters N and i
    • Treemap(l,l′): recursively maps two lists, l and l′, of labeled and weighted arcs; descends into identically labeled subtrees
    • Treeplicity(i,t): decreases the similarity with decreasing simplicity
  V. C. Bhavsar, H. Boley and L. Yang, A weighted-tree similarity algorithm for multi-agent systems in e-business environments. Computational Intelligence, 2004, 20(4):584-602.
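A much-simplified Python sketch of the recursion that Treesim and Treemap perform together: identically labeled arcs are descended into, and each subtree similarity is weighted by the averaged arc weights (w + w′)/2. The full Relfun algorithm also applies the adjustment A(s) and calls Treeplicity for arcs present in only one tree; both are omitted here, so the numbers below are not the paper's exact results.

```python
def treesim(t1, t2):
    """t1, t2: (label, {arc_label: (weight, subtree)}). Simplified sketch."""
    label1, arcs1 = t1
    label2, arcs2 = t2
    if not arcs1 and not arcs2:            # two leaves: exact label match
        return 1.0 if label1 == label2 else 0.0
    total = 0.0
    for arc in set(arcs1) & set(arcs2):    # identically labeled arcs only
        (w1, sub1), (w2, sub2) = arcs1[arc], arcs2[arc]
        # weight each subtree similarity by the averaged arc weights
        total += treesim(sub1, sub2) * (w1 + w2) / 2
    return total

t  = ("Car", {"Make": (0.5, ("Ford", {})), "Year": (0.5, ("2002", {}))})
t_ = ("Car", {"Make": (0.5, ("Ford", {})), "Year": (0.5, ("1998", {}))})
print(treesim(t, t_))  # prints 0.5: Make matches, Year differs
```

Identical trees score 1.0; here the matching Make arc contributes its full averaged weight 0.5 and the mismatching Year arc contributes nothing.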

  13. Experimental Results
  [Table: similarity of simple auto trees with arcs make and year. Experiment 1: a pair t1, t2 differing in both leaves (chrysler vs. ford, 2002 vs. 1998) with weights 0.5/0.5 scores 0.1. Experiment 2: a pair sharing make → ford but differing in year (2002 vs. 1998), with weights 1.0/0.0 and 0.0/1.0, scores 0.55; an identical pair t3, t4 (make → ford, year → 2002) scores 1.0]

  14. Similarity of simple trees (Cont'd)
  [Table: auto trees with arcs make, model and year over the leaves ford, mustang/explorer and 2000. Experiment 3: with weights 0.45/0.45/0.1 the pair t1, t2 scores 0.2823; with weights 0.05/0.05/0.9 the pair t3, t4 scores 0.1203]

  15. Similarity of identical tree structures
  [Table: structurally identical auto trees with arcs make, model and year over the leaves ford, explorer and 2002 vs. 1999. Experiment 4: with weights 0.5/0.2/0.3 the pair t1, t2 scores 0.55; with near-uniform weights (≈0.3333 each) the pair t3, t4 scores 0.7000]

  16. Similarity of complex trees
  [Figure: trees t and t′ rooted at A with arcs b, c, d (weights ≈0.3333 each) to subtrees B, C, D, whose arcs (b1 … b4, c1 … c4, d1, with weights such as 0.25, 0.3333, 0.5 and 1.0) lead to leaves such as B1 … B4, C1, C3, C4, D1, E and F; six leaf variations yield the similarities 0.9316, 0.8996, 0.9230, 0.9647, 0.9793 and 0.8160]

  17. Similarity of complex trees (Cont'd)
  [Figure: the same pair of trees t and t′ with slightly different leaf sets (including E); the six variations now yield 0.9626, 0.9314, 0.9499, 0.9824, 0.9902 and 0.8555]

  18. Similarity of complex trees (Cont'd)
  [Figure: the same pair of trees with one inner node of t replaced by the don't-care label *; the six variations now yield 0.9697, 0.9530, 0.9641, 0.9844, 0.9910 and 0.9134]

  19. Node label similarity
  • For inner nodes and leaf nodes
  • Exact string matching: binary result, 0.0 or 1.0
  • Permutation of strings, e.g. "Java Programming" vs. "Programming in Java"
  similarity = (number of identical words) / (maximum length of the two strings)
  Example: for the two node labels "a b c" and "a b d e", the similarity is 2/4 = 0.5.
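The string-permutation measure above is a one-liner: count the shared words and divide by the longer label's word count. The function name is illustrative.

```python
def label_similarity(a: str, b: str) -> float:
    """Number of identical words over the maximum word count of a and b."""
    words_a, words_b = a.split(), b.split()
    shared = len(set(words_a) & set(words_b))
    return shared / max(len(words_a), len(words_b))

print(label_similarity("a b c", "a b d e"))    # prints 0.5 (2 shared / 4)
```

For the other slide example, "Java Programming" vs. "Programming in Java" shares two words out of a maximum length of three, giving 2/3.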

  20. Node label similarity (Cont'd)
  Example: the node labels "electric chair" and "committee chair" share one of at most two words, giving 1/2 = 0.5. Is that meaningful?
  • Semantic similarity is needed

  21. Node label similarity – Inner nodes vs. leaf nodes
  • Inner nodes: class-oriented
    • Inner node labels can be classes
    • Classes are located in a taxonomy tree
    • Taxonomic class similarity measures
  • Leaf nodes: type-oriented
    • Address, currency, date, price and so on
    • Type similarity measures (local similarity measures)

  22. Node label similarity
  • Non-semantic matching
    • Exact string matching (both inner and leaf nodes)
    • String permutation (both inner and leaf nodes)
  • Semantic matching
    • Taxonomic class similarity (inner nodes)
    • Type similarity (leaf nodes)

  23. Inner node similarity – Partonomy trees
  [Figure: two course trees t1 and t2 with arcs Title, Credit, Duration, Tuition and Textbook under differing arc weights (0.1 to 0.5); t1 describes "Distributed Programming" and t2 "Object-Oriented Programming", with tuitions $800 and $1000, durations 2 and 3 months, 3 credits each, and the textbooks "Introduction to Distributed Programming" and "Object-Oriented Programming Essentials"]

  24. Inner node similarity – Taxonomy tree
  [Figure: taxonomy rooted at Programming Techniques with subclasses General, Applicative Programming, Automatic Programming, Sequential Programming, Concurrent Programming and Object-Oriented Programming; Concurrent Programming has the subclasses Parallel Programming and Distributed Programming; the arcs carry subsumption weights such as 0.2, 0.3, 0.4, 0.5, 0.7 and 0.9]
  • Arc weights
    • At the same level of a subtree they do not need to add up to 1
    • Assigned by human experts or extracted from documents
  A. Singh, Weighted tree metadata extraction. MCS Thesis (in preparation), University of New Brunswick, Fredericton, Canada, 2005.

  25. Inner node similarity – Taxonomic class similarity
  [Figure: the same taxonomy tree, with red arrows tracing the upward paths from Distributed Programming and Object-Oriented Programming]
  • The red arrows stop at the nearest common ancestor
  • The product of the subsumption factors on the two paths = 0.018
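A sketch of this taxonomic class similarity: walk both classes up to their nearest common ancestor and multiply the subsumption factors along the two paths. The class names mirror the slide, but the factor values below are illustrative placeholders (the slide's exact per-arc factors are not recoverable), so the result here is not the slide's 0.018.

```python
# class -> (parent, subsumption factor of the connecting arc); factors assumed
TAXONOMY = {
    "Distributed Programming": ("Concurrent Programming", 0.9),
    "Parallel Programming": ("Concurrent Programming", 0.3),
    "Concurrent Programming": ("Programming Techniques", 0.5),
    "Object-Oriented Programming": ("Programming Techniques", 0.2),
}

def ancestors(cls):
    """Path from cls up to the taxonomy root, inclusive."""
    path = [cls]
    while cls in TAXONOMY:
        cls = TAXONOMY[cls][0]
        path.append(cls)
    return path

def class_similarity(c1, c2):
    path1, path2 = ancestors(c1), ancestors(c2)
    common = next(c for c in path1 if c in path2)  # nearest common ancestor
    product = 1.0
    for path in (path1, path2):
        for cls in path[:path.index(common)]:      # arcs below the ancestor
            product *= TAXONOMY[cls][1]
    return product
```

With the assumed factors, Distributed Programming vs. Object-Oriented Programming multiplies 0.9 and 0.5 on one path with 0.2 on the other, giving 0.09.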

  26. Inner node similarity – Integration of the taxonomy tree into partonomy trees
  • A taxonomy tree requires extra taxonomic class similarity measures
  • Semantic similarity without
    • changing our partonomy similarity algorithm
    • losing taxonomic semantic similarity
  → Encode (subsections of) the taxonomy tree into the partonomy trees
  www.teclantic.ca

  27. Inner node similarity – Encoding the taxonomy tree into a partonomy tree
  [Figure: the encoded taxonomy tree: Programming Techniques with weighted arcs (0.1 to 0.3) to General, Applicative Programming, Automatic Programming, Sequential Programming, Concurrent Programming and Object-Oriented Programming, each ending in the don't-care leaf *; Concurrent Programming splits into Distributed Programming (0.4) and Parallel Programming (0.6)]

  28. Inner node similarity – Encoding the taxonomy tree into a partonomy tree (Cont'd)
  [Figure: the two encoded partonomy trees t1 and t2: course trees with arcs Title, Credit, Duration, Tuition and a Classification/taxonomy arc of weight 0.65 whose subtree embeds the relevant slice of the taxonomy (Programming Techniques down to Distributed Programming in t1 and down to Object-Oriented Programming in t2), with don't-care leaves * elsewhere]

  29. Leaf node similarity (local similarity)
  • Different leaf node types → different type similarity measures
  • Various leaf node types
    • "Price"-typed leaf nodes, e.g. for a buyer ≤ $800, i.e. the range [0, Max]; for a seller ≥ $1000, i.e. the range [Min, ∞]

  30. Leaf node similarity (local similarity) – Example: "Date"-typed leaf nodes
  DS(d1, d2) = 0.0 if |d1 − d2| ≥ 365; otherwise DS(d1, d2) = 1 − |d1 − d2| / 365
  [Figure: two Project trees t1 and t2 with arcs start_date (0.5) and end_date (0.5) over the dates May 3, 2004, Nov 3, 2004, Jan 20, 2004 and Feb 18, 2005, yielding an overall similarity of 0.74]
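The date-typed local measure DS above is directly runnable: dates a year or more apart score 0.0, and otherwise the similarity decreases linearly with the gap in days. The function name is illustrative.

```python
from datetime import date

def date_similarity(d1: date, d2: date) -> float:
    """DS(d1, d2): 0.0 if >= 365 days apart, else 1 - gap/365."""
    gap = abs((d1 - d2).days)
    return 0.0 if gap >= 365 else 1.0 - gap / 365

# Dates from the slide's Project trees
print(date_similarity(date(2004, 5, 3), date(2004, 11, 3)))   # 184 days: ~0.496
print(date_similarity(date(2004, 1, 20), date(2005, 2, 18)))  # prints 0.0
```

The second pair is 395 days apart, so it falls in the zero branch; the overall 0.74 on the slide comes from combining the per-arc DS values through the weighted tree similarity.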

  31. Conclusion
  • Arc-labeled and arc-weighted trees
  • Partonomy similarity algorithm
    • Traverses trees top-down
    • Computes similarity bottom-up
  • Node label similarity
    • Exact string matching (inner and leaf nodes)
    • String permutation (inner and leaf nodes)
    • Taxonomic class similarity (inner nodes)
      • Taxonomy tree
      • Encoding the taxonomy tree into partonomy trees
    • Type similarity (leaf nodes)
      • Date-typed similarity measures

  32. Questions?
