
Discriminative Probabilistic Models for Relational Data


Presentation Transcript


  1. Discriminative Probabilistic Models for Relational Data Ben Taskar, Pieter Abbeel, Daphne Koller

  2. Traditional statistical classification methods • Deal only with ‘flat’ data – IID • In many supervised learning tasks, the entities to be labeled are related to each other in complex ways, and their labels are not independent • This dependence is an important source of information for achieving better classification

  3. Collective Classification • Rather than classifying each entity separately • Simultaneously decide on the class labels of all the entities together • Explicitly take advantage of the correlation between the labels of related entities

  4. Undirected vs. directed graphical models • Undirected graphical models do not impose an acyclicity constraint, while directed ones need acyclicity to define a coherent generative model • Undirected graphical models are well suited for discriminative training, achieving better classification accuracy than generative training

  5. Our Hypertext Relational Domain [Schema diagram: two Doc entities, each with a Label attribute and word attributes HasWord1 ... HasWordk, connected by a Link entity with From and To references]

  6. Schema • A set of entity types • Attributes of each entity type • Content attributes E.X • Label attributes E.Y • Reference attributes E.R

  7. Instantiation • Provides a set of entities I(E) for each entity type E • Specifies the values of all the attributes of the entities, I.x, I.y, I.r • I.r is the instantiation graph, which is called the relational skeleton in PRMs
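As a concrete illustration of slides 6 and 7, here is a minimal sketch of how the hypertext schema and one instantiation might be encoded in Python. The entity and attribute names (Doc, Link, HasWord, Label, From, To) follow the domain on slide 5; the dictionary encoding itself is hypothetical and not from the original slides.

# Hypothetical encoding of the hypertext schema from slide 5.
# Each entity type lists its content (X), label (Y) and reference (R) attributes.
schema = {
    "Doc":  {"content": ["HasWord1", "HasWordk"], "label": ["Label"], "refs": []},
    "Link": {"content": [], "label": [], "refs": ["From", "To"]},
}

# An instantiation I: a set of entities per type plus their attribute values.
# The reference attributes I.r form the instantiation graph (relational skeleton).
instantiation = {
    "Doc": {
        "doc1": {"HasWord1": 1, "HasWordk": 0, "Label": "faculty"},
        "doc2": {"HasWord1": 0, "HasWordk": 1, "Label": "student"},
    },
    "Link": {
        "link1": {"From": "doc1", "To": "doc2"},
    },
}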

  8. Markov Network • Qualitative component – Cliques • Quantitative component – Potentials

  9. Cliques • A set of nodes c in the graph G such that every pair of nodes in c is connected by an edge in G

  10. Potentials • The potential for a clique c defines the compatibility between the values of the variables in the clique • A log-linear combination of a set of features
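The formula on this slide was not captured in the transcript; the standard log-linear form used in the paper is, up to notation,

\phi_c(V_c) = \exp\Big( \sum_i w_i \, f_i(V_c) \Big)

where the f_i are feature functions over the values of the clique variables V_c and the w_i are their weights.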

  11. Probability in a Markov Network • Given the values of all nodes in the Markov network, the joint distribution is the normalized product of the clique potentials
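The equation itself did not survive extraction; the standard Markov network distribution the slide refers to is

P(v) = \frac{1}{Z} \prod_{c \in C(G)} \phi_c(v_c), \qquad Z = \sum_{v'} \prod_{c \in C(G)} \phi_c(v'_c),

where C(G) is the set of cliques of the graph G and Z is the partition function.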

  12. Conditional Markov Network • Specifies the probability of a set of target variables Y given a set of conditioning variables X
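Again the formula is missing from the transcript; a conditional Markov network defines, approximately in the paper's notation,

P(y \mid x) = \frac{1}{Z(x)} \prod_{c \in C(G)} \phi_c(x_c, y_c), \qquad Z(x) = \sum_{y'} \prod_{c \in C(G)} \phi_c(x_c, y'_c),

so the partition function depends on the conditioning variables x but never requires summing over joint configurations of x and y.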

  13. Relational Markov Network (RMN) • Specifies the conditional probability over all the labels of all the entities in the instantiation, given the relational structure and the content attributes • An extension of conditional Markov networks with a compact definition over a relational data set

  14. Relational clique template • F – a set of entity variables (From) • W – a condition on the attributes of the entity variables (Where) • S – a subset of the attributes (content and label attributes) of the entity variables (Select)

  15. Relationship to SQL query SELECT doc1.Category, doc2.Category FROM Doc doc1, Doc doc2, Link link WHERE link.From = doc1.Key AND link.To = doc2.Key
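To make the (F, W, S) definition of slide 14 and the SQL analogy above concrete, here is a minimal illustrative sketch in Python; it is not from the slides, and the class name CliqueTemplate and its fields are hypothetical.

from dataclasses import dataclass

@dataclass
class CliqueTemplate:
    entity_vars: dict   # F: variable name -> entity type
    where: str          # W: condition on the variables' reference attributes
    select: list        # S: content/label attributes whose values form the clique

# Template corresponding to the SQL query on slide 15:
# the category labels of every pair of linked pages.
link_template = CliqueTemplate(
    entity_vars={"doc1": "Doc", "doc2": "Doc", "link": "Link"},
    where="link.From == doc1.Key and link.To == doc2.Key",
    select=["doc1.Category", "doc2.Category"],
)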

  16. Potentials • Potentials are defined at the level of relational clique templates • Cliques of the same relational clique template share the same potential function

  17. Unrolling the RMN • Given an instantiation of the relational schema, unroll the RMN as follows • Find all the cliques in the instantiation to which a relational clique template applies • The potential of a clique is that of the relational clique template from which it was instantiated

  18. [Example of an unrolled network: Doc1 connected to Doc2 through link1, and Doc2 connected to Doc3 through link2]
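A rough sketch, assuming the simple dictionary encoding used earlier (not the authors' code), of how slide 17's unrolling plays out on this example: every link instance satisfying the template's Where condition produces one clique over the two page labels, and all of those cliques reuse the single potential attached to the template.

links = {
    "link1": {"From": "Doc1", "To": "Doc2"},
    "link2": {"From": "Doc2", "To": "Doc3"},
}

def unroll_link_template(links):
    # One clique per link instance, over the labels of the two linked documents.
    # Every clique produced by the same template shares that template's potential.
    return [(link["From"] + ".Label", link["To"] + ".Label") for link in links.values()]

print(unroll_link_template(links))
# [('Doc1.Label', 'Doc2.Label'), ('Doc2.Label', 'Doc3.Label')]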

  19. Probability in RMN
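The formula on this slide was an image and is not in the transcript; in the paper, the unrolled RMN defines, approximately in the slides' notation,

P(I.y \mid I.x, I.r) = \frac{1}{Z(I.x, I.r)} \prod_{C} \prod_{c \in C(I)} \phi_C(I.x_c, I.y_c),

where the outer product ranges over the relational clique templates C, the inner product over the cliques c obtained by unrolling C on the instantiation, and Z(I.x, I.r) is the partition function.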


  21. Learning RMN • Given a set of relational clique templates • Estimate the feature weights w using conjugate gradient • Objective function – the product of the likelihood of the instantiation and a parameter prior • Assume a shrinkage (zero-mean Gaussian) prior over the feature weights
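The objective itself is not in the transcript; with a zero-mean Gaussian (shrinkage) prior over the weights, the log of the product of likelihood and prior is, up to an additive constant,

L(w) = \log P_w(I.y \mid I.x, I.r) - \frac{\lVert w \rVert^2}{2\sigma^2}.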

  22. Learning RMN (Cont’d) • The gradient of the objective function, as needed by conjugate gradient ascent
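The gradient formula was likewise an image; in the standard log-linear form it is

\nabla_w L = f(I.x, I.y, I.r) - E_{P_w(Y \mid I.x, I.r)}\big[ f(I.x, Y, I.r) \big] - \frac{w}{\sigma^2},

i.e. the feature counts observed in the training instantiation minus their expected counts under the current model, minus the prior term; computing the expectation is exactly the inference problem of the next slide.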

  23. Inference in RMN • Exact inference • Intractable because the network is very large and densely connected • Approximate inference • Belief propagation
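As an illustration only (not the authors' implementation), here is a minimal sum-product loopy belief propagation sketch for a pairwise network, which is the form the unrolled link cliques take; the function name and the toy potentials are hypothetical.

import numpy as np

def loopy_bp(node_pot, edge_pot, edges, iters=50):
    # Sum-product loopy BP on a pairwise network.
    # node_pot: {node: array (k,)}, edge_pot: {(i, j): array (k_i, k_j)}, edges: list of (i, j).
    msgs, neighbors = {}, {n: [] for n in node_pot}
    for (i, j) in edges:
        msgs[(i, j)] = np.ones(len(node_pot[j])) / len(node_pot[j])
        msgs[(j, i)] = np.ones(len(node_pot[i])) / len(node_pot[i])
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(iters):
        new_msgs = {}
        for (i, j) in msgs:  # update the message from node i to node j
            pot = edge_pot[(i, j)] if (i, j) in edge_pot else edge_pot[(j, i)].T
            incoming = node_pot[i].copy()
            for k in neighbors[i]:
                if k != j:
                    incoming = incoming * msgs[(k, i)]
            m = incoming @ pot  # marginalize out node i
            new_msgs[(i, j)] = m / m.sum()
        msgs = new_msgs
    beliefs = {}
    for n in node_pot:
        b = node_pot[n].copy()
        for k in neighbors[n]:
            b = b * msgs[(k, n)]
        beliefs[n] = b / b.sum()
    return beliefs

# Toy example: two linked pages with two candidate labels each.
node_pot = {"Doc1": np.array([0.7, 0.3]), "Doc2": np.array([0.4, 0.6])}
edge_pot = {("Doc1", "Doc2"): np.array([[1.0, 2.0], [2.0, 1.0]])}
print(loopy_bp(node_pot, edge_pot, edges=[("Doc1", "Doc2")]))

On the unrolled WebKB networks the graph is loopy, so the resulting beliefs are only approximate marginals; these are what the learning gradient above consumes.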

  24. Experiments • WebKB dataset • Four CS department websites • Five categories (faculty, student, project, course, other) • Bag of words on each page • Links between pages • Experimental setup • Trained on three universities • Tested on the fourth

  25. Flat Models • Based only on the text content of the web pages • Incorporate meta-data
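A flat text-only baseline of this kind is essentially a bag-of-words classifier applied to each page in isolation; a minimal sketch with scikit-learn, using made-up toy strings rather than WebKB data, might look like this.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for page text and labels; the real experiments use WebKB pages.
pages = ["homework syllabus lecture notes", "advisor publications teaching", "thesis advisor coursework"]
labels = ["course", "faculty", "student"]

flat_model = make_pipeline(CountVectorizer(binary=True), LogisticRegression(max_iter=1000))
flat_model.fit(pages, labels)
print(flat_model.predict(["lecture syllabus and homework"]))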

  26. Relational model • Introduce a relational clique template over the labels of two pages that are linked [Clique diagram: Doc1 – Link – Doc2]
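In log-linear form (standard, not transcribed from the slide), such a link clique contributes one weight per pair of page categories,

\phi_{link}(y_1, y_2) = \exp(w_{y_1, y_2}),

so a large w_{faculty, student} would, for instance, reward labelings in which faculty pages link to student pages.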

  27. Relational model (Cont’d) • A relational clique template over the label of a section and the label of the page it is on • A relational clique template over the label of the section containing a link and the label of the target page


  29. Discriminative vs. Generative • Exists+Naïve Bayes: a complete generative model proposed by Getoor et al. • Exists+Logistic: uses logistic regression for the conditional probability distribution of the page label given the words • Link: a fully discriminatively trained model

  30. Thank You! Guohua Hao
