Discriminative Probabilistic Models for Relational Data

Ben Taskar, Pieter Abbeel, Daphne Koller

Traditional statistical classification methods
Deal only with ‘flat’ data, where instances are assumed to be IID
In many supervised learning tasks, the entities to be labeled are related to each other in complex ways and their labels are not independent
This dependence is an important source of information for achieving better classification
4/30/2019 Guohua Hao

Collective Classification
Rather than classifying each entity separately, decide the class labels of all the entities simultaneously
Explicitly take advantage of the correlations between the labels of related entities

Undirected vs. directed graphical models
Undirected graphical models do not impose an acyclicity constraint, whereas directed models need acyclicity to define a coherent generative model
Undirected graphical models are well suited to discriminative training, which achieves better classification accuracy than generative training

Our Hypertext Relational Domain
[Figure: schema of the hypertext domain. Each Doc entity has a Label attribute and word attributes HasWord1 ... HasWordk; each Link entity has From and To references to Doc entities.]

Schema
A set of entity types
Attributes of each entity type E:
Content attributes E.X
Label attributes E.Y
Reference attributes E.R

Instantiation
Provide a set of entities I(E) for each entity type E
Specify the values of all the attributes of the entities: I.x, I.y, I.r
I.r is the instantiation graph, which is called the relational skeleton in probabilistic relational models (PRMs)
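To make the schema and instantiation concrete, here is a minimal Python sketch of the hypertext domain. The class names, field names, and sample pages are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Doc:
    key: str
    words: Set[str]               # content attributes E.X (the HasWord indicators)
    label: Optional[str] = None   # label attribute E.Y


@dataclass
class Link:
    src: str   # reference attribute From (key of a Doc)
    dst: str   # reference attribute To (key of a Doc)


@dataclass
class Instantiation:
    docs: Dict[str, Doc] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def skeleton(self) -> List[Tuple[str, str]]:
        """Return I.r, the instantiation graph, as (From, To) pairs."""
        return [(link.src, link.dst) for link in self.links]


# Hypothetical two-page instantiation.
inst = Instantiation()
inst.docs["d1"] = Doc("d1", {"professor", "research"}, "faculty")
inst.docs["d2"] = Doc("d2", {"homework", "syllabus"}, "course")
inst.links.append(Link("d1", "d2"))
print(inst.skeleton())
```

Here the `words` sets play the role of I.x, the `label` fields of I.y, and `skeleton()` of I.r.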

Markov Network
Qualitative component: cliques
Quantitative component: potentials

Cliques
A clique is a set of nodes $V_c$ in the graph G such that every pair of nodes $V_i, V_j \in V_c$ is connected by an edge in G
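The clique condition can be checked directly. The sketch below assumes an undirected graph given as a list of edges; the function name and the example graph are illustrative only.

```python
from itertools import combinations


def is_clique(nodes, edges):
    """Return True if every pair of the given nodes is joined by an edge."""
    edge_set = {frozenset(edge) for edge in edges}   # undirected: order is ignored
    return all(frozenset(pair) in edge_set for pair in combinations(nodes, 2))


# Hypothetical graph: a triangle A-B-C with D attached to C.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(is_clique(["A", "B", "C"], edges))
print(is_clique(["A", "B", "D"], edges))
```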

Potentials
The potential for a clique c defines the compatibility between the values of the variables in the clique
Each potential is a log-linear combination of a set of features
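A log-linear potential is the exponential of a weighted sum of feature values. The sketch below illustrates this for a clique over two page labels; the two features and their weights are made up for the example.

```python
import math


def potential(weights, features, values):
    """phi(v_c) = exp(sum_i w_i * f_i(v_c)) for one clique assignment v_c."""
    return math.exp(sum(w * f(values) for w, f in zip(weights, features)))


# Hypothetical features over a pair of page labels (label1, label2).
def same_label(values):
    return 1.0 if values[0] == values[1] else 0.0


def faculty_to_course(values):
    return 1.0 if values == ("faculty", "course") else 0.0


features = [same_label, faculty_to_course]
weights = [0.5, 1.2]   # assumed values, not learned

print(potential(weights, features, ("faculty", "course")))
print(potential(weights, features, ("student", "student")))
```

An assignment that fires no feature gets potential exp(0) = 1, so the weights only raise or lower compatibility relative to that baseline.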

Probability in Markov Network
Given the values of all nodes in the Markov network, the joint probability is the normalized product of the clique potentials
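The equation on this slide did not survive in the transcript. The standard Markov network definition, written with symbols chosen here ($C(G)$ for the set of cliques, $\phi_c$ for the clique potentials, $Z$ for the partition function), is:

```latex
P(\mathbf{v}) = \frac{1}{Z} \prod_{c \in C(G)} \phi_c(\mathbf{v}_c),
\qquad
Z = \sum_{\mathbf{v}'} \prod_{c \in C(G)} \phi_c(\mathbf{v}'_c)
```

Here $\mathbf{v}_c$ is the assignment $\mathbf{v}$ restricted to the nodes of clique $c$.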

Conditional Markov Network
Specifies the probability of a set of target variables Y given a set of conditioning variables X
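For a tiny network, the conditional distribution P(y | x) can be computed by brute-force enumeration, normalizing over label assignments only, so the partition function Z(x) depends on the observed x. The potentials, label set, and example pages below are hypothetical.

```python
import math
from itertools import product

LABELS = ["faculty", "course"]


def node_potential(words, label):
    # Assumed feature: the word "syllabus" supports the label "course".
    agrees = ("syllabus" in words) == (label == "course")
    return math.exp(1.0 if agrees else 0.0)


def edge_potential(label1, label2):
    # Assumed feature: linked pages tend to carry different labels.
    return math.exp(0.5 if label1 != label2 else 0.0)


def conditional(xs, edges):
    """Return P(y | x) for every joint labeling y by exhaustive enumeration."""
    scores = {}
    for ys in product(LABELS, repeat=len(xs)):
        score = 1.0
        for words, label in zip(xs, ys):
            score *= node_potential(words, label)
        for i, j in edges:
            score *= edge_potential(ys[i], ys[j])
        scores[ys] = score
    z = sum(scores.values())   # Z(x): sums over y only, with x held fixed
    return {ys: score / z for ys, score in scores.items()}


dist = conditional([{"research"}, {"syllabus"}], edges=[(0, 1)])
best = max(dist, key=dist.get)
print(best, dist[best])
```

Enumeration is exponential in the number of target variables, so this is only feasible for toy examples.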

Relational Markov Network (RMN)
Specifies the conditional probability over all the labels of all the entities in an instantiation, given the relational structure and the content attributes
Extends conditional Markov networks with a compact definition over a relational data set

Relational clique template
F: a set of entity variables (From)
W: a condition on the attributes of the entity variables (Where)
S: a subset of the attributes (content and label attributes) of the entity variables (Select)

Relationship to SQL query
SELECT doc1.Category, doc2.Category
FROM Doc doc1, Doc doc2, Link link
WHERE link.From = doc1.key AND link.To = doc2.key
[Figure: the link clique template, connecting Doc1 and Doc2 through Link]

Potentials
Potentials are defined at the level of the relational clique template
All cliques instantiated from the same relational clique template share the same potential function

Unrolling the RMN
Given an instantiation of a relational schema, unroll the RMN as follows:
Find all the cliques in the instantiation to which the relational clique templates apply
The potential of each clique is that of the relational clique template it belongs to
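Unrolling a link template amounts to running the template's query over the instantiation and creating one clique per result row, with every clique pointing at the same shared weights. The sketch below assumes three documents and two links whose endpoints are chosen here for illustration; the names and the weight value are hypothetical.

```python
# Hypothetical instantiation: link endpoints are assumed for this example.
docs = ["Doc1", "Doc2", "Doc3"]
links = [("link1", "Doc1", "Doc2"), ("link2", "Doc2", "Doc3")]


def unroll_link_template(docs, links):
    """Return one clique (a pair of label variables) per link."""
    known = set(docs)
    cliques = []
    for _name, src, dst in links:
        # WHERE: both endpoints of the link must be documents in the instantiation.
        if src in known and dst in known:
            # SELECT: the label (Category) attributes of the two linked documents.
            cliques.append((f"{src}.Category", f"{dst}.Category"))
    return cliques


# All cliques from one template share one weight vector (value assumed).
TEMPLATE_WEIGHTS = {"link": [0.5]}
cliques = unroll_link_template(docs, links)
unrolled = [("link", clique, TEMPLATE_WEIGHTS["link"]) for clique in cliques]
print(unrolled)
```

Because the weights live on the template rather than on each clique, the number of parameters does not grow with the size of the instantiation.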

[Figure: example unrolled network over Doc1, Doc2, Doc3 with link1 and link2]

Probability in RMN
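The equation on this slide is missing from the transcript. Assuming the usual RMN formulation, in which $\mathcal{C}$ is the set of relational clique templates, $C(I)$ is the set of cliques that template $C$ produces in instantiation $I$, and $\phi_C$ is the potential shared by those cliques, the conditional distribution would read:

```latex
P(I.\mathbf{y} \mid I.\mathbf{x}, I.\mathbf{r})
  = \frac{1}{Z(I.\mathbf{x}, I.\mathbf{r})}
    \prod_{C \in \mathcal{C}} \; \prod_{c \in C(I)} \phi_C(I.\mathbf{x}_c, I.\mathbf{y}_c)
```

The normalizer $Z(I.\mathbf{x}, I.\mathbf{r})$ sums the same product over all joint label assignments $I.\mathbf{y}'$.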

Learning RMN
Given a set of relational clique templates, estimate the feature weights w using conjugate gradient
Objective function: the product of the likelihood of the instantiation and a parameter prior
Assume a shrinkage prior over the feature weights
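As a worked form of this objective, assume the shrinkage prior is a zero-mean Gaussian with variance $\sigma^2$ on each weight (an assumption; the slide does not state the prior's form) and write $\mathbf{f}(\mathbf{x}, \mathbf{y})$ for the feature counts summed over all cliques. Taking the logarithm of likelihood times prior gives:

```latex
L(\mathbf{w}) = \mathbf{w} \cdot \mathbf{f}(\mathbf{x}, \mathbf{y})
  - \log Z(\mathbf{x})
  - \frac{\lVert \mathbf{w} \rVert_2^2}{2\sigma^2} + \text{const}
```

The first two terms are the conditional log-likelihood of the labels; the third penalizes large weights.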

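Learning RMN (Cont’d)
The gradient of the log objective with respect to w, which conjugate gradient uses, is the empirical feature counts minus the expected feature counts under the model, minus a term contributed by the prior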
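The gradient can be illustrated on a toy model small enough to enumerate. The sketch below assumes a zero-mean Gaussian prior with variance `sigma2`, a three-node chain, and two made-up summed features; the expectation is computed by brute force, whereas a realistic RMN would need approximate inference for this step.

```python
import math
from itertools import product

LABELS = [0, 1]
EDGES = [(0, 1), (1, 2)]   # hypothetical chain over three nodes


def features(x, y):
    """Summed feature vector f(x, y): [node agreements, edge agreements]."""
    node = sum(1.0 for xi, yi in zip(x, y) if xi == yi)
    edge = sum(1.0 for i, j in EDGES if y[i] == y[j])
    return [node, edge]


def score(w, x, y):
    return sum(wi * fi for wi, fi in zip(w, features(x, y)))


def log_z(w, x):
    return math.log(sum(math.exp(score(w, x, y))
                        for y in product(LABELS, repeat=len(x))))


def objective(w, x, y, sigma2=1.0):
    """Conditional log-likelihood plus the log of the assumed Gaussian prior."""
    return score(w, x, y) - log_z(w, x) - sum(wi * wi for wi in w) / (2 * sigma2)


def gradient(w, x, y, sigma2=1.0):
    """Empirical counts minus expected counts minus w / sigma2."""
    lz = log_z(w, x)
    expected = [0.0] * len(w)
    for y2 in product(LABELS, repeat=len(x)):
        p = math.exp(score(w, x, y2) - lz)   # P_w(y2 | x)
        expected = [e + p * f for e, f in zip(expected, features(x, y2))]
    empirical = features(x, y)
    return [f - e - wi / sigma2 for f, e, wi in zip(empirical, expected, w)]


print(gradient([0.3, -0.2], x=(0, 1, 1), y=(0, 1, 1)))
```

A finite-difference comparison against `objective` is a quick way to confirm that the analytic gradient matches.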

Inference in RMN
Exact inference: intractable because the unrolled network is very large and densely connected
Approximate inference: belief propagation
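Below is a minimal sketch of sum-product belief propagation on a pairwise Markov network, assuming node and edge potentials are given as plain lists. The function name, data layout, and example potentials are hypothetical. On a tree the beliefs equal the exact marginals; on a graph with cycles they are only approximations.

```python
def loopy_bp(n_states, node_pot, edge_pot, iters=50):
    """node_pot[i][s] is the potential of node i in state s;
    edge_pot[(i, j)][s][t] is the potential of x_i = s, x_j = t."""
    nbrs = {i: [] for i in node_pot}
    msgs = {}   # msgs[(i, j)][t]: message from i to j about j's state t
    for i, j in edge_pot:
        nbrs[i].append(j)
        nbrs[j].append(i)
        msgs[(i, j)] = [1.0] * n_states
        msgs[(j, i)] = [1.0] * n_states

    def pair(i, j, s, t):
        # Look up the potential regardless of how the edge was oriented.
        if (i, j) in edge_pot:
            return edge_pot[(i, j)][s][t]
        return edge_pot[(j, i)][t][s]

    for _ in range(iters):
        new_msgs = {}
        for i, j in msgs:
            out = []
            for t in range(n_states):
                total = 0.0
                for s in range(n_states):
                    term = node_pot[i][s] * pair(i, j, s, t)
                    for k in nbrs[i]:
                        if k != j:   # exclude the message coming back from j
                            term *= msgs[(k, i)][s]
                    total += term
                out.append(total)
            z = sum(out)
            new_msgs[(i, j)] = [v / z for v in out]   # normalize for stability
        msgs = new_msgs

    beliefs = {}
    for i in node_pot:
        b = list(node_pot[i])
        for k in nbrs[i]:
            b = [bv * mv for bv, mv in zip(b, msgs[(k, i)])]
        z = sum(b)
        beliefs[i] = [v / z for v in b]
    return beliefs


# Hypothetical chain of three pages with two labels each.
node_pot = {0: [2.0, 1.0], 1: [1.0, 1.0], 2: [1.0, 3.0]}
agree = [[2.0, 1.0], [1.0, 2.0]]
beliefs = loopy_bp(2, node_pot, {(0, 1): agree, (1, 2): agree})
print(beliefs)
```

This version updates all messages in parallel for a fixed number of iterations; practical implementations usually add a convergence check.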

Experiments
WebKB dataset
Four CS department websites
Five categories (faculty, student, project, course, other)
Bag of words on each page
Links between pages
Experimental setup
Trained on three universities
Tested on the fourth
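The train-on-three, test-on-one setup can be sketched as a leave-one-out loop over the four sites. The university names below are placeholders, and rotating the held-out site across all four is an assumption; the slide only says that three are used for training and the fourth for testing.

```python
# Placeholder names; the slide does not list the four universities.
universities = ["univ_a", "univ_b", "univ_c", "univ_d"]


def leave_one_out_splits(schools):
    """Yield (training schools, held-out test school) pairs."""
    for held_out in schools:
        train = [school for school in schools if school != held_out]
        yield train, held_out


splits = list(leave_one_out_splits(universities))
for train, test in splits:
    print("train:", train, "test:", test)
```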

Flat Models
Based only on the text content of the web pages
Incorporate meta-data

Relational model
Introduce a relational clique template over the labels of two pages that are linked
[Figure: Doc1 and Doc2 connected through Link]

Relational model (Cont’d)
Relational clique template over the label of a section and the label of the page it is on
Relational clique template over the label of the section containing a link and the label of the target page

Discriminative vs. Generative
Exists+Naïve Bayes: a complete generative model proposed by Getoor et al.
Exists+Logistic: uses logistic regression for the conditional probability distribution of the page label given the words
Link: a fully discriminative model

Thank You!
