Fast Learning of Relational Dependency Networks




1 Fast Learning of Relational Dependency Networks
Oliver Schulte, Zhensong Qian, Arthur Kirkpatrick, Xiaoqian Yin, Yan Sun

2 Relational Dependency Networks
Example graph: CoffeeDr(A), Friend(A,B), gender(A), gender(B), with A, B in Person.
Structure: a directed graph; cycles are allowed.
Parents of a node = Markov blanket of the node.
Parameter = distribution of a child given its parents.
Accommodates relational autocorrelations.
Neville, J. & Jensen, D. (2007), 'Relational Dependency Networks', Journal of Machine Learning Research 8.

3 Overview
Task: learn relational dependency network structure + parameters.
Previous approaches: multiple discriminative models, learned independently (one for each predicate).
Our new approach: a single generative model, a Bayesian network, learned fast (e.g., 1 min for 1M records), plus a new closed-form transformation method that converts the Bayesian network into a relational dependency network, transforming both BN structure and BN parameters.
Main motivation for using BNs: they can be learned fast, because of closed-form parameter estimation and model evaluation. Another motivation: interpretability (see Lowd and Davis).

4 From BN Structure to DN Structure
Example graph: CoffeeDr(A), Friend(A,B), gender(A), gender(B).
Solid arrows = Bayesian network; solid + dashed arrows = dependency network.
Heckerman, D.; Chickering, D. M.; Meek, C.; Rounthwaite, R. & Kadie, C. (2000), 'Dependency Networks for Inference, Collaborative Filtering, and Data Visualization', Journal of Machine Learning Research 1, 49-75.
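The conversion on this slide can be sketched in a few lines of Python (an illustrative sketch, not the authors' code; the function name and dict representation are mine): the DN parents of each node are its Markov blanket in the BN, i.e. its BN parents, its children, and its children's other parents (co-parents).

```python
def bn_to_dn_structure(bn_parents):
    """bn_parents: dict mapping each node to the set of its BN parents.
    Returns the DN structure: each node mapped to its Markov blanket."""
    dn_parents = {n: set(ps) for n, ps in bn_parents.items()}  # BN parents
    for child, parents in bn_parents.items():
        for p in parents:
            dn_parents[p].add(child)        # add children
            dn_parents[p] |= parents - {p}  # add co-parents
    return dn_parents

# BN from this slide: gender(B) -> gender(A), Friend(A,B) -> gender(A),
# gender(A) -> CoffeeDr(A)
bn = {
    "gender(B)": set(),
    "Friend(A,B)": set(),
    "gender(A)": {"gender(B)", "Friend(A,B)"},
    "CoffeeDr(A)": {"gender(A)"},
}
dn = bn_to_dn_structure(bn)
# dn["gender(A)"] now also contains CoffeeDr(A) (a dashed arrow),
# and dn["gender(B)"] contains gender(A) and Friend(A,B).
```

The symmetric dashed arrows correspond exactly to the children and co-parents added in the loop.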

5 From BN Parameters to DN Parameters
A log-linear model gives the probability of a target instance given its Markov blanket.
Example: predict the gender of Sam, given that 40% of Sam's friends are women and Sam is a coffee drinker.
P(target = value | Markov blanket) ∝ exp { Σ_(c ∈ {target instance} ∪ children) Σ_(parent values PV, child values CV) ln P(CV | PV) · frequency(CV, PV) }
Each BN parameter ln P(CV | PV) becomes a DN parameter, weighted by the relative frequency of its (child value, parent state) pair.
Equivalently: this shows how to perform classification with a relational Bayes net.

6 Example
Predict the gender of Sam, given that 40% of Sam's friends are women and Sam is a coffee drinker.
BN parameters:
P(g(A) = W | g(B) = W, F(A,B) = T) = 0.55
P(g(A) = M | g(B) = M, F(A,B) = T) = 0.63
P(cd(A) = T | g(A) = M) = 0.6
P(cd(A) = T | g(A) = W) = 0.8
Markov blanket of gender(sam): CoffeeDr(sam), Friend(sam,B), gender(B).

Child value   Parent state             CP    log(CP)  Rel. freq.  log(CP) × freq.
g(sam) = W    g(B) = W, F(sam,B) = T   0.55  -0.60    0.40        -0.24
g(sam) = W    g(B) = M, F(sam,B) = T   0.37  -0.99    0.60        -0.60
cd(sam) = T   g(sam) = W               0.80  -0.22    1.00        -0.22
cd(sam) = F   g(sam) = W               0.20  -1.61    0.00         0.00
Sum                                                               -1.06
exp(Sum) ∝ P(gender(sam) = W | MB)

Think of the parameters as learned from a training set; now we are testing. Exercise: what is the score of gender(sam) = M?
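The table above can be reproduced with a short script (a sketch; the helper name is mine). It scores both gender values with the slide-5 formula, using the complementary conditional probabilities for gender(sam) = M, and normalizes:

```python
import math

def score(cp_and_freq):
    """Sum of relative frequency * ln(CP) over the relevant
    (child value, parent state) pairs; zero-frequency pairs drop out."""
    return sum(f * math.log(cp) for cp, f in cp_and_freq if f > 0)

# gender(sam) = W: 40% of Sam's friends are women, Sam drinks coffee.
score_w = score([
    (0.55, 0.40),  # g(sam)=W | g(B)=W, F(sam,B)=T
    (0.37, 0.60),  # g(sam)=W | g(B)=M, F(sam,B)=T  (1 - 0.63)
    (0.80, 1.00),  # cd(sam)=T | g(sam)=W
    (0.20, 0.00),  # cd(sam)=F | g(sam)=W  (frequency 0)
])
# gender(sam) = M, using the complementary conditional probabilities:
score_m = score([
    (0.45, 0.40),  # g(sam)=M | g(B)=W, F(sam,B)=T  (1 - 0.55)
    (0.63, 0.60),  # g(sam)=M | g(B)=M, F(sam,B)=T
    (0.60, 1.00),  # cd(sam)=T | g(sam)=M
])
# score_w ≈ -1.06, matching the table's sum.
p_w = math.exp(score_w) / (math.exp(score_w) + math.exp(score_m))
```

This answers the exercise: score_m ≈ -1.11, so gender(sam) = W is slightly more probable.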

7 Evaluation Metrics
Running time.
Conditional log-likelihood (CLL): how confident we are in the predictions.
Area under the precision-recall curve (AUC-PR): appropriate for skewed distributions.
Results are averaged over 5-fold cross-validation, over all two-class predicates in each dataset.
Comparison methods: RDN-Boost, MLN-Boost.
Note: our method can handle multi-class predicates too; restricting to two-class (Boolean, not rating) predicates is a limitation of RDN-Boost.
Natarajan, S.; Khot, T.; Kersting, K.; Gutmann, B. & Shavlik, J. W. (2012), 'Gradient-based boosting for statistical relational learning: The relational dependency network case', Machine Learning 86(1).
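Both metrics are easy to compute from a model's predicted probabilities. A minimal self-contained sketch (function names are mine; libraries such as scikit-learn provide equivalent routines):

```python
import math

def conditional_log_likelihood(probs_true_class):
    """Mean log-probability the model assigns to the true class labels."""
    return sum(math.log(p) for p in probs_true_class) / len(probs_true_class)

def average_precision(labels, scores):
    """Area under the precision-recall curve, computed step-wise as
    AP = sum_n (R_n - R_{n-1}) * P_n over the ranked predictions."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    tp = fp = 0
    ap = prev_recall = 0.0
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

For a perfect ranking (every positive scored above every negative), average_precision returns 1.0; CLL is 0 only when the model assigns probability 1 to every true label.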

8 Accuracy Comparison
Arrows point in the direction of better performance. This comparison is for unary predicates only; for binary predicates, see the paper. In some cases the boosting methods' inference did not terminate.

9 Learning Time Comparison
Dataset          #Predicates  #Tuples    RDN-Boost    MLN-Boost    RDN-Bayes
UW               14           612        15±0.3       19±0.7       1±0.0
Mondial          18           870        27±0.9       42±1.0       102±6.9
Hepatitis        19           11,316     251±5.3      230±2.0      286±2.9
Mutagenesis      11           24,326     118±6.3      49±1.3
MovieLens(0.1M)  7            83,402     44±4.5 min   31±1.87 min
MovieLens(1M)                 1,010,051  >24 hours                 10±0.1
Units are seconds unless otherwise stated; standard deviations are shown. Blank cells: not available yet (not run yet).
MovieLens 1M takes only 1 sec more because counting in an RDBMS scales very well.
For both MLN-Boost and RDN-Boost, DNs are learned only for two-valued predicates.
IMDB is evaluated only on unary predicates, for all methods, and only on one fold; we report the average per node.
Hepatitis has a complex schema: learning a single model takes longer than learning independently.

10 RDN-Bayes uses more relevant predicates and more first-order variables
Our best predicate for each database:
Database     Target predicate  #Extra predicates  #Extra first-order variables  CLL-diff
Mondial      religion          11                 1                             0.58
IMDB         gender            6                  2                             0.30
UW-CSE       student           4                                                0.50
Hepatitis    sex                                                                0.20
Mutagenesis  ind1              5                                                0.56
MovieLens                                                                       0.26
This comparison was suggested by reviewers. (Note: the IMDB extra-variable count does not match the example slide.)

11 Structure Comparison Example: IMDB
[Schema diagram: User(UserID, Occupation, Age, gender); Rating(UserID, MovieID, Rating); Movie(MovieID, RunningTime); Cast(ActorID, MovieID); Actor(ActorID, AGender).]

Model      Target     Markov blanket
RDN-Boost  gender(U)  Occupation(U), Age(U)
RDN-Bayes  gender(U)  Age(U), Rating(U,M), RunningTime(M), CastMember(M,X), AGender(X)

RDN-Boost stays within the original target table. RDN-Bayes finds predictive features that are two links away.

12 Conclusions
Basic idea: convert Bayesian networks to relational dependency networks.
Fast BN learning ⇒ fast DN learning.
Dependency networks ⇒ inference with cyclic dependencies/autocorrelations.
New log-linear model for converting BN parameters to DN parameters, i.e., it defines the probability of a node given its Markov blanket from the Bayes net model.
Empirical evaluation: scales very well with the number of records; accuracy is competitive with functional gradient boosting.

13 There's More
Empirical comparisons: counts instead of frequencies; weight learning; more on MLN-Boost.
Theorems about dependency network consistency.

14 The End
Any questions?

