Online Stacked Graphical Learning

Zhenzhen Kou+, Vitor R. Carvalho*, and William W. Cohen+
Machine Learning Department+ / Language Technologies Institute*, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

Abstract

Statistical relational learning has been widely studied as a way to predict class labels simultaneously for relational data, such as hyperlinked web pages, social networks, and data in a relational database. Existing collective classification methods are usually expensive, because of the iterative inference in graphical models and learning procedures based on iterative optimization. When the dataset is large, the cost of maintaining large graphs or related instances in memory becomes a problem as well. Stacked graphical learning has been proposed for collective classification with efficient inference; however, the memory and time cost of standard stacked graphical learning is still high, since it requires cross-validation-like predictions to be constructed during training. In this paper, we propose a new scheme that integrates recently developed single-pass online learning with stacked learning, to save training time and to handle large streaming datasets with minimal memory overhead. Experimentally, we show that online stacked graphical learning gives accurate and reliable results on eleven sample problems from three domains, at much lower time and memory cost. With competitive accuracy, high efficiency, and low memory cost, online stacked graphical learning is very promising for large-scale real-world applications. The online learning scheme also makes stacked graphical learning applicable to streaming data.

Introduction

Many real-world datasets are relational, with instances that are not independent of each other:
– web pages linked to each other
– data in a database
– papers with citations and co-authorships
– …

Statistical relational learning:
– traditional machine learning algorithms assume independence among records
– relational models analyze the dependence among instances
– relational Bayesian networks / relational Markov networks / relational dependency networks / Markov logic networks / …

Most existing models are expensive:
– iterative inference in graphical models
– an algorithm with efficient inference is important in applications

Experimental results (continued)

Sequential partitioning
– Task: sequential classification with long runs of identical labels
– Datasets: signature dataset; FAQ dataset; video segmentation
– Baselines: MaxEnt learner; MBW
– Stacked models: standard stacked models (MaxEnt, MBW); online stacked graphical models (MBW)
– Competitive model: conditional random fields
– Relational template: Exists, over the predictions of the ten adjacent examples

Named entity extraction
– Datasets: person name extraction in emails and protein name extraction in Medline abstracts
– Baselines: MaxEnt learner; MBW
– Stacked models: standard stacked models (MaxEnt, MBW); online stacked graphical models (MBW)
– Competitive model: conditional random fields
– Relational template: a second template, including adjacent words and repeated words

Online stacked graphical learning

Single-pass online algorithm: modified balanced Winnow (MBW)
– a single-pass online learning algorithm needs only a single training pass over the available data
– previous work showed that MBW can provide batch-level performance (a balanced-Winnow-style update is sketched below)
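For concreteness, the following is a minimal sketch of a balanced-Winnow-style single-pass learner in the spirit of MBW. The constants (promotion factor alpha, demotion factor beta, threshold theta, margin) and the weight initialization are illustrative assumptions rather than the exact MBW formulation, and feature values are assumed normalized to [0, 1).

```python
from collections import defaultdict

class BalancedWinnowSketch:
    """Sketch of a modified-balanced-Winnow-style single-pass learner.
    All constants and the initialization are illustrative assumptions."""

    def __init__(self, alpha=1.5, beta=0.5, theta=1.0, margin=1.0):
        self.alpha, self.beta = alpha, beta      # promotion / demotion factors
        self.theta, self.margin = theta, margin  # decision threshold and margin
        self.u = defaultdict(lambda: 2.0)        # "positive" weight vector
        self.v = defaultdict(lambda: 1.0)        # "negative" weight vector

    def score(self, x):
        # x: dict mapping feature -> value, values assumed in [0, 1)
        return sum((self.u[f] - self.v[f]) * val for f, val in x.items()) - self.theta

    def predict(self, x):
        return 1 if self.score(x) > 0 else -1

    def update(self, x, y):
        # y in {+1, -1}; multiplicative update only on margin mistakes
        if y * self.score(x) <= self.margin:
            for f, val in x.items():
                if y > 0:   # promote u, demote v
                    self.u[f] *= self.alpha * (1.0 + val)
                    self.v[f] *= self.beta * (1.0 - val)
                else:       # demote u, promote v
                    self.u[f] *= self.beta * (1.0 - val)
                    self.v[f] *= self.alpha * (1.0 + val)
```

Because the update is mistake-driven and stores only the two weight vectors, a single pass over the stream suffices, which is the property the online stacking scheme below relies on.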
– intermediate predictions for the training data are generated on the fly to learn the online model
– combining online learning with stacked learning may therefore help to save training time and memory

Efficiency

Learning efficiency
– Task: compare the training time
– Competitive models: standard stacked graphical models; competitive relational models

Inference efficiency
– demonstrated in SDM 07 and Kou's dissertation: ~80 times faster than Gibbs sampling

Summary

Accurate
– represents dependencies among relational data
– competitive with state-of-the-art relational models (relational Markov networks; relational dependency networks)

Efficient
– during inference: ~80 times faster than Gibbs sampling
– during learning: online stacked learning is over 10 times faster than competitive relational models

Standard stacked graphical models (SGMs)

– predict the class labels based on local features with a base learning method
– expand the feature vector with the resulting predictions and train a model on the expanded features (the expansion is sketched at the end of this transcript)
– standard stacked graphical learning (SDM 07) is effective and efficient in inference, but still expensive in learning: the predictions for the training data must be computed in a cross-validated way
– solution: online stacked graphical learning

Results. Collective classification (accuracy) on SLIF, WebKB, Cora, and CiteSeer; name extraction (F) on UT, Yapex, Genia, and CSpace; sequential partitioning (accuracy) on FAQ, Signature, and Video. The competitive model is a relational dependency network for collective classification and a conditional random field for the other two tasks.

Model                             SLIF  WebKB  Cora  CiteSeer  UT    Yapex  Genia  CSpace  FAQ   Signature  Video
Local model (MaxEnt)              81.5  57.6   63.1  54.9      69.1  62.1   66.5   74.2    67.3  96.3       80.9
Local model (MBW)                 82.3  58.0   62.8  55.8      67.9  62.3   66.9   75.1    64.9  96.5       78.4
Competitive model                 86.7  73.1   72.3  57.9      73.1  65.7   72.0   80.3    85.6  98.1       83.0
Standard stacked model (MaxEnt)   90.1  72.9   73.0  59.3      77.3  68.2   78.5   83.3    84.1  98.1       85.8
Standard stacked model (MBW)      92.1  73.8   72.9  60.0      76.7  68.9   78.9   83.4    86.3  98.3       85.5
Online stacked graphical model    92.3  73.5   70.7  -         76.6  69.1   82.1   87.1    -     -          85.7

[Figure: the stacked-model pipeline. Each instance x_i is first classified from its local features, giving a prediction y'_i; a relational template C(x_i, y') aggregates the predictions of related instances, and the stacked model is trained on the expanded vector (x_i, C(x_i, y')). The process iterates very few times.]

Efficiency analysis

[Figure: a stream of training examples x_1, ..., x_b, ..., x_2b, ..., x_n, where b is the burn-in data size.]

When there are effectively infinitely many training examples, i.e., kb << n:
– training is a single pass over the training set
– at level k, there are reliable predictions after (k+1)b examples have streamed by
– the learner needs to maintain only k classifiers and does not need to store examples
(a training-loop sketch appears at the end of this transcript)

Experimental results

Collective classification over relational data
– Datasets: document classification (WebKB, Cora, CiteSeer); text region detection in SLIF
– Baselines: MaxEnt learner; MBW
– Competitive model: relational dependency networks
– Stacked models: standard stacked models based on MaxEnt and MBW, respectively; online stacked graphical models based on MBW
– Relational template: Count

Training-time speed-up of online stacked graphical learning

Speed-up of online SGM             SLIF  WebKB  Cora  UT    Yapex  Genia  CSpace  FAQ   Signature  Video  Average
over standard SGM                  38.1  50.0   49.7  68.7  60.6   69.4   52.0    69.0  67.4       45.0   57.0
over competitive relational model  7.9   10.1   9.9   20.3  17.1   22.4   15.3    14.0  13.6       9.7    14.0

[Figure: a window of words w_0, ..., w_4 for protein name extraction, with the repeated token "ras" labeled as a gene name.]
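As a concrete illustration of the expansion step above, here is a sketch of the Count relational template used in the collective-classification experiments: for each instance, the count of each predicted class among its linked neighbors is appended to the local features. The function and argument names (expand_count_template, neighbors, classes) are hypothetical, not from the paper.

```python
def expand_count_template(features, preds, neighbors, classes):
    """features: per-instance feature dicts; preds: the current predicted
    label of each instance; neighbors: adjacency list linking related
    instances. Returns the expanded vectors (x_i, C(x_i, y'))."""
    expanded = []
    for i, feats in enumerate(features):
        feats = dict(feats)  # keep the local features x_i
        for c in classes:    # Count template: per-class counts over neighbors
            feats[f"count_pred={c}"] = sum(1 for j in neighbors[i] if preds[j] == c)
        expanded.append(feats)
    return expanded
```

The Exists template used for sequential partitioning is analogous, recording whether each label occurs among the predictions of the ten adjacent examples rather than counting over linked neighbors.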
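Finally, a sketch of the one-level (k = 1) online training loop on a sequential stream, mirroring the burn-in analysis above: the level-0 learner trains as usual, and once the burn-in of b examples has streamed by, its intermediate predictions on a small window of adjacent examples are used to build expanded features for the level-1 learner. The buffer handling, the default b, and the expand_window helper are illustrative assumptions; window=5 yields the ten adjacent examples used by the Exists template.

```python
def online_stacked_train(stream, make_learner, b=1000, window=5):
    """stream yields (x, y) pairs; make_learner() returns a single-pass
    online learner exposing update(x, y) and predict(x), e.g. the
    BalancedWinnowSketch above. Only 2*window + 1 examples are buffered,
    so memory stays constant regardless of the stream length."""
    f0, f1 = make_learner(), make_learner()   # k = 1: maintain two classifiers
    buf = []                                  # small sliding window, not the whole stream
    for t, (x, y) in enumerate(stream):
        f0.update(x, y)                       # level 0: plain single-pass training
        buf.append((x, y))
        if len(buf) > 2 * window + 1:
            buf.pop(0)
        # after the burn-in, f0's intermediate predictions are treated as
        # reliable and feed the level-1 learner via expanded features
        if t >= b and len(buf) == 2 * window + 1:
            preds = [f0.predict(ex) for ex, _ in buf]
            x_mid, y_mid = buf[window]        # train f1 on the window's center
            f1.update(expand_window(x_mid, preds), y_mid)
    return f0, f1

def expand_window(x, preds):
    """Hypothetical expansion: append the predictions of the adjacent
    examples to the local features as indicator values."""
    feats = dict(x)
    for offset, p in enumerate(preds):
        feats[f"pred@{offset}"] = 1.0 if p > 0 else 0.0
    return feats
```

At test time the same two classifiers are applied in sequence over each window, so inference stays a constant number of passes per example, consistent with the inference efficiency of stacked models reported above.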