Published by Mae Bennett. Modified over 9 years ago.
1
Multi-core Structural SVM Training
Kai-Wei Chang
Department of Computer Science, University of Illinois at Urbana-Champaign
Joint work with Vivek Srikumar and Dan Roth
2
Motivation
3
Inference with General Constraint Structure [Roth & Yih '04, '07]
Recognizing Entities and Relations
Example sentence: "Dole's wife, Elizabeth, is a native of N.C." with entity variables E1, E2, E3 and relation variables R12, R23.
[Figure: label score tables attached to the variables.
Entity label scores (other / per / loc): 0.05 / 0.85 / 0.10; 0.05 / 0.50 / 0.45; 0.10 / 0.60 / 0.30.
Relation label scores (irrelevant / spouse_of / born_in): 0.10 / 0.05 / 0.85; 0.05 / 0.45 / 0.50; 0.05 / 0.45 / 0.50.]
Improvement over no inference: 2-5%
4
Structured Learning and Inference
5
Outline
- Structural SVM: Inference and Learning
- DEMI-DCD for Structural SVM
- Related Work
- Experiments
- Conclusions
7
Structured Prediction: Inference
Annotations on the inference formula:
- Set of allowed structures, often specified by constraints
- Weight parameters (to be estimated during learning)
- Features on input-output
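The formula itself did not survive in this transcript. The three annotations match the usual view of structured inference as choosing, among the allowed structures, the one with the highest linear score (weights times joint features). Below is a minimal Python sketch of that idea by brute-force enumeration. The label set, the feature map `features`, and the constraint in `is_feasible` are all hypothetical and chosen only for illustration; real systems would use a dedicated solver (for example dynamic programming or an ILP) rather than enumeration.

```python
from itertools import product

LABELS = ("per", "loc", "other")  # hypothetical label set

def features(x, y):
    """Hypothetical joint feature map: counts of (token, label) pairs."""
    phi = {}
    for token, label in zip(x, y):
        phi[(token, label)] = phi.get((token, label), 0.0) + 1.0
    return phi

def score(w, x, y):
    """Linear score: weight vector dotted with the joint features."""
    return sum(w.get(k, 0.0) * v for k, v in features(x, y).items())

def is_feasible(y):
    """Hypothetical constraint: at most one token may be labeled 'loc'."""
    return y.count("loc") <= 1

def predict(w, x):
    """Return the highest-scoring structure among the feasible ones."""
    candidates = (y for y in product(LABELS, repeat=len(x)) if is_feasible(y))
    return max(candidates, key=lambda y: score(w, x, y))

w = {("Dole", "per"): 1.0, ("Elizabeth", "per"): 1.0,
     ("Elizabeth", "loc"): 1.5, ("N.C.", "loc"): 2.0}
x = ("Dole", "Elizabeth", "N.C.")
print(predict(w, x))
```

Without the constraint, "Elizabeth" would be labeled `loc` because that label has the higher local weight; the constraint forces the prediction to trade it off against "N.C.".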
8
Structural SVM
Annotations on the constraint:
- Score of gold structure
- Score of predicted structure
- Loss function
- Slack variable
The constraint holds for all samples and feasible structures.
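The optimization problem on this slide was not preserved. The annotations are consistent with the standard margin-rescaling form of the structural SVM, written out here as a reading aid. The symbols are assumptions: \phi is the joint feature map, \Delta the loss function, \xi_i the slack variable for sample i, \mathcal{Y}_i the feasible structures for x_i, and \ell is either \xi (L1 slack) or \xi^2 (L2 slack), since the transcript does not say which variant the talk uses.

```latex
\begin{aligned}
\min_{w,\ \xi}\quad & \tfrac{1}{2}\, w^{\top} w + C \sum_{i} \ell(\xi_i) \\
\text{s.t.}\quad
  & \underbrace{w^{\top}\phi(x_i, y_i)}_{\text{score of gold structure}}
    - \underbrace{w^{\top}\phi(x_i, y)}_{\text{score of predicted structure}}
    \;\ge\;
    \underbrace{\Delta(y_i, y)}_{\text{loss function}}
    - \underbrace{\xi_i}_{\text{slack variable}},
  \qquad \forall i,\ \forall y \in \mathcal{Y}_i .
\end{aligned}
```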
9
Dual Problem of Structural SVM
10
Active Set
12
Overview of DEMI-DCD
Two components: Learning and Active Set Selection
13
Learning Thread
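The body of this slide is missing from the transcript. As a rough illustration of what a dual coordinate descent (DCD) learner does with one element of the active set, here is a Python sketch of a single coordinate update. It assumes the L2-slack structural SVM, whose dual objective is 1/2 ||w||^2 + (1/(4C)) * sum_i (sum_y alpha[i][y])^2 - sum Delta * alpha with w = sum alpha[i][y] * phi_diff; that choice, and the names `dcd_step`, `alpha_i`, and `phi_diff`, are assumptions rather than the talk's own notation.

```python
def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def dcd_step(w, alpha_i, y, phi_diff, loss, C):
    """One coordinate update on the dual variable alpha_i[y].

    w        : current weight vector (list of floats)
    alpha_i  : dict of dual variables for sample i (modified in place)
    phi_diff : phi(x_i, y_i) - phi(x_i, y), as a list of floats
    loss     : Delta(y_i, y)
    Assumes the L2-slack dual; returns the updated weight vector.
    """
    old = alpha_i.get(y, 0.0)
    # Partial derivative of the dual objective with respect to alpha_i[y].
    grad = dot(w, phi_diff) + sum(alpha_i.values()) / (2.0 * C) - loss
    # Second derivative along this coordinate (constant for a quadratic).
    curvature = dot(phi_diff, phi_diff) + 1.0 / (2.0 * C)
    # Exact one-variable minimizer, projected onto alpha >= 0.
    new = max(0.0, old - grad / curvature)
    alpha_i[y] = new
    # Keep w consistent with w = sum alpha * phi_diff.
    return [wj + (new - old) * pj for wj, pj in zip(w, phi_diff)]

alpha_i = {}
w = dcd_step([0.0, 0.0], alpha_i, "y1", [1.0, 0.0], loss=1.0, C=0.5)
```

Because each update touches only `w` and one dual variable, a learner can sweep over the active set repeatedly without running inference, which is the kind of decoupling the conclusion slide describes.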
14
Synchronization
16
A Parallel Dual Coordinate Descent Algorithm
Master: send the current w to the slaves.
Slave: solve loss-augmented inference and update A.
Master: update w based on A.
17
Structured Perceptron and its Parallel Version
19
Experiment Settings
Tasks:
- POS tagging (POS-WSJ): assign a POS label to each word in a sentence. We use the standard Penn Treebank Wall Street Journal corpus with 39,832 sentences.
- Entity and Relation Recognition (Entity-Relation): assign entity types to mentions and identify relations among them. 5,925 training samples. Inference is solved by an ILP solver.
We compare the following methods:
- DEMI-DCD: the proposed method.
- MS-DCD: a master-slave style parallel implementation of DCD.
- SP-IPM: parallel structured Perceptron.
20
Convergence on Primal Function Value
[Figure: relative primal function value difference (log scale) along training time. Panels: POS-WSJ, Entity-Relation.]
21
Test Performance
[Figure: test performance along training time on POS-WSJ.]
SP-IPM converges to a different model.
22
Test Performance
[Figure: test performance along training time on the Entity-Relation task. Panels: Entity F1, Relation F1.]
23
Moving Average of CPU Usage
[Figure: moving average of CPU usage. Panels: POS-WSJ, Entity-Relation.]
DEMI-DCD fully utilizes CPU power.
CPU usage drops because of the synchronization.
24
Different Number of Threads
[Figure: relative primal function value along training time. Panels: POS-WSJ, Entity-Relation.]
26
Conclusion
We proposed DEMI-DCD for training structural SVMs on a multi-core machine. The proposed method decouples the model update and inference phases of learning. As a result, it can fully utilize all available processors to speed up learning.
Software will be available at: http://cogcomp.cs.illinois.edu/page/software
Thank you.