1
Survey on Semi-Supervised CRFs Yusuke Miyao Department of Computer Science The University of Tokyo
2
Contents 1.Conditional Random Fields (CRFs) 2.Semi-Supervised Log-Linear Model 3.Semi-Supervised CRFs 4.Dynamic Programming for Semi-Supervised CRFs
3
1. Conditional Random Fields (CRFs)
Log-linear model: for sentence x = \langle x_1, \ldots, x_T \rangle and label sequence y = \langle y_1, \ldots, y_T \rangle,
p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_k \lambda_k f_k(x, y) \Big)
– λ_k : parameter
– f_k : feature function
– Z(x) = \sum_{y'} \exp\big( \sum_k \lambda_k f_k(x, y') \big) : partition function
4
Parameter Estimation (1/2)
Estimate parameters λ_k given labeled training data D = \{ (x_i, y_i) \}
Objective function: log-likelihood (+ regularizer)
L(\lambda) = \sum_i \log p(y_i \mid x_i) - \sum_k \frac{\lambda_k^2}{2\sigma^2}
5
Parameter Estimation (2/2)
Gradient-based optimization is applied (CG, quasi-Newton, etc.)
\frac{\partial L}{\partial \lambda_k} = \sum_i \Big( f_k(x_i, y_i) - \sum_y p(y \mid x_i) f_k(x_i, y) \Big) - \frac{\lambda_k}{\sigma^2}
The second term inside the sum is the model expectation of f_k.
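The gradient formula — empirical feature value minus model expectation — can be sanity-checked on a toy log-linear model. Everything below (the four labels, the feature values, the weights) is made up for illustration; the analytic gradient is compared against a finite-difference estimate of the log-likelihood:

```python
import math

# Toy log-linear model p(y) ∝ exp(sum_k lam[k] * f[y][k]) over 4 labels,
# 3 features. Feature values f[y][k] and weights lam are invented.
f = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [2.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
lam = [0.3, -0.2, 0.1]
gold = 2  # the observed (labeled) outcome

def probs(lam):
    s = [math.exp(sum(l * fk for l, fk in zip(lam, f[y]))) for y in range(4)]
    Z = sum(s)  # partition function
    return [v / Z for v in s]

def loglik(lam):
    return math.log(probs(lam)[gold])

# Analytic gradient: empirical feature value minus model expectation.
p = probs(lam)
grad = [f[gold][k] - sum(p[y] * f[y][k] for y in range(4)) for k in range(3)]

# Finite-difference check of each component.
eps = 1e-6
for k in range(3):
    lp = lam[:]; lp[k] += eps
    lm = lam[:]; lm[k] -= eps
    num = (loglik(lp) - loglik(lm)) / (2 * eps)
    assert abs(num - grad[k]) < 1e-6
```

(The regularizer term is omitted here; it only subtracts λ_k/σ² from each component.)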
6
Dynamic Programming for CRFs
Computation of model expectations requires summation over y → exponential
Dynamic programming allows for efficient computation of model expectations
[Figure: example sentence x = "His friend runs the company" with a candidate label sequence y of part-of-speech tags]
7
Dynamic Programming for CRFs
Assumption (0-th order): f_k(x, y) = \sum_t f_k(y_t, x, t)
[Figure: lattice over x = "His friend runs the company" with candidate tags {Noun, Det, Verb, Adj} for y_t at each position t]
8
Forward/Backward Probability
α_t(y): total weight of all partial label sequences y_1, \ldots, y_t with y_t = y
β_t(y): total weight of all partial label sequences y_{t+1}, \ldots, y_T given y_t = y
Z = \sum_y \alpha_T(y), and the marginal p(y_t = y \mid x) = \alpha_t(y) \beta_t(y) / Z
Computed by dynamic programming
[Figure: lattice over "His friend runs the company" with tags {Noun, Det, Verb, Adj}]
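A minimal sketch of the forward/backward recursions on a toy linear chain (the potentials below are arbitrary made-up positive numbers, not features of a real model): α and β are filled in by dynamic programming, and the resulting partition function and marginals agree with brute-force enumeration over all label sequences:

```python
import itertools
import math

# Toy chain: T positions, S states, invented positive potentials.
# score(y) = emit[0][y_0] * prod_{t>=1} trans[y_{t-1}][y_t] * emit[t][y_t]
T, S = 4, 3
emit = [[1.0 + 0.1 * (t + 1) * (s + 1) for s in range(S)] for t in range(T)]
trans = [[1.0 + 0.2 * abs(a - b) for b in range(S)] for a in range(S)]

def score(y):
    v = emit[0][y[0]]
    for t in range(1, T):
        v *= trans[y[t - 1]][y[t]] * emit[t][y[t]]
    return v

# Forward: alpha[t][s] = total weight of prefixes y_1..y_t with y_t = s.
alpha = [[0.0] * S for _ in range(T)]
alpha[0] = emit[0][:]
for t in range(1, T):
    for s in range(S):
        alpha[t][s] = sum(alpha[t - 1][r] * trans[r][s] for r in range(S)) * emit[t][s]

# Backward: beta[t][s] = total weight of suffixes y_{t+1}..y_T given y_t = s.
beta = [[1.0] * S for _ in range(T)]
for t in range(T - 2, -1, -1):
    for s in range(S):
        beta[t][s] = sum(trans[s][r] * emit[t + 1][r] * beta[t + 1][r] for r in range(S))

Z = sum(alpha[T - 1])
marg = [[alpha[t][s] * beta[t][s] / Z for s in range(S)] for t in range(T)]

# Brute-force check over all S**T label sequences.
seqs = list(itertools.product(range(S), repeat=T))
Z_bf = sum(score(y) for y in seqs)
assert math.isclose(Z, Z_bf)
for t in range(T):
    for s in range(S):
        m = sum(score(y) for y in seqs if y[t] == s) / Z_bf
        assert math.isclose(marg[t][s], m)
```

The DP costs O(T·S²) instead of the O(S^T) of enumeration, which is what makes the model expectations tractable.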
9
2. Semi-Supervised Log-Linear Model
Grandvalet and Bengio (2004)
Given labeled data D_L = \{ (x_i, y_i) \} and unlabeled data D_U = \{ z_i \}
Objective function: log-likelihood + negative entropy regularizer
L(\lambda) = \sum_i \log p(y_i \mid x_i) + \gamma \sum_i \sum_y p(y \mid z_i) \log p(y \mid z_i)
10
Negative Entropy Regularizer
Maximizing the regularizer → minimizing class overlap = target classes are well separated
11
Gradient of Entropy (1/2)
For H(\lambda) = -\sum_y p(y \mid z) \log p(y \mid z), use
\frac{\partial p(y \mid z)}{\partial \lambda_k} = p(y \mid z) \Big( f_k(z, y) - \sum_{y'} p(y' \mid z) f_k(z, y') \Big)
12
Gradient of Entropy (2/2)
\frac{\partial H}{\partial \lambda_k} = -\sum_y p(y \mid z) \Big( f_k(z, y) - \sum_{y'} p(y' \mid z) f_k(z, y') \Big) \log p(y \mid z)
i.e. the negative covariance of f_k and \log p(y \mid z) under the model.
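This entropy-gradient formula can also be checked numerically on a toy log-linear model (the feature values and weights below are invented for illustration), again against a finite-difference estimate:

```python
import math

# Toy log-linear model over 3 labels, 2 features; f[y][k] and lam invented.
f = [[1.0, 0.0], [0.0, 2.0], [1.5, 1.0]]
lam = [0.4, -0.3]

def probs(lam):
    s = [math.exp(sum(l * fk for l, fk in zip(lam, f[y]))) for y in range(3)]
    Z = sum(s)
    return [v / Z for v in s]

def entropy(lam):
    p = probs(lam)
    return -sum(v * math.log(v) for v in p)

# Analytic gradient: dH/dlam_k = - sum_y p(y) (f_k(y) - E[f_k]) log p(y)
p = probs(lam)
Ef = [sum(p[y] * f[y][k] for y in range(3)) for k in range(2)]
grad = [-sum(p[y] * (f[y][k] - Ef[k]) * math.log(p[y]) for y in range(3))
        for k in range(2)]

# Finite-difference check of each component.
eps = 1e-6
for k in range(2):
    lp = lam[:]; lp[k] += eps
    lm = lam[:]; lm[k] -= eps
    num = (entropy(lp) - entropy(lm)) / (2 * eps)
    assert abs(num - grad[k]) < 1e-6
```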
13
3. Semi-Supervised CRFs
Jiao et al. (2006)
Given labeled data D_L = \{ (x_i, y_i) \} and unlabeled data D_U = \{ z_i \}
Objective function: log-likelihood + negative entropy regularizer, with p(y \mid x) a CRF over label sequences
14
Application to NER
Gene and protein identification
A (labeled): 5448 words; B (unlabeled): 5210 words; C: 10208 words; D: 25145 words
Self-training did not yield any improvement

γ      A & B            A & C            A & D
       P    R    F      P    R    F      P    R    F
0      0.80 0.36 0.50   0.77 0.29 0.43   0.74 0.30 0.43
0.1    0.82 0.40 0.54   0.79 0.32 0.46   0.74 0.31 0.44
0.5    0.82 0.40 0.54   0.79 0.33 0.46   0.74 0.31 0.44
1      0.82 0.40 0.54   0.77 0.34 0.47   0.73 0.33 0.45
5      0.84 0.45 0.59   0.78 0.38 0.51   0.72 0.36 0.48
10     0.78 0.46 0.58   0.66 0.38 0.48   0.66 0.38 0.47
15
Results
16
4. Dynamic Programming for Semi-Supervised CRFs
Mann and McCallum (2007)
We have to compute \sum_{y_{-t}} p(y_{-t} \cdot y_t \mid x) \log p(y_{-t} \cdot y_t \mid x),
where y_{-t} = \langle y_1, \ldots, y_{t-1}, y_{t+1}, \ldots, y_T \rangle and y_{-t} \cdot y_t = \langle y_1, \ldots, y_{t-1}, y_t, y_{t+1}, \ldots, y_T \rangle
17
Example
Enumerate all y while fixing the t-th state to y_t
If we can compute this sum efficiently, we can compute the gradient
[Figure: lattice over "His friend runs the company" with tags {Noun, Det, Verb, Adj}, one position fixed]
18
Decomposition of Entropy
In the following, we use conditional probabilities p(\cdot \mid y_t) computed on the lattice, and decompose the constrained entropy into a forward part H^\alpha and a backward part H^\beta
19
Subsequence Constrained Entropy
Computed from forward/backward probabilities
Subsequence constrained entropy: H(y_{-t} \mid y_t) = -\sum_{y_{-t}} p(y_{-t} \mid y_t, x) \log p(y_{-t} \mid y_t, x)
[Figure: lattice over "His friend runs the company" with tags {Noun, Det, Verb, Adj}]
20
Forward/Backward Subsequence Constrained Entropy
H^\alpha(t, y_t): entropy of the prefix y_1, \ldots, y_{t-1} given y_t
H^\beta(t, y_t): entropy of the suffix y_{t+1}, \ldots, y_T given y_t
[Figure: lattice over "His friend runs the company" with tags {Noun, Det, Verb, Adj}]
21
Dynamic Computation of H^α
H^α can be computed incrementally:
H^\alpha(t, y_t) = \sum_{y_{t-1}} p(y_{t-1} \mid y_t) \Big( H^\alpha(t-1, y_{t-1}) - \log p(y_{t-1} \mid y_t) \Big)
where p(y_{t-1} \mid y_t) is computed from forward/backward probabilities
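A sketch of the incremental H^α computation on a toy chain (the potentials below are invented; H^β is omitted and the total entropy is instead assembled from H^α at the last position). The DP result is checked against the entropy obtained by enumerating all label sequences:

```python
import itertools
import math

# Toy chain with invented positive potentials.
T, S = 4, 3
emit = [[1.0 + 0.3 * (t + s) for s in range(S)] for t in range(T)]
trans = [[1.0 + 0.5 * ((a + b) % S) for b in range(S)] for a in range(S)]

def score(y):
    v = emit[0][y[0]]
    for t in range(1, T):
        v *= trans[y[t - 1]][y[t]] * emit[t][y[t]]
    return v

# Forward pass: alpha[t][s] = total weight of prefixes ending in state s.
alpha = [[0.0] * S for _ in range(T)]
alpha[0] = emit[0][:]
for t in range(1, T):
    for s in range(S):
        alpha[t][s] = sum(alpha[t - 1][r] * trans[r][s] for r in range(S)) * emit[t][s]

# H[t][s] = entropy of the prefix y_1..y_{t-1} given y_t = s,
# via H_alpha(t, s) = sum_r p(r|s) (H_alpha(t-1, r) - log p(r|s)).
H = [[0.0] * S for _ in range(T)]
for t in range(1, T):
    for s in range(S):
        for r in range(S):
            # conditional probability of the previous state r given current s
            pr = alpha[t - 1][r] * trans[r][s] * emit[t][s] / alpha[t][s]
            H[t][s] += pr * (H[t - 1][r] - math.log(pr))

# Total sequence entropy from the last-position posterior and H_alpha.
Z = sum(alpha[T - 1])
H_total = sum((alpha[T - 1][s] / Z) * (H[T - 1][s] - math.log(alpha[T - 1][s] / Z))
              for s in range(S))

# Brute-force check: H = -sum_y p(y) log p(y) over all S**T sequences.
H_bf = 0.0
for y in itertools.product(range(S), repeat=T):
    p = score(y) / Z
    H_bf -= p * math.log(p)
assert math.isclose(H_total, H_bf, rel_tol=1e-9)
```

As with forward/backward, the recursion costs O(T·S²), versus O(S^T) for direct enumeration of the entropy.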
22
References
Y. Grandvalet and Y. Bengio. 2004. Semi-supervised learning by entropy minimization. In NIPS 2004.
F. Jiao, S. Wang, C.-H. Lee, R. Greiner, and D. Schuurmans. 2006. Semi-supervised conditional random fields for improved sequence segmentation and labeling. In COLING/ACL 2006.
G. S. Mann and A. McCallum. 2007. Efficient computation of entropy gradient for semi-supervised conditional random fields. In NAACL-HLT 2007.
X. Zhu. 2005. Semi-supervised learning literature survey. Technical Report 1530, Computer Sciences, University of Wisconsin-Madison.