
1 A Survey of Unsupervised Grammar Induction
Baskaran Sankaran
Senior Supervisor: Dr. Anoop Sarkar
School of Computing Science, Simon Fraser University

2 Motivation
Languages have hidden regularities. Example sentences (romanized Tamil; glosses approximate):
◦ karuppu naay puunaiyai thurathiyathu ("the black dog chased the cat")
◦ iruttil karuppu uruvam marainthathu ("in the dark, a black figure disappeared")
◦ naay thurathiya puunai vekamaaka ootiyathu ("the cat the dog chased ran away fast")


4 FORMAL STRUCTURES

5 Phrase-Structure
Sometimes the bribed became partners in the company

6 Phrase-Structure
◦ Binarize, CNF
◦ Sparsity issue with words
◦ Use POS tags
Binarized grammar for the example sentence (the @-labels are the intermediate symbols introduced by binarization):
S → ADVP @S
@S → NP VP
VP → VBD @VP
@VP → NP PP
NP → DT VBN
NP → DT NN
NP → NNS
PP → IN NP
ADVP → RB
(Figure: the corresponding binarized tree over the POS sequence RB DT VBN VBD NNS IN DT NN.)
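To make the binarization step concrete, here is a minimal sketch (my own illustration, not code from the survey) that left-factors an n-ary tree into at-most-binary nodes, introducing intermediate @-labels such as @S and @VP as on this slide. The nested-tuple tree encoding, with POS tags as string leaves, is an assumption of the sketch.

    # Left-factor an n-ary tree into binary form, introducing @X labels.
    # Trees are nested tuples: (label, child1, child2, ...); POS tags are string leaves.
    def binarize(tree):
        if isinstance(tree, str):                 # a POS-tag leaf
            return tree
        label, *children = tree
        children = [binarize(c) for c in children]
        if len(children) <= 2:
            return (label,) + tuple(children)
        # Keep the first child and fold the rest under an intermediate @label node.
        rest = ('@' + label.lstrip('@'),) + tuple(children[1:])
        return (label, children[0], binarize(rest))

    tree = ('S',
            ('ADVP', 'RB'),
            ('NP', 'DT', 'VBN'),
            ('VP', 'VBD', ('NP', 'NNS'), ('PP', 'IN', ('NP', 'DT', 'NN'))))
    print(binarize(tree))
    # ('S', ('ADVP', 'RB'), ('@S', ('NP', 'DT', 'VBN'),
    #   ('VP', 'VBD', ('@VP', ('NP', 'NNS'), ('PP', 'IN', ('NP', 'DT', 'NN'))))))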

7 Evaluation Metric-1
Unsupervised induction
◦ Binarized output tree, possibly unlabelled
Evaluation
◦ Against the gold treebank parse
◦ Recall: % of true constituents found
◦ Also precision and F-score
Wall Street Journal (WSJ) dataset
(Figure: an induced, unlabelled binary tree with X-labelled nodes over the POS sequence of the example sentence.)
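A rough sketch of this unlabelled bracketing metric (my own illustration; real evaluation scripts such as EVALB additionally ignore trivial spans and can compare labels): collect the spans of the induced and gold trees and compare the two sets.

    # Compare the spans of an induced binary tree against gold treebank spans.
    # Trees are nested tuples as in the sketch above; leaves are POS tags.
    def spans(tree, start=0):
        """Return (next_position, list_of_spans) for a nested-tuple tree."""
        if isinstance(tree, str):
            return start + 1, []
        pos, result = start, []
        for child in tree[1:]:
            pos, child_spans = spans(child, pos)
            result.extend(child_spans)
        result.append((start, pos))
        return pos, result

    def unlabelled_prf(induced_tree, gold_tree):
        _, induced = spans(induced_tree)
        _, gold = spans(gold_tree)
        matched = len(set(induced) & set(gold))
        if matched == 0:
            return 0.0, 0.0, 0.0
        precision = matched / len(set(induced))
        recall = matched / len(set(gold))
        return precision, recall, 2 * precision * recall / (precision + recall)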

8 Dependency Structure
(Figure: the dependency structure of the example sentence, drawn with starred head copies such as VBD*, NN* and IN* so that left and right arguments attach separately.)

9 Dependency Structure
(Figure: dependency arcs over the POS-tagged example sentence: Sometimes/RB the/DT bribed/VBN became/VBD partners/NNS in/IN the/DT company/NN.)

10 Evaluation Metric-2
Unsupervised induction
◦ Generates directed dependency arcs
Compute (directed) attachment accuracy
◦ Against gold dependencies
◦ WSJ10 dataset
(Figure: the gold dependency arcs over the POS-tagged example sentence.)
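The directed attachment metric is simpler; a hedged sketch follows (the gold head indices below are only illustrative, not taken from the treebank):

    # Fraction of words whose predicted head matches the gold head (0 = root).
    def attachment_accuracy(predicted_heads, gold_heads):
        assert len(predicted_heads) == len(gold_heads)
        correct = sum(p == g for p, g in zip(predicted_heads, gold_heads))
        return correct / len(gold_heads)

    # "Sometimes the bribed became partners in the company"; head indices are 1-based.
    gold = [4, 3, 4, 0, 4, 5, 8, 6]         # illustrative gold analysis
    pred = [4, 3, 4, 0, 4, 4, 8, 6]         # one wrong attachment for "in"
    print(attachment_accuracy(pred, gold))  # 0.875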

11 Unsupervised Grammar Induction
To learn the hidden structure of a language
◦ POS tag sequences as input
◦ Generates phrase structure / dependencies
◦ No attempt to find the meaning
Overview
◦ Phrase-structure and dependency grammars
◦ Mostly on English (a few on Chinese, German, etc.)
◦ Learning restricted to shorter sentences
◦ Still lags significantly behind supervised methods

12 PHRASE-STRUCTURE INDUCTION

13 Toy Example
Corpus:
◦ the dog bites a man
◦ dog sleeps
◦ a dog bites a bone
◦ the man sleeps
Grammar:
S → NP VP
NP → Det N
NP → N
VP → V NP
VP → V
Det → the | a
N → dog | man | bone
V → bites | sleeps
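For later reference, the toy corpus and grammar can be written down as plain Python data (my formatting, not the survey's). Rule probabilities start uniform per non-terminal, a simplification; slide 14 instead starts from the full candidate rule set.

    corpus = [
        "the dog bites a man".split(),
        "dog sleeps".split(),
        "a dog bites a bone".split(),
        "the man sleeps".split(),
    ]

    # grammar[lhs] is the list of right-hand sides for that non-terminal.
    grammar = {
        "S":   [("NP", "VP")],
        "NP":  [("Det", "N"), ("N",)],
        "VP":  [("V", "NP"), ("V",)],
        "Det": [("the",), ("a",)],
        "N":   [("dog",), ("man",), ("bone",)],
        "V":   [("bites",), ("sleeps",)],
    }
    probs = {lhs: {rhs: 1.0 / len(rhss) for rhs in rhss}
             for lhs, rhss in grammar.items()}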

14 EM for PCFG (Baker ’79; Lari and Young ’90)
Inside-Outside
◦ EM instance for probabilistic CFG; generalization of Forward-Backward for HMMs
◦ Non-terminals are fixed
◦ Estimate maximum-likelihood rule probabilities
Starting from the full candidate rule set (every word of the toy corpus paired with each of Det, N and V, plus S → NP VP, NP → Det N, NP → N, VP → V, VP → V NP, VP → NP V), EM drives most rule probabilities to zero. The non-zero probabilities shown:
S → NP VP 1.0
NP → Det N 0.875, NP → N 0.125
VP → V 0.5, VP → V NP 0.5
Det → the 0.428571, Det → a 0.571429
N → dog 0.5, N → man 0.375, N → bone 0.125
V → bites 0.5, V → sleeps 0.5

15 Inside-Outside
For the example sentence, the probability mass Inside-Outside assigns to using @S → NP VP over the span "the bribed became ... company" combines an outside probability with the rule probability and the inside probabilities of the children:
P(S ⇒ Sometimes @S) × P(@S → NP VP) × P(NP ⇒ the bribed) × P(VP ⇒ became … company)
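The inside probabilities used in this decomposition can be computed bottom-up over spans, CKY-style. Below is a compact sketch of just the inside pass (my own illustration, assuming a CNF grammar over POS tags as on slide 6); the outside pass and the expected-count bookkeeping of the full E-step are omitted.

    from collections import defaultdict

    def inside_chart(tags, binary, lexical):
        """binary[A] = {(B, C): P(A -> B C)}; lexical[A] = {tag: P(A -> tag)}."""
        n = len(tags)
        inside = defaultdict(float)              # (i, j, A) -> P(A =>* tags[i:j])
        for i, tag in enumerate(tags):
            for a, dist in lexical.items():
                if tag in dist:
                    inside[i, i + 1, a] = dist[tag]
        for width in range(2, n + 1):
            for i in range(n - width + 1):
                j = i + width
                for a, rules in binary.items():
                    total = 0.0
                    for (b, c), p in rules.items():
                        for k in range(i + 1, j):
                            total += p * inside[i, k, b] * inside[k, j, c]
                    if total:
                        inside[i, j, a] = total
        return inside                            # sentence probability: inside[0, n, 'S']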

16 Constraining Search (Pereira and Schabes ’92; Schabes et al. ’93)
Sometimes the bribed became partners in the company

17 Constraining Search (Pereira and Schabes ’92; Schabes et al. ’93; Hwa ’99)
Treebank bracketings
◦ Bracketing boundaries constrain induction
What happens with limited supervision?
◦ More bracketed data exposed iteratively
◦ 0% bracketed data: Recall 50.0
◦ 100% bracketed data: Recall 78.0
Right-branching baseline: Recall 76.0
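The bracketing constraint is easy to state as code: a candidate span may be used during the inside pass only if it does not cross any given bracket. A small sketch of that check (my paraphrase of the idea, not the authors' implementation), using 0-based, end-exclusive spans:

    def crosses(span, bracket):
        """True if the two intervals overlap without one containing the other."""
        (i, j), (a, b) = span, bracket
        return (i < a < j < b) or (a < i < b < j)

    def allowed(span, brackets):
        return not any(crosses(span, br) for br in brackets)

    # Partial bracketing [the bribed] [became partners [in the company]]
    # over the 8-word example sentence:
    brackets = [(1, 3), (3, 8), (5, 8)]
    print(allowed((1, 3), brackets))   # True:  coincides with a bracket
    print(allowed((2, 4), brackets))   # False: crosses [the bribed]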

18 Distributional Clustering (Adriaans et al. ’00; Clark ’00; van Zaanen ’00)
Cluster the word sequences
◦ Context: adjacent words or boundaries
◦ Relative frequency distribution of contexts
Example sentences:
◦ the black dog bites the man
◦ the man eats an apple
Identifies constituents
◦ Evaluation on ATIS corpus: Recall 35.6
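A toy sketch of the statistic being clustered (mine, not from any of the cited systems): every short word sequence is represented by the frequencies of its (left neighbour, right neighbour) contexts, with '#' standing for a sentence boundary.

    from collections import Counter, defaultdict

    def context_counts(sentences, max_len=3):
        contexts = defaultdict(Counter)
        for sent in sentences:
            padded = ["#"] + sent + ["#"]
            for i in range(1, len(padded) - 1):
                for j in range(i + 1, min(i + max_len, len(padded) - 1) + 1):
                    seq = tuple(padded[i:j])
                    contexts[seq][(padded[i - 1], padded[j])] += 1
        return contexts    # sequences with similar context distributions cluster together

    counts = context_counts([
        "the black dog bites the man".split(),
        "the man eats an apple".split(),
    ])
    print(counts[("the", "man")])   # Counter({('bites', '#'): 1, ('#', 'eats'): 1})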

19 Constituent-Context Model (Klein and Manning ’02)
Valid constituents in a tree should not cross
(Figure: two candidate unlabelled binary trees over the POS sequence RB DT VBN VBD NNS IN DT NN.)

20 Constituent-Context Model
Sometimes the bribed became partners in the company
◦ Example span: yield DT VBN ("the bribed") in the context RB _ VBD
Recall
◦ Right-branching: 70.0
◦ CCM: 81.6
(Figure: an unlabelled binary tree over the POS sequence of the example sentence.)
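Concretely, the CCM scores every span with two signals: its yield (the POS sequence inside) and its context (the tags immediately to its left and right). A small illustration of extracting those pairs (mine, not Klein and Manning's code):

    def yields_and_contexts(tags):
        padded = ["#"] + tags + ["#"]
        for i in range(len(tags)):
            for j in range(i + 1, len(tags) + 1):
                span_yield = tuple(tags[i:j])
                context = (padded[i], padded[j + 1])
                yield (i, j), span_yield, context

    tags = "RB DT VBN VBD NNS IN DT NN".split()
    for span, y, c in yields_and_contexts(tags):
        if span == (1, 3):                  # the span covering "the bribed"
            print(y, c)                     # ('DT', 'VBN') ('RB', 'VBD')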

21 DEPENDENCY INDUCTION

22 Dependency Model w/ Valence (Klein and Manning ’04)
Simple generative model
◦ Choose head: P(Root)
◦ End: P(End | h, dir, v)
- Attachment direction dir (right, left)
- Valence v (head outward)
◦ Argument: P(a | h, dir)
Directed accuracy
◦ CCM: 23.8
◦ DMV: 43.2
◦ Joint: 47.5
(Figure: the dependency tree of the example sentence, annotated with the Head, Argument and End decisions that generate it.)
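A hedged sketch of how this model scores one dependency tree (my reading of the DMV; the parameter tables p_root, p_stop and p_attach, and their indexing, are assumptions of the sketch): the root tag is drawn first, then each head generates its left and right argument sequences head-outward, deciding to stop or continue with a valence-conditioned probability.

    def dmv_tree_prob(tags, heads, p_root, p_stop, p_attach):
        """heads[i] is the index of word i's head, or None for the root word."""
        root = heads.index(None)
        prob = p_root[tags[root]]
        for h, h_tag in enumerate(tags):
            left  = [d for d in range(h - 1, -1, -1) if heads[d] == h]   # nearest first
            right = [d for d in range(h + 1, len(tags)) if heads[d] == h]
            for direction, deps in (("left", left), ("right", right)):
                has_child = False                    # valence: no argument taken yet
                for d in deps:                       # generate arguments head-outward
                    prob *= 1 - p_stop[h_tag, direction, has_child]   # decide to continue
                    prob *= p_attach[tags[d], h_tag, direction]       # choose the argument
                    has_child = True
                prob *= p_stop[h_tag, direction, has_child]           # finally stop
        return prob

In training, the same probabilities are re-estimated with EM, summing over all candidate trees with a dynamic program rather than scoring a single given tree.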

23 DMV Extensions (Headden et al. ’09; Blunsom and Cohn ’10)
Extended Valence Grammar (EVG)
◦ Valence frames for the head: allows different distributions over arguments
◦ Directed accuracy: 65.0
Lexicalization (L-EVG)
◦ Directed accuracy: 68.8
Tree Substitution Grammar
◦ Tree fragments instead of CFG rules
◦ Directed accuracy: 67.7

24 MULTILINGUAL SETTING

25 Bilingual Alignment & Parsing (Wu ’97)
Inversion Transduction Grammar (ITG)
◦ Allows reordering
(Figure: an ITG parse aligning source words e1 e2 e3 e4 with target words f1 f2 f3 f4; an inverted node swaps the order of its children, pairing e1-f3, e2-f4, e3-f1, e4-f2.)
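To see what "allows reordering" means, here is a toy enumeration (not from the survey) of the target-side orders a binary ITG can derive: adjacent source spans combine either in order (straight) or swapped (inverted), which for four words yields 22 of the 24 permutations and excludes the two "inside-out" cases.

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def itg_orders(i, j):
        """All orderings of source positions i..j-1 derivable by a binary ITG."""
        if j - i == 1:
            return {(i,)}
        results = set()
        for k in range(i + 1, j):
            for left in itg_orders(i, k):
                for right in itg_orders(k, j):
                    results.add(left + right)   # straight combination
                    results.add(right + left)   # inverted combination
        return results

    perms = itg_orders(0, 4)
    print(len(perms))                 # 22
    print((1, 3, 0, 2) in perms)      # False: the 2-4-1-3 pattern is not ITG-derivable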

26 Bilingual Parsing (Snyder et al. ’09)
Bilingual parsing
◦ PP-attachment ambiguity: I saw (the student (from MIT))
◦ Not ambiguous in Urdu:
میں ( یمآئٹی سے ) ( طالب علم ) کو دیکھا
Gloss: I ((MIT of) student) saw

27 Summary & Overview
Methods covered, grouped into parametric-search and structural-search approaches:
◦ EM for PCFG
◦ Constraining with bracketing
◦ Contrastive Estimation
◦ Distributional Clustering
◦ CCM
◦ DMV
◦ EVG & L-EVG
◦ TSG + DMV
◦ Data-oriented Parsing
◦ Prototype-driven induction
State of the art
◦ Phrase structure (CCM + DMV): Recall 88.0
◦ Dependency (Lexicalized EVG): Directed accuracy 68.8

28 QUESTIONS? Thanks!

29 Motivation
Languages have hidden regularities

30 Motivation
Languages have hidden regularities
◦ The guy in China
◦ … new leader in China
◦ That’s what I am asking you …
◦ I am telling you …

31 Issues with EM (Carroll and Charniak ’92; Pereira and Schabes ’92; de Marcken ’95; Liang and Klein ’08; Spitkovsky et al. ’10)
Phrase structure
◦ Finds local maxima instead of the global one
◦ Multiple ordered adjunctions
Both phrase structure & dependency
◦ Disconnect between likelihood and the optimal grammar

32 Constituent-Context Model (Klein and Manning ’02)
CCM
◦ Only constituent identity
◦ Valid constituents in a tree should not cross

33 Bootstrap phrases (Haghighi and Klein ’06)
Bootstrap with seed examples for constituent types
◦ Chosen from the most frequent treebank phrases
◦ Induces labels for constituents
◦ Recall: 59.6
Integrate with CCM
◦ CCM generates brackets (constituents)
◦ Proto labels them
◦ Recall: 68.4

34 Dependency Model w/ Valence (Klein and Manning ’04)
Simple generative model
◦ Choose head; attachment direction (right, left)
◦ Valence (head outward)
- End of generation modelled separately
Directed accuracy: 43.2
(Figure: the dependency tree over the POS-tagged example sentence.)

35 Learn from how not to speak
Contrastive Estimation (Smith and Eisner ’05)
◦ Log-linear model of dependency
- Features f(q, T): P(Root); P(a | h, dir); P(End | h, dir, v)
- Conditional likelihood

36 Learn from how not to speak (Smith and Eisner ’05)
Contrastive Estimation
◦ Ex. the brown cat vs. cat brown the
◦ Neighborhoods
- Transpose (Trans), delete & transpose (DelOrTrans)
Directed accuracy: 48.8
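A small sketch of two of these neighbourhood functions (my reading of Smith and Eisner ’05): Trans swaps one adjacent word pair, and DelOrTrans additionally allows deleting a single word; the model is trained to prefer the observed sentence over these implicit negative examples.

    def trans_neighborhood(words):
        """Sentences obtained by transposing one adjacent pair of words."""
        for i in range(len(words) - 1):
            yield words[:i] + [words[i + 1], words[i]] + words[i + 2:]

    def del_or_trans_neighborhood(words):
        """Trans plus all single-word deletions."""
        yield from trans_neighborhood(words)
        for i in range(len(words)):
            yield words[:i] + words[i + 1:]

    for n in trans_neighborhood("the brown cat".split()):
        print(" ".join(n))     # brown the cat / the cat brown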

37 DMV Extensions-1 (Cohen and Smith ’08, ’09)
Tying parameters
◦ Correlated Topic Model (CTM): correlation between different word types
◦ Two types of tying parameters
- Logistic Normal (LN)
- Shared LN
Directed accuracy: 61.3

38 DMV Extensions-2 (Blunsom and Cohn ’10)
(Figure: the dependency tree of the example sentence with lexicalized tree fragments, e.g. fragments anchored on "became" and "in", marked over it.)

39 DMV Extensions-2 (Blunsom and Cohn ’10)
Tree Substitution Grammar (TSG)
◦ Lexicalized trees
◦ Hierarchical prior
- Different levels of backoff
Directed accuracy: 67.7
(Figure: example lexicalized tree fragments anchored on "became" and "in".)

