Extracting LTAGs from Treebanks Fei Xia 04/26/07.


1 Extracting LTAGs from Treebanks Fei Xia 04/26/07

2 Q1: How does grammar extraction work?

3 Two types of elementary tree in LTAG. Initial tree: (S NP (VP (V draft) NP)). Auxiliary tree: (VP (ADVP (ADV still)) VP*). → Arguments and adjuncts are in different types of elementary trees

4 Adjoining operation [figure: an auxiliary tree with root Y and foot node Y* is adjoined at a node labeled Y]
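
The adjoining operation can be illustrated with a minimal Python sketch. The `Node` class and the `find_foot`, `adjoin`, and `bracket` helpers are hypothetical names introduced here, not part of the talk; the two trees are the initial tree for "draft" and the auxiliary tree for "still" from the slides. Adjoining splices the auxiliary tree in at a node with the same label as its root, and the material that used to hang below that node moves under the foot node.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    foot: bool = False  # True only for the foot node (Y*) of an auxiliary tree


def find_foot(node: Node) -> Optional[Node]:
    """Return the foot node of the tree rooted at `node`, or None."""
    if node.foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(target: Node, aux_root: Node) -> None:
    """Adjoin a copy of the auxiliary tree at `target`, modifying `target` in place."""
    aux = copy.deepcopy(aux_root)  # leave the grammar's auxiliary tree untouched
    foot = find_foot(aux)
    if foot is None or not (aux.label == foot.label == target.label):
        raise ValueError("root, foot and adjunction site must share one label")
    # The subtree that was below the adjunction site moves under the foot node.
    foot.children, foot.foot = target.children, False
    # The adjunction site takes over the children of the auxiliary root.
    target.children = aux.children


def bracket(node: Node) -> str:
    """Render a tree in bracket notation; foot nodes are marked with '*'."""
    text = node.label + ("*" if node.foot else "")
    if not node.children:
        return text
    return "(" + text + " " + " ".join(bracket(c) for c in node.children) + ")"


# Initial tree anchored by "draft" and auxiliary tree anchored by "still".
vp = Node("VP", [Node("V", [Node("draft")]), Node("NP")])
initial = Node("S", [Node("NP"), vp])
auxiliary = Node("VP", [Node("ADVP", [Node("ADV", [Node("still")])]),
                        Node("VP", foot=True)])

adjoin(vp, auxiliary)
print(bracket(initial))  # (S NP (VP (ADVP (ADV still)) (VP (V draft) NP)))
```

The sketch copies the auxiliary tree before splicing it in, so the same grammar tree can be adjoined more than once.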

5 They still draft policies

6 The treebank tree

7 Step 1: Distinguish head/argument/adjunct

8 Step 2: Insert additional nodes [figure: the tree for "they still draft policies" before and after node insertion]. After insertion, the tree has an extra VP node: (S (NP (PRP they)) (VP (ADVP (RB still)) (VP (VBP draft) (NP (NNS policies)))))

9 Step 3: Build elementary trees (#1 to #4)

10 Extracted grammar. #1: (NP (PRP they)). #2: (VP (ADVP (RB still)) VP*). #3: (NP (NNS policies)). #4: (S NP (VP (VBP draft) NP))
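
Step 3 can be sketched in Python as follows. This is a simplified illustration, not the algorithm from the talk: the `T` class, the `role` field, and the `build` and `extract` functions are hypothetical, and the sketch assumes that Steps 1 and 2 have already been done, so every child carries a head/arg/adjunct role and every node with an adjunct has exactly one adjunct plus a head child with the same label.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class T:
    label: str
    children: List["T"] = field(default_factory=list)
    role: str = "head"  # role relative to the parent: "head", "arg" or "adjunct"


def build(node: T, grammar: List[str]) -> str:
    """Return the elementary-tree fragment for the head path starting at `node`.

    Trees found along the way (arguments, adjuncts) are appended to `grammar`.
    """
    if not node.children:  # a word: the lexical anchor
        return node.label
    adjuncts = [c for c in node.children if c.role == "adjunct"]
    if adjuncts:
        # Modification level: the node and its head child become the root
        # and the foot of an auxiliary tree anchored by the adjunct.
        (adj,) = adjuncts  # assumption: exactly one adjunct after Step 2
        head = next(c for c in node.children if c.role == "head")
        parts = [build(c, grammar) if c is adj else head.label + "*"
                 for c in node.children]
        grammar.append("(" + node.label + " " + " ".join(parts) + ")")
        # In the current tree the node and its head child are merged.
        return build(head, grammar)
    parts = []
    for c in node.children:
        if c.role == "arg":
            grammar.append(build(c, grammar))  # the argument heads its own initial tree
            parts.append(c.label)              # and leaves a substitution node behind
        else:
            parts.append(build(c, grammar))
    return "(" + node.label + " " + " ".join(parts) + ")"


def extract(root: T) -> List[str]:
    grammar: List[str] = []
    grammar.append(build(root, grammar))
    return grammar


# "they still draft policies" after Steps 1 and 2.
tree = T("S", [
    T("NP", [T("PRP", [T("they")])], role="arg"),
    T("VP", [
        T("ADVP", [T("RB", [T("still")])], role="adjunct"),
        T("VP", [
            T("VBP", [T("draft")]),
            T("NP", [T("NNS", [T("policies")])], role="arg"),
        ]),
    ]),
])

for tree_string in extract(tree):
    print(tree_string)
```

On this input the sketch yields the same four trees as the slide, in the same order: two initial trees for the NPs, one auxiliary tree for "still", and one initial tree for "draft".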

11 Q2: What info was missing in the source treebank?
Head/argument/adjunct distinction
–Use function tags and heuristics
Raising verbs (e.g., seem, appear) vs. other verbs
–He seems to be late
–He wants to be late
→ Need a list of raising verbs in that language
Features, feature equations (e.g., agreement), …
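
As a rough illustration of "function tags and heuristics", the sketch below classifies a non-head child from its treebank label. The tag sets and the fallback rule are assumptions chosen for this example (using Penn Treebank style function tags such as -SBJ and -TMP), not the tables used in the talk, and the raising-verb list contains only the two verbs named on the slide.

```python
# Hypothetical tables: which function tags signal arguments vs. adjuncts.
ARGUMENT_TAGS = {"SBJ", "PRD", "LGS", "DTV"}
ADJUNCT_TAGS = {"TMP", "LOC", "MNR", "PRP", "ADV"}
# Fallback heuristic (assumption): untagged adverbial phrases are adjuncts.
ADJUNCT_CATEGORIES = {"ADVP"}

# The treebank does not mark raising verbs, so a hand-made list is needed.
RAISING_VERBS = {"seem", "appear"}


def classify(label: str, is_head: bool) -> str:
    """Return "head", "arg" or "adjunct" for a child with the given label."""
    if is_head:
        return "head"
    category, *tags = label.split("-")
    if ARGUMENT_TAGS & set(tags):
        return "arg"
    if ADJUNCT_TAGS & set(tags):
        return "adjunct"
    # No informative function tag: fall back on the syntactic category.
    return "adjunct" if category in ADJUNCT_CATEGORIES else "arg"


def is_raising(lemma: str) -> bool:
    """Look up a verb lemma in the hand-made raising-verb list."""
    return lemma in RAISING_VERBS


print(classify("NP-SBJ", is_head=False))   # arg
print(classify("PP-TMP", is_head=False))   # adjunct
print(classify("ADVP", is_head=False))     # adjunct
print(is_raising("seem"), is_raising("want"))  # True False
```

A real extractor would also need a head-percolation table to decide which child is the head; that part is left out here.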

12 Q3: What methodological lessons can be drawn? The algorithm for extracting LTAGs from treebanks is straightforward. Some missing information can be “recovered” based on heuristics; other information cannot. → The extracted LTAGs are not as rich as the ones built by hand. Nevertheless, the grammars have been shown to be useful for parsing, SuperTagging, etc.

13 Q4: What are the advantages of a PS (phrase structure) or DS (dependency structure) treebank?
The original extraction algorithm assumes the input is a PS treebank, but it can easily be extended to handle a DS treebank:
–Extract tree segments from the DS
–Run a DS → PS algorithm on the segments to get elementary trees

14 Q5: Building a treebank for a formalism or building a general treebank?
I prefer the latter because
–A general treebank can be used for different formalisms.
–Different grammars under the same formalism can be extracted.
–Annotating a general treebank is often easier.

