Presentation is loading. Please wait.

Presentation is loading. Please wait.

Example-Based Treebank Querying Liesbeth Augustinus Vincent Vandeghinste Frank Van Eynde CLARIN Sofia, 2012-10-28.

Similar presentations


Presentation on theme: "Example-Based Treebank Querying Liesbeth Augustinus Vincent Vandeghinste Frank Van Eynde CLARIN Sofia, 2012-10-28."— Presentation transcript:

1 Example-Based Treebank Querying Liesbeth Augustinus Vincent Vandeghinste Frank Van Eynde CLARIN Sofia, 2012-10-28

2 NEDERBOOMS Exploitation of Dutch treebanks for research in linguistics CLARIN-VL Project Centre for Computational Linguistics (CCL) Dutch Grammar and Language Use (NGTG) Goals: –User-friendly tools –Access to large data files

3 NEDERBOOMS How can we combine the data-oriented approach of treebank mining with the knowledge-oriented method of theoretical and descriptive linguistics?

4 QUERYING LASSY Existing search tools: dtsearch (Kloosterman 2007) Dact (de Kok 2010) stand-alone tools query language: XPath = standard query language for xml trees

5 QUERYING LASSY Some examples: “look for all NP nodes in which the head noun is modified by the adjective ‘politiek’, e.g. politieke discussies” //node[@cat="np" and node[@rel="mod" and @pos="adj" and @root="politiek"] and node[@rel="hd" and @pos="noun"]

6 QUERYING LASSY Some examples: “look for all NP nodes in which the head noun is modified by the adjective ‘politiek’, e.g. politieke discussies” //node[@cat="np" and node[@rel="mod" and @pos="adj" and @root="politiek"] and node[@rel="hd" and @pos="noun"] “look for verb clusters in which a separable verb particle occurs between two verb forms, e.g. “Hij zegt dat ze spoedig zullen kennis maken” //node[node[@rel="hd" and @pos="verb" and @begin <../node[@rel="vc"]/node[@rel="svp" and @pos="part"]/@begin] and node[@rel="vc" and node[@rel="svp" and @begin <../node[@rel="hd" and @pos="verb"]/@begin]]]

7 QUERYING LASSY XPath –Not user-friendly –Knowledge of Alpino grammar necessary = problematic for non-technical linguists Verify theory through data with corpus or treebank examples Time consuming, requires some effort

8 QUERYING LASSY XPath –Not user-friendly –Knowledge of Alpino grammar necessary = problematic for non-technical linguists Verify theory through data with corpus or treebank examples Time consuming, requires some effort How to make interaction between computational linguistics and theoretical linguistics possible?

9 GrETEL Greedy Extraction of Trees for Empirical Linguistics Search tool based on example sentences Input = natural language No explicit knowledge of formal query language nor Alpino grammar required Bridge gap between descriptive and computational linguistics Available online http://nederbooms.ccl.kuleuven.be (optimised for Mozilla Firefox)http://nederbooms.ccl.kuleuven.be

10 GrETEL - online

11

12 GrETEL - input Green versus red word order in Dutch –green: past participle – auxiliary De NAVO stelt dat ze er alles aan gedaan heeft –red: auxiliary – past participle De NAVO stelt dat ze er alles aan heeft gedaan “The NATO claim that they have done everything in their power” (deredactie.be)

13 GrETEL - input

14 >> parsed with Alpino

15 GrETEL - annotation

16 >> info added to Alpino parse

17 GrETEL – query info

18 input example

19 GrETEL – query info input example Alpino parse

20 GrETEL – query info input example query treeAlpino parse

21 GrETEL – query info input example query treeAlpino parse XPath query

22 GrETEL – query info input example query treeAlpino parse XPath query treebanks

23 GrETEL – query tree

24 >> subtree extraction

25 GrETEL – query tree query tree

26 GrETEL – query tree query tree>> XPath generator >>

27 GrETEL – query tree query tree //node[@cat="ssub" and node[@rel="vc" and @cat="ppart" and node[@rel="hd" and @pos="verb"]] and node[@rel="hd" and @pos="verb" and @root="heb"]] XPath expression>> XPath generator >>

28 GrETEL – treebanks

29 LASSY Small in PostgreSQL database >> no local installation required

30 GrETEL – results

31 >> (adapted) query

32 GrETEL – results >> (adapted) query >> quantitative information

33 GrETEL – results

34 >> tree viewer

35 GrETEL – results >> tree viewer >> list of results

36 Thanks for your attention! Questions? liesbeth@ccl.kuleuven.be http://nederbooms.ccl.kuleuven.be


Download ppt "Example-Based Treebank Querying Liesbeth Augustinus Vincent Vandeghinste Frank Van Eynde CLARIN Sofia, 2012-10-28."

Similar presentations


Ads by Google