NooJ2008 Budapest 2008-06-08 Verb Valency Enhanced Croatian Lexicon Kristina Vučković, Nives Mikelić Preradović, Zdravko Dovedan


1 NooJ2008 Budapest 2008-06-08 Verb Valency Enhanced Croatian Lexicon
Kristina Vučković, Nives Mikelić Preradović, Zdravko Dovedan
kvuckovi@ffzg.hr, nmikelic@ffzg.hr, zdovedan@ffzg.hr
Faculty of Humanities and Social Sciences, University of Zagreb
Department of Information Sciences
Ivana Lucica 3, Zagreb, Croatia

2 The Plan
• Our agenda?
  • Increase the number of unambiguous NPs
• By means of?
  • Existing chunker
  • Verb valency tags
• Why?
  • To raise the chunker performance to a higher level
  • To make preparations for a Croatian parser

3 Overview
• Croatian verb valency lexicon
  • main characteristics
  • selected data
• .xml to .dic conversion
  • how we did it
• previous grammars for <VP> | <NP> | <PP> selection
• new enhanced grammars
  • <VP+DCobl>
  • <VP+PCobl>
  • <VP+PCtyp>
• results comparison
  • precision, recall, f-measure

4 Croatian verb valency lexicon - CROVALLEX
• Formal description of verb valency frames
• 1739 verbs
  • selected from the Croatian frequency dictionary (1999)
• 5118 valency frames (on average, 3 frames per verb)
• Each frame entry contains descriptions of
  • the valency frame
  • frame attributes
    • frame attributes are either obligatory or optional, i.e. obligatory or typical

5 Selected data
1. Reflexive particle 'se'
• if the verb is a derived reflexive (e.g. vratiti se)
• reflexiva tantum (e.g. smijati se)

6 Selected data
2. Pure (prepositionless) case
• 7 morphological cases in Croatian:
  • 0 - hidden nominative
  • 1 - nominative
  • 2 - genitive
  • 3 - dative
  • 4 - accusative
  • 5 - vocative
  • 6 - locative
  • 7 - instrumental

7 Selected data
3. Prepositional case
• The lemma of the preposition and the number of the required morphological case are specified, e.g.
  • od+2
  • na+4
  • o+6
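The prepositional-case notation can be read mechanically. The following is a minimal sketch, not the authors' code: it assumes a specification always has the shape preposition+digit, and the names `CASES` and `parse_prepositional_case` are hypothetical. The case numbering is the one listed on the pure-case slide.

```python
# Case numbering as listed on the "pure case" slide.
CASES = {
    0: "hidden nominative",
    1: "nominative",
    2: "genitive",
    3: "dative",
    4: "accusative",
    5: "vocative",
    6: "locative",
    7: "instrumental",
}


def parse_prepositional_case(spec):
    """Split a spec such as 'od+2' into (preposition lemma, case name).

    Assumption: exactly one '+' separates the preposition from the case number.
    """
    preposition, _, code = spec.partition("+")
    return preposition, CASES[int(code)]


for spec in ("od+2", "na+4", "o+6"):
    print(spec, "->", parse_prepositional_case(spec))
```

Under these assumptions, od+2 reads as the preposition od with the genitive, na+4 as na with the accusative, and o+6 as o with the locative.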

8 CROVALLEX 2.0008 - *.xml
pjevati,aspect=inf+DC_obl=0+AL_typ+PC_obl=6+…
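An entry of the kind shown on this slide can be split into a lemma and its features. This is only a sketch of how such a line might be read, assuming that the lemma ends at the first comma, that features are separated by `+`, and that a feature without `=` is a plain flag. The function name `parse_entry` is hypothetical, and the elided tail (`…`) is skipped rather than guessed at.

```python
def parse_entry(line):
    """Split 'lemma,feat=val+flag+...' into (lemma, feature dict).

    Assumptions: the lemma ends at the first comma; '+' separates features;
    a feature without '=' is stored as a boolean flag.
    """
    lemma, _, rest = line.partition(",")
    features = {}
    for token in rest.split("+"):
        token = token.strip()
        if not token or token == "…":  # skip the elided remainder of the entry
            continue
        key, sep, value = token.partition("=")
        features[key] = value if sep else True
    return lemma, features


lemma, features = parse_entry("pjevati,aspect=inf+DC_obl=0+AL_typ+PC_obl=6+…")
print(lemma, features)
```

Splitting on `+` would not work for a value that itself contains a `+` (for example a prepositional case written as `o+6`), so a real converter would need a stricter grammar than this sketch.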

9 Converting to *.dic

10 Croatian lexicon
• Nouns - 19 600
• Adjectives - 13 899
• Verbs - 1133
• Adverbs - 535
• Proper Nouns - 67 704
• S + C + Q + I + PRO - 363

11 Previous grammars

12 Perfect

13 II. Future

15 New Grammars

16 Verb + Obligatory DC

17 Verb + obligatory PC

18 Verb + typical PC

19 VP+DCobl=

20 VP+DCobl=Genitiv

21 VP+DCobl=Dativ

22 + agreement

23 Results
                      | By hand | Before CROVALLEX | After CROVALLEX
# of NP               | 1150    | 1099             | 1070
# of T unambiguous NP |         | 601              | 729
# of ambiguous NP     |         | 437              | 246+49
# of F unambiguous NP |         |                  | 26+20

24 P-R-F for unambiguous NPs
          | Before CROVALLEX | After CROVALLEX
Precision | 33.31            | 68.13
Recall    | 52.26            | 63.39
F-measure | 40.69            | 65.68
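The "After CROVALLEX" figures can be recomputed from the counts on the Results slide. The sketch below rests on one assumption that the slides do not state: precision is the number of true unambiguous NPs divided by the number of NPs the chunker found, and recall is the same count divided by the number of NPs marked by hand. The function name `prf` and its parameter names are hypothetical.

```python
def prf(true_positives, found, gold):
    """Return (precision, recall, F-measure) as fractions.

    Assumed definitions: precision = true_positives / found,
    recall = true_positives / gold, F = harmonic mean of the two.
    """
    precision = true_positives / found
    recall = true_positives / gold
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# "After CROVALLEX": 729 true unambiguous NPs, 1070 NPs found, 1150 NPs by hand.
p, r, f = prf(true_positives=729, found=1070, gold=1150)
print(f"P={100 * p:.2f} R={100 * r:.2f} F={100 * f:.2f}")
```

With these inputs the script prints P=68.13 R=63.39 F=65.68, which matches the "After CROVALLEX" column.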

25 Future work
• Subordinating conjunction.
• Infinitive construction can appear
  • with a preposition (e.g. 'nego+inf')
  • with the morphological case (e.g. 'inf+4').
• Construction with adjectives.
  • e.g. adj-7 ('Osjećam se osvježenim' - 'I feel fresh').
• Construction with adverbs.
  • e.g. adv-hrabro ('Osjećam se hrabro' - 'I feel brave').
• Construction with nominative predicate.
  • e.g. nom_pred ('Historija je postala legendom' - 'History has become a legend').

