A dependency parser for Spanish. David Herrero Marco Language Processing and Computational Linguistics EDA171.


1 A dependency parser for Spanish. David Herrero Marco Language Processing and Computational Linguistics EDA171

2 Contents

3 Syntax dependency I

4 Syntax dependency II

5 Syntax dependency III

6 Nivre’s parser

7

8 Example in Spanish (figure: a sentence with word positions numbered 1 to 8)

9 Corpus The CliC-Talp corpus has about 1 million words: 500,000 words from a newspaper and 500,000 from LexEsp, which contains samples from 329 novels and various articles.

10 Corpus
– Spanish narrative: 40%
– Newspapers: 25%
– Sports newspapers: 5%
– Scientific articles: 10%
– Essays: 10%
– Weekly magazines: 10%

11 Experiments
2 models:
– The top of the stack and the first word of the input list
– The top two words of the stack and the first two words of the input list
2 columns:
– Part of speech
– Detailed part of speech
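The two feature models above can be sketched as a small feature-extraction function. This is only an illustration under stated assumptions, not the code used in the experiments: the function name `extract_features`, the padding value `PAD`, and the tag names are hypothetical, and the parser state is reduced to lists of part-of-speech tags.

```python
from typing import List, Tuple

PAD = "NONE"  # hypothetical placeholder for an empty stack or input position


def extract_features(stack: List[str], input_list: List[str], window: int) -> Tuple[str, ...]:
    """Return the tags of the top `window` stack items, then the first `window` input items."""
    # The top of the stack is the last element of the list.
    stack_tags = [stack[-(i + 1)] if i < len(stack) else PAD for i in range(window)]
    input_tags = [input_list[i] if i < len(input_list) else PAD for i in range(window)]
    return tuple(stack_tags + input_tags)


# A parser state described only by part-of-speech tags (hypothetical tag names).
stack = ["DET", "NOUN"]               # "NOUN" is on top of the stack
input_list = ["VERB", "DET", "NOUN"]  # "VERB" is the first word of the input list

model_1 = extract_features(stack, input_list, window=1)  # top of stack + first input word
model_2 = extract_features(stack, input_list, window=2)  # two of each
print(model_1)
print(model_2)
```

Switching between the "part of speech" and "detailed part of speech" columns would only change which tag strings are placed in `stack` and `input_list`; the extraction itself stays the same.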

12 Experiments ...and what happens if I use the features? I change the corpus and create a modified corpus. I try to create the model. I test it...

13 Experiments What is wrong? – Too many alternatives for every position – The tree is too large – The corpus is small for this model

14 Results
– Simple part of speech, one word in the stack and in the input: 60% and 21 variables
– Detailed part of speech, one word in the stack and in the input: 65% and 65 variables
– Simple part of speech, two words in the stack and in the input: 68% and 21 variables
– Detailed part of speech, two words in the stack and in the input: 70% and 65 variables
– Very detailed part of speech, two words in the stack and in the input: 27% and 2518 variables
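A rough way to see why the very detailed tag set does so badly is to count how many distinct feature combinations each model can produce. The sketch below assumes that the "variables" figure on the Results slide is the number of distinct tag values per position, which the slides do not state explicitly; the function name `feature_combinations` is hypothetical, and the corpus size is the "about 1 million words" from the Corpus slide.

```python
def feature_combinations(tag_values: int, positions: int) -> int:
    """Upper bound on distinct feature tuples: one tag value per observed position."""
    return tag_values ** positions


CORPUS_WORDS = 1_000_000  # "about 1 million words" per the Corpus slide

# Assumption: "variables" = number of distinct tag values available per position.
tag_sets = [("simple", 21), ("detailed", 65), ("very detailed", 2518)]

for label, tags in tag_sets:
    # 2 positions = one stack word + one input word; 4 positions = two of each.
    for positions in (2, 4):
        n = feature_combinations(tags, positions)
        print(f"{label:14s} {positions} positions: {n:,} combinations "
              f"({n / CORPUS_WORDS:.4g} per corpus word)")
```

Under this assumption, the very detailed tag set with four positions allows far more combinations than there are words in the corpus, which is consistent with the "too many alternatives for every position" and "the corpus is small for this model" observations.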

15 Experiments What is the way? – Leftmost – Rightmost

16 Thank you very much Questions?

