1
Dependency Parser for Swedish Project for EDA171 by Jonas Pålsson and Marcus Stamborg
2
Dependency Grammar Describes relations between words in a sentence A relation holds between a head and its dependent(s) All words have a head except the root of the sentence [Example tree: "The big brown beaver" - "beaver" is the head of the dependents "The", "big" and "brown"]
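The example can be written down concretely; the sketch below (ours, in Python, not from the slides) encodes each word's head as an index, with 0 marking the root:

```python
# Illustrative sketch: the dependency relations of "The big brown beaver",
# stored as one head index per word (1-based; 0 = root of the sentence).
sentence = ["The", "big", "brown", "beaver"]
heads = [4, 4, 4, 0]  # "The", "big" and "brown" all depend on "beaver"

# Every word has a head except the root of the sentence.
root = [w for w, h in zip(sentence, heads) if h == 0]
print(root)  # ['beaver']
```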
3
Dependency Parsing Find the links that connect words using a computer. Different algorithms exist. Nivre's parser has reported the best results for Swedish.
4
Nivre's Parser Extension to shift-reduce parsing. Adds arcs between the input and the stack. Produces a dependency graph using the following actions: Shift - moves the next input word to the stack. Reduce - pops the stack. Left arc - creates an arc from the input word to the stack top, then pops the stack. Right arc - creates an arc from the stack top to the input word, then shifts.
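As a rough Python sketch (ours, not the project's code, with the arc-eager preconditions omitted for brevity), the four actions and a short run on the Swedish phrase "Jag tycker det" might look like:

```python
# Illustrative sketch of Nivre's transitions. State: a stack, an input
# buffer, and a set of arcs stored as (head, dependent) pairs.

def shift(stack, buffer, arcs):
    stack.append(buffer.pop(0))          # move next input word onto the stack

def reduce_(stack, buffer, arcs):
    stack.pop()                          # pop the stack top

def left_arc(stack, buffer, arcs):
    arcs.add((buffer[0], stack.pop()))   # arc from input word to stack top

def right_arc(stack, buffer, arcs):
    arcs.add((stack[-1], buffer[0]))     # arc from stack top to input word
    stack.append(buffer.pop(0))          # the new dependent is then shifted

# Run on "Jag tycker det" (gold heads: tycker->Jag, tycker->det):
stack, buffer, arcs = [], ["Jag", "tycker", "det"], set()
shift(stack, buffer, arcs)               # stack: [Jag]
left_arc(stack, buffer, arcs)            # adds tycker -> Jag, pops Jag
shift(stack, buffer, arcs)               # stack: [tycker]
right_arc(stack, buffer, arcs)           # adds tycker -> det
print(sorted(arcs))                      # [('tycker', 'Jag'), ('tycker', 'det')]
```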
5
More about actions Nivre, J. (2004)
6
Corpus Talbanken05 – modernized and computerized version of Talbanken76 Modified for use in the CoNLL-X Shared Task Training set is about 11500 sentences We used a test set containing about 300 sentences Example from the corpus (CoNLL-X columns):
1  Jag     _  PO  PO  _  2  SS    _  _
2  tycker  _  VV  VV  _  0  ROOT  _  _
3  det     _  PO  PO  _  2  OO    _  _
7
How we did it Collect data -> Build model -> Parse [Pipeline: the ARFFBuilder turns the training corpus into training data; the Trainer turns that data into a trained classifier; the Parser combines the test corpus with the trained classifier to produce the test corpus with relations]
8
Collect data – Gold Standard Parsing Build a Weka-compatible data file (ARFF). The action sequence can be determined from an annotated corpus using the following rules (gold standard parsing): If input has stack top as head -> Right arc else if stack top has input as head -> Left arc else if an arc exists between input and any word in the stack -> Reduce else Shift
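A minimal Python sketch of these oracle rules (names like `next_action` and `gold_head` are ours, not the project's):

```python
# Gold-standard oracle sketch: pick the parser action that reproduces
# the annotated tree. gold_head[w] is the annotated head of w (0 = root).
def next_action(stack, buffer, gold_head):
    if not stack:
        return "SHIFT"
    s, i = stack[-1], buffer[0]
    if gold_head[i] == s:
        return "RIGHT-ARC"   # input has stack top as head
    if gold_head[s] == i:
        return "LEFT-ARC"    # stack top has input as head
    if any(gold_head[i] == w or gold_head[w] == i for w in stack):
        return "REDUCE"      # an arc is still pending between input and stack
    return "SHIFT"
```

Replaying these actions over a gold tree yields one (state, action) pair per step, which can be written out as one ARFF training instance each.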
9
Train classifier Weka 3 – Data mining software C4.5 (J48) – Extension to the ID3 algorithm. Generates decision trees Uses features derived from the current state of the parser Outputs a trained classifier used by the parser to decide the next action
10
Parse using trained classifier Uses the trained classifier to determine the head for each word in a sentence Uses Nivre's algorithm with each action decided by the classifier Calculates the score as the proportion of words assigned their correct head
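The score formula itself did not survive this export; the standard measure for head-only output, and presumably what is meant, is the unlabeled attachment score. A sketch (our reconstruction, not the slide's formula):

```python
# Unlabeled attachment score: correctly attached words / all words.
def attachment_score(predicted_heads, gold_heads):
    correct = sum(p == g for p, g in zip(predicted_heads, gold_heads))
    return correct / len(gold_heads)

print(attachment_score([2, 0, 2], [2, 0, 1]))  # 2 of 3 heads correct
```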
11
Features All features describe the current state of the parser 1st set – input and stack 2nd set – input, stack and children 3rd set – input, stack and previous input 4th set – input, stack, children and previous input We only used POS in the feature sets; using lexical values actually decreased performance For every set we used constraints to model the valid actions in the current state of the parser
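As an illustration of the POS-only feature sets, a hypothetical helper (ours, not the project's code) mapping a parser state to features:

```python
# Hypothetical sketch of the POS-only feature sets described above:
# stack top and next input word always, previous input word optionally.
def features(stack, buffer, pos, prev_input=None):
    """Map the current parser state to a POS-only feature dict."""
    f = {
        "stack_pos": pos[stack[-1]] if stack else "NIL",
        "input_pos": pos[buffer[0]] if buffer else "NIL",
    }
    if prev_input is not None:
        f["prev_input_pos"] = pos[prev_input]  # used in the 3rd/4th sets
    return f

pos = {"Jag": "PO", "tycker": "VV", "det": "PO"}
print(features(["tycker"], ["det"], pos, prev_input="tycker"))
```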
12
Results Scores using features: Stack_n_POS, Input_n_POS, Children Scores using features: Stack_n_POS, Input_n_POS
13
Results cont. Scores using features: Stack_n_POS, Input_n_POS, Children, Previous_Input_POS Scores using features: Stack_n_POS, Input_n_POS, Previous_Input_POS
14
Conclusions Lexical values didn't help much; the score even became worse. Might work better with a different classification algorithm or a different test corpus The previous input word was a very effective feature, probably the single best addition over stack and input alone Difficult to find an optimal feature set
15
Future improvements Try other features: siblings, LEX on specific words, more words from the original input string Run simulations to find the optimum feature set Use an SVM instead of C4.5
16
Thank you for listening More to come in the report