1
Graph-based Dependency Parsing with Bidirectional LSTM
Wenhui Wang and Baobao Chang
Institute of Computational Linguistics, Peking University
2
Outline
– Introduction
– Model Details
– Experiments
– Conclusion
3
Introduction
4
Graph-based models are among the most successful approaches to dependency parsing. Given a sentence x, graph-based models formulate parsing as a search problem:
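A hedged sketch of the standard search objective used by graph-based parsers (the notation below is mine, not copied from the slide):

    y^{*}(x) = \operatorname*{argmax}_{y \in \mathcal{Y}(x)} \mathrm{Score}(x, y; \theta)

where \mathcal{Y}(x) is the set of valid dependency trees over the sentence x and \theta are the model parameters.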
5
Introduction
6
The most common choice of score function is a linear model over hand-crafted features (a sketch follows below). Problems:
– It relies heavily on feature engineering, and feature design requires domain expertise. Moreover, millions of hand-crafted features heavily slow down parsing.
– The conventional first-order model limits the scope of feature selection. High-order features have proven useful for recovering long-distance dependencies, but incorporating them usually comes at a high cost in efficiency.
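A hedged reconstruction of this conventional arc-factored linear scoring, assuming the standard first-order decomposition (the symbols w and f are my own notation):

    \mathrm{Score}(x, y) = \sum_{(h, m) \in y} \mathbf{w} \cdot \mathbf{f}(h, m, x)

Here (h, m) ranges over the head-modifier arcs of tree y, f(h, m, x) is a sparse vector of hand-crafted features, and w is the learned weight vector.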
7
Introduction
Pei et al. (2015) propose a feed-forward neural network to score subgraphs.
Advantages:
– Learns feature combinations automatically
– Exploits sentence segment information by averaging
Problems:
– Requires a large feature set
– The context window limits its ability to detect long-distance information
– Still relies on a high-order factorization strategy to further improve accuracy
8
Introduction
We propose an LSTM-based neural network model for graph-based parsing.
Advantages:
– Captures long-range contextual information and exhibits improved accuracy in recovering long-distance dependencies
– Reduces the number of features to a minimum
– An LSTM-based sentence segment embedding method, LSTM-Minus, is used to effectively learn sentence-level information
– As a first-order model, its computational cost remains the lowest among graph-based models
9
Model Details
10
Architecture of our model (figure: input tokens; direction-specific transformation)
11
Model Details
Segment embeddings: compared with averaging,
– LSTM-Minus enables our model to learn segment embeddings from information both outside and inside the segments, and thus enhances the model's ability to access sentence-level information (a sketch follows below).
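The slide does not spell out how LSTM-Minus is computed; below is a minimal sketch of the boundary-subtraction idea, assuming hidden states from a forward and a backward LSTM are already available. The function name and array layout are illustrative assumptions, not the authors' code.

import numpy as np

def lstm_minus_segment(forward_states, backward_states, i, j):
    # forward_states[t]:  forward-LSTM hidden state after reading tokens 0..t
    # backward_states[t]: backward-LSTM hidden state after reading tokens n-1..t
    # Represent the segment spanning tokens i..j (inclusive) by subtracting
    # the hidden states at the segment boundaries in each direction.
    n = len(forward_states)
    zero = np.zeros_like(forward_states[0])
    fwd = forward_states[j] - (forward_states[i - 1] if i > 0 else zero)
    bwd = backward_states[i] - (backward_states[j + 1] if j + 1 < n else zero)
    return np.concatenate([fwd, bwd])

# toy usage: 5 tokens with 4-dimensional hidden states
fw = [np.random.randn(4) for _ in range(5)]
bw = [np.random.randn(4) for _ in range(5)]
segment_embedding = lstm_minus_segment(fw, bw, 1, 3)  # segment covering tokens 1..3

Compared with averaging the token vectors inside the span, the subtraction also carries information from the LSTM states outside the span, which is the point the bullet above makes.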
12
Model Details
Direction-specific Transformation
– The direction of an edge is very important in dependency parsing
– This information is bound to the model parameters (see the sketch below)
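One plausible way to bind edge direction to the parameters is to keep a separate transformation per direction. The sketch below illustrates that design with made-up parameter names and a plain tanh (the actual model uses tanh-cube, covered on the next slide); it is not the authors' exact parameterization.

import numpy as np

class DirectionSpecificScorer:
    # Illustrative only: one weight matrix per arc direction (left vs. right).
    def __init__(self, in_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = {"left":  0.01 * rng.standard_normal((hidden_dim, in_dim)),
                  "right": 0.01 * rng.standard_normal((hidden_dim, in_dim))}
        self.b = {"left": np.zeros(hidden_dim), "right": np.zeros(hidden_dim)}
        self.v = 0.01 * rng.standard_normal(hidden_dim)  # output scoring vector

    def score_arc(self, head_vec, mod_vec, head_idx, mod_idx):
        direction = "right" if head_idx < mod_idx else "left"
        x = np.concatenate([head_vec, mod_vec])  # in_dim = len(head_vec) + len(mod_vec)
        h = np.tanh(self.W[direction] @ x + self.b[direction])
        return float(self.v @ h)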
13
Model Details
Learning Feature Combinations
– Activation function: tanh-cube
– Intuitively, the cube term in each hidden unit directly models feature combinations in a multiplicative way
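For reference, the tanh-cube activation, as defined in Pei et al. (2015) to the best of my knowledge, is

    g(l) = \tanh(l^{3} + l)

so each hidden unit's pre-activation l enters both linearly and cubed, which is where the multiplicative feature-combination behaviour comes from.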
14
Model Details
Features in our model
15
Experiments
16
Dataset
English Penn Treebank (PTB)
– Penn2Malt for Penn-YM
– Stanford parser for Penn-SD
– Stanford POS Tagger for POS tagging
Chinese Penn Treebank
– Gold segmentation and POS tags
Two models
– Basic model
– Basic model + segment features
17
Experiments
Comparison with previous graph-based models
18
Experiments
19
Comparison with previous state-of-the-art models
20
Experiments
Model performance of different ways to learn segment embeddings
21
Experiments
Advantage in recovering long-distance dependencies
– Using the LSTM shows the same effect as a high-order factorization strategy
22
Conclusion
23
We propose an LSTM-based neural network model for graph-based dependency parsing, together with an LSTM-based sentence segment embedding method.
Our model makes parsing decisions from a global perspective with first-order factorization, avoiding the expensive computational cost introduced by high-order factorization.
Our model minimizes the effort required for feature engineering.
24
Recent work
A better word representation for Chinese
25
Recent work
Experimental results
26
Thank you!