Presentation on theme: "Dependency Model Using Posterior Context" — Presentation transcript:

1 Dependency Model Using Posterior Context
Kiyotaka Uchimoto†, Masaki Murata†, Satoshi Sekine‡, Hitoshi Isahara†
†Kansai Advanced Research Center, Communications Research Laboratory, Japan
‡New York University, USA

2 Background
Japanese dependency structure analysis:
- Example sentence: 太郎は赤いバラを買いました。 (Taro bought a red rose.)
- The sentence is segmented into bunsetsus (phrasal units): 太郎は (Taro_wa, "Taro") / 赤い (aka_i, "red") / バラを (bara_wo, "rose") / 買いました。 (kai_mashita, "bought").
- Bunsetsus are built from morphemes: 太郎|は, 赤|い, バラ|を, 買い|ました。
- Dependencies hold between bunsetsus: 赤い depends on バラを, while 太郎は and バラを depend on 買いました.
- The analysis has two steps (sketched below): preparing a dependency matrix, then finding an optimal set of dependencies for the entire sentence.
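To make the setting concrete, here is a minimal sketch of the example sentence and the two analysis steps; the representation and names are ours, not the paper's:

```python
# Bunsetsus of 太郎は赤いバラを買いました。 in order, romanized:
bunsetsus = ["Taro_wa", "aka_i", "bara_wo", "kai_mashita"]

# Gold dependencies: every bunsetsu except the last depends on exactly
# one bunsetsu to its right (Taro_wa -> kai_mashita, aka_i -> bara_wo,
# bara_wo -> kai_mashita).
gold_heads = {0: 3, 1: 2, 2: 3}

# Step 1: prepare a dependency matrix; cell (i, j) with i < j will hold
# the estimated probability that bunsetsu i depends on bunsetsu j.
n = len(bunsetsus)
matrix = [[0.0] * n for _ in range(n)]

# Step 2: find the optimal set of dependencies for the whole sentence
# from the matrix (the models on the following slides estimate the cells).
```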

3 Conventional (old) model
- Statistical approach: each element in the dependency matrix is estimated as a probability.
- Assigns one of two tags, "1" or "0", to each relationship between two bunsetsus: whether or not there is a dependency between them.
- Considers only the relationship between the two bunsetsus (a sketch of parse scoring under this model follows).
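A minimal sketch of how the old model scores one candidate parse, assuming the `matrix` built above; the function and names are illustrative, not from the paper:

```python
import math

def parse_score(matrix, heads):
    """heads[i] = j means bunsetsu i depends on bunsetsu j (i < j).

    Under the old model the score of a parse is simply the product of
    the independent pairwise dependency probabilities, since only the
    two bunsetsus of each pair are considered.
    """
    return math.prod(matrix[i][j] for i, j in heads.items())

# The analyzer searches for the `heads` assignment maximizing this score.
```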

4 New model using posterior context
- The relationship between two bunsetsus is learned as one of three categories; the anterior bunsetsu can depend on
  "0": a bunsetsu between the two,
  "1": the posterior bunsetsu itself, or
  "2": a bunsetsu beyond the posterior one.
- The dependency probability of two bunsetsus is the product of the probabilities of the relationships between the left bunsetsu and every bunsetsu to its right in the sentence (formalized below).
- The probability of the overall dependencies in a sentence is the product of the probabilities of all its dependencies; they are identified by analyzing the sentence from right to left.
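In formulas, reconstructed from the slide's description (the notation is ours, not copied from the paper): suppose bunsetsu b_i depends on b_j in a sentence b_1 ... b_n. The hypothesis fixes one of the three categories for every b_k to the right of b_i:

```latex
P(b_i \rightarrow b_j) \propto \prod_{k=i+1}^{n} P(c_{i,k} \mid b_i, b_k),
\qquad
c_{i,k} =
\begin{cases}
\text{``2'' (beyond)}    & \text{if } i < k < j,\\
\text{``1'' (dependent)} & \text{if } k = j,\\
\text{``0'' (between)}   & \text{if } k > j.
\end{cases}
% The probability of the whole analysis is the product of these
% per-bunsetsu dependency probabilities, computed right to left.
```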

5 Example
A current bunsetsu has five modifiee candidates (1-5). For each candidate j, the hypothesis "the current bunsetsu depends on j" fixes the category of every candidate k (bynd for k < j, dpnd for k = j, btwn for k > j), and the normalized dependency probability is computed from the product of those category probabilities:
- Candidate 1 (dpnd btwn btwn btwn btwn): 0.4 × 0.1 × 1.0 × 1.0 × 0.6 → score 0.155, normalized 18.0%
- Candidate 2 (bynd dpnd btwn btwn btwn): 0.6 × 0.3 × 1.0 × 1.0 × 0.6 → score 0.329, normalized 38.1%
- Candidate 3 (bynd bynd dpnd btwn btwn): 0.6 × 0.6 × 0 × 1.0 × 0.6 = 0
- Candidate 4 (bynd bynd bynd dpnd btwn): 0.6 × 0.6 × 1.0 × 0 × 0.6 = 0
- Candidate 5 (bynd bynd bynd bynd dpnd): 0.6 × 0.6 × 1.0 × 1.0 × 0.4 → score 0.379, normalized 43.9%
The reported scores are the square roots of the raw products (e.g. √0.024 ≈ 0.155); candidates 3 and 4 are ruled out because their "dependent" probability is 0. Candidate 5, with the highest normalized probability, is selected. A script reproducing these numbers follows.
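The worked example can be reproduced with a short script. The per-candidate category probabilities are read off the slide's products; the square-root scoring is an assumption inferred from the fact that the slide's scores match the square roots of the products exactly:

```python
import math

# Category probabilities for candidates 1-5, read off the slide's products.
# Zeros mark candidates ruled out as modifiees; 1.0 marks certain categories.
bynd = {1: 0.6, 2: 0.6, 3: 1.0, 4: 1.0}          # modifiee lies beyond k
dpnd = {1: 0.4, 2: 0.3, 3: 0.0, 4: 0.0, 5: 0.4}  # k itself is the modifiee
btwn = {2: 0.1, 3: 1.0, 4: 1.0, 5: 0.6}          # modifiee lies between

def score(j):
    """Score the hypothesis that the current bunsetsu depends on candidate j."""
    p = 1.0
    for k in range(1, 6):
        p *= bynd[k] if k < j else dpnd[k] if k == j else btwn[k]
    # Assumption: the slide reports sqrt of the product, e.g.
    # sqrt(0.4 * 0.1 * 1.0 * 1.0 * 0.6) = sqrt(0.024) = 0.155.
    return math.sqrt(p)

scores = {j: score(j) for j in range(1, 6)}
total = sum(scores.values())
for j, s in scores.items():
    print(f"candidate {j}: score {s:.3f}, normalized {100 * s / total:.1f}%")
# candidate 1: 0.155 (18.0%), candidate 2: 0.329 (38.1%),
# candidates 3 and 4: 0, candidate 5: 0.379 (44.0%; 43.9% on the slide,
# which rounds the scores before normalizing).
```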

6 Experiments
Implemented the models within a maximum entropy framework (a sketch of such a classifier follows).
- Features: basically attributes of a bunsetsu itself or of the relationship between two bunsetsus.
- Corpus: the Kyoto University text corpus (Kurohashi and Nagao, 1997), a tagged corpus of the Mainichi newspaper.
- Training: 7,958 sentences (Jan. 1st to 8th); testing: 1,246 sentences (Jan. 9th).
- The input sentences were morphologically analyzed and their bunsetsus were identified correctly, i.e., gold segmentation was assumed.
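A minimal sketch of a three-way maximum entropy classifier of the kind the experiments describe, implemented here as multinomial logistic regression (equivalent to a maximum entropy model) over indicator features. The feature names and toy training pairs are purely illustrative; the paper's actual feature set is much richer:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Toy bunsetsu-pair features (illustrative names only).
train_features = [
    {"left_pos": "noun", "right_pos": "verb", "distance": "1"},
    {"left_pos": "adj",  "right_pos": "noun", "distance": "2-5"},
    {"left_pos": "noun", "right_pos": "noun", "distance": "6+"},
]
train_labels = ["dpnd", "btwn", "bynd"]  # the three relationship categories

vec = DictVectorizer()
X = vec.fit_transform(train_features)
clf = LogisticRegression(max_iter=1000).fit(X, train_labels)

# Probability of each category for a new bunsetsu pair:
test = vec.transform([{"left_pos": "noun", "right_pos": "verb", "distance": "1"}])
print(dict(zip(clf.classes_, clf.predict_proba(test)[0])))
```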

7 Results of dependency analysis
Using exactly the same features, the dependency accuracy of the new model was about 1% higher than that of the old model, and sentence accuracy improved by about 3%.

8 Relationship between the number of bunsetsus and accuracy
The accuracy of the new model is almost always better than that of the old model.

9 Amount of training data and accuracy
The accuracy of the new model is about 1% higher than that of the old model for any size of training data.

10 Conclusion
A new model for dependency structure analysis:
- Learns the relationship between two bunsetsus as three categories: "between," "dependent," and "beyond."
- Estimates the dependency likelihood by considering not only the relationship between two bunsetsus but also the relationships between the left bunsetsu and all of the bunsetsus to its right.
The dependency accuracy of the new model was:
- Almost always better than that of the old model for any sentence length.
- About 1% higher than that of the old model for any size of training data.
Future work: applying a similar model to English sentences.

