LING/C SC 581: Advanced Computational Linguistics


1 LING/C SC 581: Advanced Computational Linguistics
Lecture 23 April 9th

2 Today's Topics continuing from last time…

3 Bikel Collins Parser
Java re-implementation of Collins’ parser (originally in C)
easy to train (computationally inexpensive)
Paper: Daniel M. Bikel. Intricacies of Collins’ Parsing Model. In Computational Linguistics, 30(4), pp
Software: (page no longer exists)

4 Bikel Collins Download and install Dan Bikel’s parser
dbp.zip (on course homepage)

5 Bikel Collins Training the parser with the WSJ PTB
See guide: userguide/guide.pdf
directory: TREEBANK_3/parsed/mrg/wsj
chapters 02-21: create one single .mrg file
events: wsj obj.gz
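The "create one single .mrg file" step can be sketched in Python as below. This is a minimal sketch, not part of the lecture or of the parser distribution: it assumes the usual treebank layout of one two-digit subdirectory per section (02, 03, ..., 21) under the wsj directory, and the function name and output file name are hypothetical.

```python
from pathlib import Path


def merge_sections(wsj_dir, out_path, first=2, last=21):
    """Concatenate every .mrg file in sections first..last into one file.

    Assumes (not confirmed by the slides) that wsj_dir contains one
    two-digit subdirectory per section, e.g. wsj/02 ... wsj/21.
    Returns the number of files that were merged.
    """
    wsj_dir = Path(wsj_dir)
    merged = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for section in range(first, last + 1):
            section_dir = wsj_dir / f"{section:02d}"
            if not section_dir.is_dir():
                continue  # skip sections that are missing on disk
            # Sort so the trees are written in corpus order.
            for mrg_file in sorted(section_dir.glob("*.mrg")):
                text = mrg_file.read_text(encoding="utf-8")
                out.write(text)
                if not text.endswith("\n"):
                    out.write("\n")  # keep trees from running together
                merged += 1
    return merged


# Example call (hypothetical output name):
# merge_sections("TREEBANK_3/parsed/mrg/wsj", "wsj-02-21.mrg")
```

The resulting single file is what would then be handed to the trainer; the exact training command is described in userguide/guide.pdf.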

6 Bikel Collins Settings:

7 Bikel Collins Parsing Command Input file format (sentences)

8 Java Runtime (JRE)
java -version
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build b12, mixed mode)
Notes:
JDK: Java Development Kit (superset of..)
JRE: Java Runtime Environment

9 Bikel Collins Verify the trainer and parser work on your machine: must have Java installed
Let's test it:
cd dbp
dbp$ ls
LICENSE dbparser.jar scorer telescope.lisp README doc settings userguide bin policy-files src
dbp$ more telescope.lisp
(I saw a man with a telescope)
(I saw a man with a sword)
bin/parse 500 settings/collins.properties ../wsj obj.gz telescope.lisp
Executing command
    java -server -Xms500m -Xmx500m -cp /Users/sandiway/courses/581/ling /dbp/dbparser.jar -Dparser.settingsFile=settings/collins.properties danbikel.parser.Parser -is ../wsj obj.gz -sa telescope.lisp
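The sentence file telescope.lisp holds one parenthesized, space-separated list of words per line. A small helper for producing such lines might look like the sketch below. It assumes the input is already tokenized the Treebank way, and that literal parentheses should be written as -LRB- and -RRB- (the Penn Treebank convention; whether the parser requires this is an assumption here). The function name is hypothetical.

```python
def to_sentence_line(sentence):
    """Turn a whitespace-tokenized sentence into a line like '(I saw a man)'."""
    tokens = sentence.split()
    if not tokens:
        raise ValueError("empty sentence")
    # Literal parentheses would clash with the list syntax, so replace them
    # with the Penn Treebank escapes (assumed to be what the parser expects).
    escapes = {"(": "-LRB-", ")": "-RRB-"}
    return "(" + " ".join(escapes.get(tok, tok) for tok in tokens) + ")"


def write_sentence_file(sentences, path):
    """Write one parenthesized sentence per line to path."""
    with open(path, "w", encoding="utf-8") as out:
        for sentence in sentences:
            out.write(to_sentence_line(sentence) + "\n")
```

For example, to_sentence_line("I saw a man with a telescope") returns "(I saw a man with a telescope)", matching the first line of telescope.lisp.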

10 Bikel Collins
processing sentence No. 1: (I saw a man with a telescope)
danbikel.parser.Decoder: current sentence length: 7 words
danbikel.parser.Decoder: cummulative average length: 7.0 words
danbikel.parser.Decoder: trying with prune factor of 4.0
danbikel.parser.Decoder: highest probability item for sentence-length span (0,6): (S (NP-A (NPB (PRP I))) (VP (VBD saw) (NP-A (NPB (DT a) (NN man)) (PP (IN with) (NP-A (NPB (DT a) (NN telescope)))))))
danbikel.parser.Decoder: top-ranked +TOP+ item: (+TOP+ (S (NP-A (NPB (PRP I))) (VP (VBD saw) (NP-A (NPB (DT a) (NN man)) (PP (IN with) (NP-A (NPB (DT a) (NN telescope))))))))

11 Bikel Collins
processing sentence No. 2: (I saw a man with a sword)
danbikel.parser.Decoder: current sentence length: 7 words
danbikel.parser.Decoder: cummulative average length: 7.0 words
danbikel.parser.Decoder: trying with prune factor of 4.0
danbikel.parser.Decoder: highest probability item for sentence-length span (0,6): (S (NP-A (NPB (PRP I))) (VP (VBD saw) (NP-A (NPB (DT a) (NN man)) (PP (IN with) (NP-A (NPB (DT a) (NN sword)))))))
danbikel.parser.Decoder: top-ranked +TOP+ item: (+TOP+ (S (NP-A (NPB (PRP I))) (VP (VBD saw) (NP-A (NPB (DT a) (NN man)) (PP (IN with) (NP-A (NPB (DT a) (NN sword))))))))
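In both outputs above the PP headed by "with" sits inside the object NP-A rather than directly under the VP. One way to check this mechanically is to read the bracketed tree and report the parent label of every PP. The sketch below is illustrative only: the reader handles just the simple bracketing shown in the parser output, and the function names are hypothetical.

```python
def read_tree(text):
    """Parse a bracketed tree into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())      # nested constituent
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1  # consume ')'
        return (label, children)

    return parse()


def pp_parents(tree, parent=None):
    """Return the label of the parent node of every PP in the tree."""
    label, children = tree
    found = [parent] if label == "PP" else []
    for child in children:
        if isinstance(child, tuple):
            found.extend(pp_parents(child, label))
    return found


telescope = ("(S (NP-A (NPB (PRP I))) (VP (VBD saw) (NP-A (NPB (DT a) (NN man)) "
             "(PP (IN with) (NP-A (NPB (DT a) (NN telescope)))))))")
print(pp_parents(read_tree(telescope)))  # prints ['NP-A']
```

Running the same check on the "sword" parse gives the same parent label, since the two trees differ only in the final noun.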

12 Bikel Collins File: bin/parse is a shell script that sets up program parameters and calls java
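To make concrete what such a wrapper does, here is a rough Python equivalent that assembles the same java command line as the one echoed under "Executing command" earlier. It is a sketch and not a translation of bin/parse itself: the function name, the default jar path, and the file names in the example are hypothetical, and the derived-data flag is assumed to be spelled -is.

```python
import subprocess


def build_parse_command(heap_mb, settings_file, derived_data, sentence_file,
                        jar="dbparser.jar"):
    """Assemble the java invocation for danbikel.parser.Parser as a list."""
    return [
        "java", "-server",
        f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m",     # initial and maximum heap size
        "-cp", jar,                               # classpath: the parser jar
        f"-Dparser.settingsFile={settings_file}",
        "danbikel.parser.Parser",
        "-is", derived_data,                      # trained model (assumed flag spelling)
        "-sa", sentence_file,                     # sentences to parse
    ]


def run_parser(*args, **kwargs):
    """Run the parser; requires Java and the jar to be present."""
    return subprocess.run(build_parse_command(*args, **kwargs), check=True)


# Example (hypothetical model file name):
# run_parser(500, "settings/collins.properties", "model.obj.gz", "telescope.lisp")
```

Passing the command as a list rather than a single string avoids shell quoting problems when a path contains spaces.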

13 Bikel Collins

14 Bikel Collins File: bin/train is another shell script

15 Bikel Collins Relevant WSJ PTB files

