Published by Morgan Goodman. Modified over 9 years ago.
1
LING 581: Advanced Computational Linguistics Lecture Notes February 9th
2
tregex Pattern matching for passives: using variable names and regex group numbering for coindexation, i.e. matching NP-SBJ-i with the object of VP, [NP [-NONE- [*-i]]]
3
Homework Task Report Bracketing guide – TREEBANK_3/docs/prsguid1.pdf Pattern matching for selected constructions in – wsj-00-24-tregex.mrg
4
Bikel Collins
From treebank search to stochastic parsers trained on the WSJ Penn Treebank
Java re-implementation of Collins’ parser
Paper
 – Daniel M. Bikel. 2004. Intricacies of Collins’ Parsing Model. (PS) (PDF) in Computational Linguistics, 30(4), pp. 479-511.
 – http://www.cis.upenn.edu/~dbikel/papers/collins-intricacies.pdf
Software
 – http://www.cis.upenn.edu/~dbikel/software.html#stat-parser
5
Bikel Collins some TCL/TK code (I wrote for research use) makes it easy to work the parser without memorizing the command line options
6
Bikel Collins
The wrapper is syntactic sugar for various commands
Scripting language is TCL/TK (“tickle T K”)
Assume variables
 – set prefix "/Users/sandiway/research/"
 – set dbprefix "$prefix/dbparser"
 – set tbvprefix "/Applications/treebankviewer.app/Contents/MacOS"
POS tagging (MXPOST, in directory jmx)
 – $prefix/jmx/mxpost $prefix/jmx/tagger.project /tmp/err.txt
Parsing
 – $dbprefix/bin/parse 400 $dbprefix/settings/$properties $dbprefix/bin/$ddf /tmp/test2.txt 2>@ stdout
Training
 – $dbprefix/bin/train 800 $dbprefix/settings/$properties $dbprefix/bin/$mrg 2>@ stdout
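The parse command above can be assembled from the wrapper's variables as a Tcl list before it is handed to exec. The following is a minimal sketch, not the wrapper's actual code: the helper name build_parse_command is hypothetical, and the values used for properties and ddf are the ones set on the next slide. It only builds and prints the command; it does not run the parser.

```tcl
# Variables as assumed on the slide (trailing slash on prefix dropped).
set prefix   "/Users/sandiway/research"
set dbprefix [file join $prefix dbparser]

# Hypothetical helper: return the parse command as a Tcl list, so that
# each path stays a single argument even if it contains spaces.
proc build_parse_command {dbprefix properties ddf infile} {
    return [list \
        [file join $dbprefix bin parse] 400 \
        [file join $dbprefix settings $properties] \
        [file join $dbprefix bin $ddf] \
        $infile 2>@ stdout]
}

set cmd [build_parse_command $dbprefix "collins.properties" \
             "wsj-02-21.obj.gz" "/tmp/test2.txt"]
puts $cmd

# To actually run the parser (requires Tcl 8.5+ for {*}):
#   exec {*}$cmd
```

The trailing 2>@ stdout is Tcl exec syntax: it redirects the child's standard error to the Tcl channel stdout, so the parser's progress messages appear alongside its normal output.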
7
Bikel Collins
POS tagging (MXPOST, in directory jmx)
 – tagger_input
 – $prefix/jmx/mxpost $prefix/jmx/tagger.project /tmp/err.txt
Parsing
 – set ddf "wsj-02-21.obj.gz"
 – set properties "collins.properties"
 – parser_input
 – $dbprefix/bin/parse 400 $dbprefix/settings/$properties $dbprefix/bin/$ddf /tmp/test2.txt 2>@ stdout
Training
 – set mrg "wsj-02-21.mrg"
 – set properties "collins.properties"
 – $dbprefix/bin/train 800 $dbprefix/settings/$properties $dbprefix/bin/$mrg 2>@ stdout
Unix file descriptors
 0 Standard input (stdin)
 1 Standard output (stdout)
 2 Standard error (stderr)
GUI components
 frame .input
 text .input.t -height 4 -yscrollcommand {.input.s set}
 scrollbar .input.s -command {.input.t yview}
 frame .tagged
 text .tagged.t -height 9 -yscrollcommand {.tagged.s set}
 scrollbar .tagged.s -command {.tagged.t yview}
Code
 proc tagger_input {} {
     set lines [.input.t get 1.0 end]
     set infile [open "/tmp/test.txt" w]
     puts -nonewline $infile [string trimright $lines]
     close $infile
 }
 proc parser_input {} {
     set lines [.tagged.t get 1.0 end]
     set infile [open "/tmp/test2.txt" w]
     puts -nonewline $infile [string trimright $lines]
     close $infile
 }
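The two procs tagger_input and parser_input differ only in the widget they read and the file they write. The sketch below shows the shared step in isolation, under stated assumptions: the helper name write_trimmed and the demo file name are hypothetical, and the text is passed in as a plain string so the example runs in tclsh without Tk. In the wrapper the string would come from a call such as .input.t get 1.0 end, whose result always ends with the newline the Tk text widget appends, which is why string trimright is applied.

```tcl
# Hypothetical helper (not in the original wrapper): write text to a file
# with trailing whitespace removed and no final newline.
proc write_trimmed {text path} {
    set out [open $path w]
    puts -nonewline $out [string trimright $text]
    close $out
}

# Stand-in for the value returned by [.input.t get 1.0 end].
set lines "The parser works .\n\n"
set path "/tmp/write_trimmed_demo.txt"   ;# hypothetical file name
write_trimmed $lines $path

# Read the file back to inspect exactly what was written.
set in [open $path r]
set saved [read $in]
close $in
puts "<$saved>"
file delete $path
```

With a helper like this, tagger_input could reduce to a single call, write_trimmed [.input.t get 1.0 end] /tmp/test.txt, and parser_input to the same call on .tagged.t and /tmp/test2.txt.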
8
Bikel Collins There’s also a simple tree viewer I wrote but it may not run on your system…
9
Bikel Collins
Relevant files and directories
bikeldemo
 – wrapper2.tcl (prefix set to /Users/sandiway)
jmx
 – mxpost (shell script)
 – mxpost.jar (Java code)
dbparser
 – dbparser/bin/parse (shell script)
 – dbparser/bin/train (shell script)
 – dbparser/dbparser.jar (Java code)
 – dbparser/userguide/guide.pdf