Download presentation
Presentation is loading. Please wait.
Published byLorena Snow Modified over 9 years ago
1
1 XWindows apps: emacs, xkwic LING 5200 Computational Corpus Linguistics Martha Palmer February 9, 2006
2
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 2 Emacs emacs –nw Control x, control c – exit (C-x,C-c) Control x, control s – save (C-x, C-s) Control x, control v – visit (C-x, C-v) Appropos
3
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 3 Emacs – Hour 12 in book emacs –nw Control x, b – switch to a new buffer, Control x, Control b – show all buffers, Control x, 1 – just show one window, Control g – ignore the last command, Control h – help (works on verbs?)
4
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 4 Preparing to run xkwic: modifying your.cshrc Don't forget to make a back-up copy of your.cshrc file before editing it.
5
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 5 Preparing to run xkwic: modifying your.cshrc ls –a Create an alias for cp to make it prompt you before blowing away a file alias cp 'cp –i'
6
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 6 Echo command Don't forget to make a back-up copy of your.cshrc file before editing it. You can check the value of an environment variable by using the echo command. Try it now: Enter echo $TGREP_CORPUS. What do you see? You shouldn't see anything, because you haven't defined the TGREP_CORPUS variable. If you do see something, ask for help.
7
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 7 Add to.cshrc – See Lab 4 # xkwic stuff setenv CWBHOME /corpora2/imscorpus setenv CORPUS_REGISTRY $CWBHOME/registry setenv MANPATH $CWBHOME/man:$MANPATH setenv UIDPATH "/usr/local/ims-cwb/lib/ X11/uid/ %N/%U" # tgrep stuff #setenv TGREP_CORPUS /corpora/treebank2/tbl_075/tgrepabl/brwn_cmb.c rp setenv TGREP_CORPUS /corpora/treebank2/tgrepabl/wsj_mrg.crp
8
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 8 The PATH variable One very important environment variable is the PATH variable. You can view the current value of your path variable by typing echo $PATH. As you can see, you already have a value defined. We're going to change it. Open your.cshrc file with a text editor ( emacs.cshrc or pico -w.cshrc. Find a line that looks something like this:
9
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 9 The PATH variable (cont.) set path=($HOME/bin /usr/local/bin /usr/local/etc /usr/local/lang/bin /usr/ucb /bin /usr/bin /usr/sbin /usr/local/ssh/bin /usr/local/TeX/bin /usr/local/mh/bin /usr/local/elm/bin /usr/local/metamail/bin /usr/local/gnu/bin /usr/ucb /usr/openwin/bin /usr/local/X11/bin /usr/ccs/bin /etc. )
10
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 10 Adding PATH’s Now you'll define some new environment variables in your.cshrc file. There are two ways to do it. One would be to copy the following lines into your.cshrc file, either by hand or by copying and pasting off of this web page. The other would be by tailing my.cshrc ( /home/mpalmer/.cshrc ), and appending the output to your.cshrc (hint: >> ). Don't forget to make a back-up copy of it first, and don't forget to source.cshrc afterwards!
11
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 11 A PATH for xkwic Now enter the string /usr/local/ims-cwb/bin before the period that precedes the closing parenthesis, so that it looks something like this: set path=($HOME/bin /usr/local/bin /usr/local/etc /usr/local/lang/bin /usr/ucb /bin /usr/bin /usr/sbin /usr/local/ssh/bin /usr/local/TeX/bin /usr/local/mh/bin /usr/local/elm/bin /usr/local/metamail/bin /usr/local/gnu/bin /usr/ucb /usr/openwin/bin /usr/local/X11/bin /usr/ccs/bin /etc /usr/local/ims- cwb/bin. )
12
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 12 Running xkwic Save your file, source it, and check the value of your path variable again. You should see /usr/local/ims-cwb/bin in it now (in addition to the rest of the stuff that was there before). You're now ready to run xkwic! Start it by entering xkwic at the command line.
13
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 13 Fire it up To start xkwic: $babel> xkwic & First step: select a corpus
14
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 14 Select a corpus
15
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 15 Select a corpus BNC is lemmatized… …Brown and WSJ aren't
16
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 16 Select a corpus and a search pattern 1. Select the BNC corpus by clicking on the question-mark next to the Search Space text field. 2. Search for the word research with the query [word = "research"]. How many results do you get?
17
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 17 Word attribute
18
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 18 Output of a search: KWIC
19
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 19 Select a corpus and a search pattern 1. Select the BNC corpus by clicking on the question-mark next to the Search Space text field. 2. Search for the word research with the query [word = "research"]. How many results do you get? 3. Search for the lemma research with the query [lemma = "research"]. How many results do you get? Why the difference?
20
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 20 Lemma attribute output Inflected forms Case differences
21
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 21 Regular expressions in attributes of a position
22
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 22 Searching with POS tags 1. Search for tokens of research that are not verbs with the query [lemma = "research" & pos != "V.*"]. How many results do you get? 2. Modify the display so that you can see the POS of all words: File -> Display Attributes -> Concordance -> Positional Attributes; highlight "word" and "pos", click "update" and "Dismiss". What are two non-verb POS tags that research occurs with?
23
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 23 POS attribute
24
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 24
25
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 25 I am SOOO frustrated…
26
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 26 Basic unit of xkwic: the position Attributes of a position: Word POS Lemma (BNC) Searching for "positions" by attribute…
27
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 27 Multiple attributes of a position [word = "research" & pos = "NN1"]
28
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 28 Multiple attributes of a position [word = "research" & pos = "NN1"] Ampersand to connect the two attributes
29
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 29 Multiple attributes of a position [word = "research" & pos = "NN1"] Single pair of square brackets around all attributes of the single position
30
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 30 Negation [word = "research" & pos != "NN1"] = means "is" or "does match" != means "isn't" or "doesn't match"
31
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 31 Regular expressions in attributes of a position Wildcard:. Character classes: [word = "[Tt]he"] Grouping Alternation: | Quantifiers: Kleene star, Kleene plus
32
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 32 Sequences of positions [lemma = "research"] [word = "the"] Each position gets its own set of square brackets
33
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 33 Sequences of positions [lemma = "research"] [word = "the"] A space between the positions
34
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 34 Regular expressions over positions Wildcard: [] Any single position Quantifier: * [lemma = "research"] []* [word = "funding"]
35
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 35 Resources – Laura is bugging me to make a CU Corpora page… Like this http://www.stanford.edu/dept/linguistics/ corpora/cas-home.html http://www.stanford.edu/dept/linguistics/ corpora/cas-home.html TGREP http://www.stanford.edu/dept/linguistics/ corpora/cas-tut-tgrep.html
36
LING 5200, 2006 BASED on Kevin Cohen’s LING 5200 36 Xkwic resources CQP home page: http://www.ims.uni- stuttgart.de/projekte/CorpusWorkbench/http://www.ims.uni- stuttgart.de/projekte/CorpusWorkbench/ CQP User's Manual: http://www.ims.uni- stuttgart.de/projekte/CorpusWorkbench/ CQPUserManual/HTML/ (html version)http://www.ims.uni- stuttgart.de/projekte/CorpusWorkbench/ CQPUserManual/HTML/
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.