XWindows apps: emacs, xkwic

  1. XWindows apps: emacs, xkwic
     LING 5200 Computational Corpus Linguistics
     Martha Palmer
     February 9, 2006

  2. Emacs
     • emacs -nw
     • Control x, control c – exit (C-x C-c)
     • Control x, control s – save (C-x C-s)
     • Control x, control v – visit (C-x C-v)
     • Apropos
     BASED on Kevin Cohen’s LING 5200

  3. Emacs – Hour 12 in book
     • emacs -nw
     • Control x, b – switch to another buffer (C-x b)
     • Control x, control b – list all buffers (C-x C-b)
     • Control x, 1 – show just one window (C-x 1)
     • Control g – cancel the command in progress (C-g)
     • Control h – help (works on verbs?)

  4. Preparing to run xkwic: modifying your .cshrc
     Don't forget to make a back-up copy of your .cshrc file before editing it.

  5. Preparing to run xkwic: modifying your .cshrc
     • ls -a
     • Create an alias for cp so that it prompts you before overwriting a file
     • alias cp 'cp -i'

  6. Echo command
     Don't forget to make a back-up copy of your .cshrc file before editing it. You can check the value of an environment variable by using the echo command. Try it now:
     • Enter echo $TGREP_CORPUS. What do you see? You shouldn't see a value, because you haven't defined the TGREP_CORPUS variable (csh typically reports the variable as undefined instead). If you do see a value, ask for help.

  7. Add to .cshrc – See Lab 4
     # xkwic stuff
     setenv CWBHOME /corpora2/imscorpus
     setenv CORPUS_REGISTRY $CWBHOME/registry
     setenv MANPATH $CWBHOME/man:$MANPATH
     setenv UIDPATH "/usr/local/ims-cwb/lib/X11/uid/%N/%U"
     # tgrep stuff
     #setenv TGREP_CORPUS /corpora/treebank2/tbl_075/tgrepabl/brwn_cmb.crp
     setenv TGREP_CORPUS /corpora/treebank2/tgrepabl/wsj_mrg.crp

  8. The PATH variable
     • One very important environment variable is the PATH variable. You can view the current value of your path variable by typing echo $PATH. As you can see, you already have a value defined. We're going to change it.
     • Open your .cshrc file with a text editor (emacs .cshrc or pico -w .cshrc). Find a line that looks something like this:

  9. The PATH variable (cont.)
     • set path=($HOME/bin /usr/local/bin /usr/local/etc /usr/local/lang/bin /usr/ucb /bin /usr/bin /usr/sbin /usr/local/ssh/bin /usr/local/TeX/bin /usr/local/mh/bin /usr/local/elm/bin /usr/local/metamail/bin /usr/local/gnu/bin /usr/ucb /usr/openwin/bin /usr/local/X11/bin /usr/ccs/bin /etc . )

  10. Adding PATHs
      • Now you'll define some new environment variables in your .cshrc file. There are two ways to do it. One is to copy the following lines into your .cshrc file, either by hand or by copying and pasting from this web page.
      • The other is to tail my .cshrc (/home/mpalmer/.cshrc) and append the output to your .cshrc (hint: >>). Don't forget to make a back-up copy of your own .cshrc first, and don't forget to source .cshrc afterwards!

  11. A PATH for xkwic
      • Now enter the string /usr/local/ims-cwb/bin before the period that precedes the closing parenthesis, so that it looks something like this:
      • set path=($HOME/bin /usr/local/bin /usr/local/etc /usr/local/lang/bin /usr/ucb /bin /usr/bin /usr/sbin /usr/local/ssh/bin /usr/local/TeX/bin /usr/local/mh/bin /usr/local/elm/bin /usr/local/metamail/bin /usr/local/gnu/bin /usr/ucb /usr/openwin/bin /usr/local/X11/bin /usr/ccs/bin /etc /usr/local/ims-cwb/bin . )
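The rule on this slide ("put the new directory just before the trailing period") can be sketched in Python. This is only an illustration of the rule, not how the edit is actually made: you still change .cshrc by hand. The helper name add_before_dot and the shortened path list are hypothetical.

```python
def add_before_dot(path_dirs, new_dir):
    """Return a copy of path_dirs with new_dir placed before a trailing "."."""
    dirs = list(path_dirs)
    if new_dir in dirs:
        return dirs  # already on the path; nothing to do
    if dirs and dirs[-1] == ".":
        # Keep the current directory as the last entry.
        dirs.insert(len(dirs) - 1, new_dir)
    else:
        dirs.append(new_dir)
    return dirs


# Shortened, illustrative path list (not the full list from the slide).
old_path = ["/usr/local/bin", "/usr/bin", "/etc", "."]
new_path = add_before_dot(old_path, "/usr/local/ims-cwb/bin")

# Print the result in the same shape as the set path=( ... ) line.
print("set path=(" + " ".join(new_path) + " )")
```

Keeping "." last means directories listed earlier are searched before the current directory.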

  12. Running xkwic
      • Save your file, source it, and check the value of your path variable again. You should see /usr/local/ims-cwb/bin in it now (in addition to everything that was there before).
      • You're now ready to run xkwic! Start it by entering xkwic at the command line.

  13. Fire it up
      • To start xkwic:
      • $babel> xkwic &
      • First step: select a corpus

  14. Select a corpus

  15. Select a corpus: BNC is lemmatized… Brown and WSJ aren't

  16. Select a corpus and a search pattern
      • Select the BNC corpus by clicking on the question mark next to the Search Space text field.
      • Search for the word research with the query [word = "research"]. How many results do you get?

  17. Word attribute

  18. Output of a search: KWIC

  19. Select a corpus and a search pattern
      • Select the BNC corpus by clicking on the question mark next to the Search Space text field.
      • Search for the word research with the query [word = "research"]. How many results do you get?
      • Search for the lemma research with the query [lemma = "research"]. How many results do you get? Why the difference?
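One way to think about the word/lemma difference is the toy sketch below, in Python. The corpus is made up for illustration (it is not BNC data, and the tags are only BNC-style), and the sketch assumes that word matching is case-sensitive and that lemmas are stored in lower case.

```python
from collections import namedtuple

# One corpus position with the three attributes named on the slides.
Position = namedtuple("Position", ["word", "pos", "lemma"])

# Hypothetical toy data; not taken from the BNC.
toy_corpus = [
    Position("Research", "NN1", "research"),
    Position("is", "VBZ", "be"),
    Position("fun", "AJ0", "fun"),
    Position("She", "PNP", "she"),
    Position("researched", "VVD", "research"),
    Position("the", "AT0", "the"),
    Position("research", "NN1", "research"),
    Position("funding", "NN1", "funding"),
]

# Analogue of [word = "research"]: exact, case-sensitive word form only.
word_hits = [p for p in toy_corpus if p.word == "research"]

# Analogue of [lemma = "research"]: capitalised and inflected forms too.
lemma_hits = [p for p in toy_corpus if p.lemma == "research"]

print(len(word_hits), [p.word for p in word_hits])
print(len(lemma_hits), [p.word for p in lemma_hits])
```

In this toy data the lemma query finds more positions than the word query, because it also picks up "Research" and "researched".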

  20. Lemma attribute output: inflected forms, case differences

  21. Regular expressions in attributes of a position

  22. Searching with POS tags
      • Search for tokens of research that are not verbs with the query [lemma = "research" & pos != "V.*"]. How many results do you get?
      • Modify the display so that you can see the POS of all words: File -> Display Attributes -> Concordance -> Positional Attributes; highlight "word" and "pos", click "update" and "Dismiss". What are two non-verb POS tags that research occurs with?

  23. POS attribute

  25. I am SOOO frustrated…

  26. Basic unit of xkwic: the position
      • Attributes of a position:
        • Word
        • POS
        • Lemma (BNC)
      • Searching for "positions" by attribute…

  27. Multiple attributes of a position
      • [word = "research" & pos = "NN1"]

  28. Multiple attributes of a position
      • [word = "research" & pos = "NN1"]
      Use an ampersand to connect the two attributes.

  29. Multiple attributes of a position
      • [word = "research" & pos = "NN1"]
      Put a single pair of square brackets around all attributes of the single position.

  30. Negation
      • [word = "research" & pos != "NN1"]
      • = means "is" or "does match"
      • != means "isn't" or "doesn't match"
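The way = and != combine with & inside one pair of brackets can be mimicked with a small Python sketch. It is not CQP itself: the function name matches and the two sample positions are hypothetical, and the sketch assumes that a pattern has to match the whole attribute value (hence re.fullmatch).

```python
import re


def matches(position, conditions):
    """True if a position (a dict of attributes) satisfies every condition.

    Each condition is (attribute, operator, pattern); operator is "=" or "!=".
    All conditions must hold, which plays the role of the ampersand.
    """
    for attr, op, pattern in conditions:
        hit = re.fullmatch(pattern, position[attr]) is not None
        if hit != (op == "="):
            return False
    return True


# Two hypothetical positions with the same word form but different POS tags.
noun = {"word": "research", "pos": "NN1", "lemma": "research"}
verb = {"word": "research", "pos": "VVI", "lemma": "research"}

# Analogue of [word = "research" & pos = "NN1"]
is_noun = [("word", "=", "research"), ("pos", "=", "NN1")]
# Analogue of [word = "research" & pos != "NN1"]
not_noun = [("word", "=", "research"), ("pos", "!=", "NN1")]

print(matches(noun, is_noun), matches(verb, is_noun))
print(matches(noun, not_noun), matches(verb, not_noun))
```

Swapping = for != flips which of the two positions is selected; the word condition stays the same in both queries.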

  31. Regular expressions in attributes of a position
      • Wildcard: .
      • Character classes: [word = "[Tt]he"]
      • Grouping: ( )
      • Alternation: |
      • Quantifiers: Kleene star (*), Kleene plus (+)
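The operators listed on this slide behave the same way in Python's re module for these basic cases, so a quick sketch can show what each one accepts. The sample patterns and values are made up, and the sketch again assumes the pattern must cover the whole attribute value.

```python
import re

# (pattern, attribute value) pairs; all hypothetical examples.
examples = [
    ("[Tt]he", "The"),                  # character class
    ("[Tt]he", "then"),                 # fails: the whole value must match
    ("V.*", "VVD"),                     # wildcard with Kleene star
    ("research(es|ed)", "researched"),  # grouping with alternation
    ("re+search", "reeesearch"),        # Kleene plus: one or more "e"
]

results = [re.fullmatch(pattern, value) is not None
           for pattern, value in examples]

for (pattern, value), ok in zip(examples, results):
    print(pattern, value, ok)
```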

  32. Sequences of positions
      • [lemma = "research"] [word = "the"]
      Each position gets its own set of square brackets.

  33. Sequences of positions
      • [lemma = "research"] [word = "the"]
      Put a space between the positions.

  34. Regular expressions over positions
      • Wildcard: [] matches any single position
      • Quantifier: *
      • [lemma = "research"] []* [word = "funding"]
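To see what a starred wildcard position does in a sequence, here is a minimal Python sketch of a matcher over positions. It is an illustration only, not how xkwic works internally: match_at, anything, and the toy sentence are hypothetical, and trying the shortest expansion first is a choice made here for simplicity.

```python
def match_at(tokens, start, pattern):
    """Return the end index of a match of pattern beginning at start, else None.

    pattern is a list of (test, starred) pairs; test maps a token dict to a
    bool, and starred=True means "zero or more positions passing this test".
    """
    if not pattern:
        return start
    test, starred = pattern[0]
    rest = pattern[1:]
    if starred:
        # Try the shortest expansion first, then consume one more token.
        end = match_at(tokens, start, rest)
        if end is not None:
            return end
        if start < len(tokens) and test(tokens[start]):
            return match_at(tokens, start + 1, pattern)
        return None
    if start < len(tokens) and test(tokens[start]):
        return match_at(tokens, start + 1, rest)
    return None


def anything(token):
    """Analogue of the empty brackets []: accepts any position."""
    return True


# Python analogue of: [lemma = "research"] []* [word = "funding"]
query = [
    (lambda t: t["lemma"] == "research", False),
    (anything, True),
    (lambda t: t["word"] == "funding", False),
]

# Hypothetical toy sentence.
sentence = [
    {"word": "researching", "lemma": "research"},
    {"word": "new", "lemma": "new"},
    {"word": "sources", "lemma": "source"},
    {"word": "of", "lemma": "of"},
    {"word": "funding", "lemma": "funding"},
]

end = match_at(sentence, 0, query)
print([t["word"] for t in sentence[0:end]])
```

The three words between "researching" and "funding" are absorbed by the starred wildcard; starting the same query at the second token finds nothing, because that position does not have the lemma research.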

  35. Resources – Laura is bugging me to make a CU Corpora page…
      • Like this: http://www.stanford.edu/dept/linguistics/corpora/cas-home.html
      • TGREP: http://www.stanford.edu/dept/linguistics/corpora/cas-tut-tgrep.html

  36. Xkwic resources
      • CQP home page: http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/
      • CQP User's Manual (HTML version): http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQPUserManual/HTML/
