
LING/C SC/PSYC 438/538

LING/C SC/PSYC 438/538. Lecture 5 9/8 Sandiway Fong. Administrivia. Homework 1 (from lecture 3) was due last night (at midnight). Today’s Topics. Review Homework 1 We’ll go through it in class today Chapter 2 of JM Section 2.1 on regular expressions ( which you’ve already read … ).


Presentation Transcript


  1. LING/C SC/PSYC 438/538 Lecture 5 9/8 Sandiway Fong

  2. Administrivia • Homework 1 (from lecture 3) • was due last night (at midnight)

  3. Today’s Topics • Review • Homework 1 • We’ll go through it in class today • Chapter 2 of JM • Section 2.1 on regular expressions • (which you’ve already read…)

  4. Safari Book available online (Thanks! Don Merson) • UA Library has been given access to the full Safari Books Online service. • This allows you to read a vast number of technical books via your browser. • However, it is currently only a trial. http://proquest.safaribooksonline.com.ezproxy1.library.arizona.edu/

  5. Homework Review • Question 1: 438 and 538 (7 points) • Given • @sentence1 = (I, saw, the, the, cat, on, the, mat); • @sentence2 = (the, cat, sat, on, the, mat); • Write a simple Perl program which detects repeated words (many spell checker/grammar programs have this capability) • It should print a message stating the repeated word and its position if one exists • e.g. word 3 “the” is repeated in the case of sentence1 • No repeated words found in the case of sentence2 • note: output multiple messages if there are multiple repeated words • Hint: use a loop • Submit your Perl code and show examples of your program working

  6. Homework Review • Thinking algorithmically… w1 w2 w3 w4 w5 Compare w1 with w2 Compare w2 with w3 Compare w3 with w4 Compare w4 with w5

  7. Homework Review • Turning an algorithm into Perl code • Array indices start from 0: array @words holds words[0], words[1], …, words[n-1] • Array indices end at $#words • Compare each word with its successor: w1 with w2, w2 with w3, …, wn-1 with wn • “for” loop implementation:

      for ($i=0; $i<$#words; $i++) {
          # compare word indexed by $i to word indexed by $i+1
          # if same string, print message
      }

  8. Homework Review • First iteration (there are many ways to do this…) • (the basic for-loop)

      my @sentence1 = (I, saw, the, the, cat, on, the, mat);
      my @sentence2 = (the, cat, sat, on, the, mat);
      my @words = @sentence1;
      for ($i=0; $i<$#words; $i++) {
          if ($words[$i] eq $words[$i+1]) {
              print "word $i \"$words[$i]\" is repeated\n"
          }
      }

  9. Homework Review • 2nd iteration • (setting a flag when a repeated word is found) • (condition the output based on the value of the flag)

      my $flag = 0;
      for ($i=0; $i<$#words; $i++) {
          if ($words[$i] eq $words[$i+1]) {
              print "word $i \"$words[$i]\" is repeated\n";
              $flag = 1
          }
      }
      print "No words repeated\n" unless $flag

  10. Homework Review • 3rd iteration • (encapsulating the loop in a subroutine)

      sub check_repeated {
          my @words = @_;
          my $flag = 0;
          for ($i=0; $i<$#words; $i++) {
              if ($words[$i] eq $words[$i+1]) {
                  print "word $i \"$words[$i]\" is repeated\n";
                  $flag = 1
              }
          }
          print "No words repeated\n" unless $flag
      }
      print "@sentence1\n";
      check_repeated(@sentence1);
      print "@sentence2\n";
      check_repeated(@sentence2);
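
The subroutine version can also be written so that it runs under `use strict` and reports 1-based positions, which matches the "word 3" wording of the homework example (the loop index `$i` itself is 0-based). The following is a minimal sketch, not the course's reference solution; the name `find_repeats` and the choice to return messages instead of printing them are assumptions made for illustration.

```perl
use strict;
use warnings;

# Return one message per adjacent repeated word.
# Positions are 1-based (hypothetical choice, to match "word 3" in the homework text).
sub find_repeats {
    my @words = @_;
    my @messages;
    for my $i (0 .. $#words - 1) {
        if ($words[$i] eq $words[$i + 1]) {
            my $pos = $i + 1;
            push @messages, "word $pos \"$words[$i]\" is repeated";
        }
    }
    return @messages;
}

# qw() quotes the words, so no barewords are needed under strict.
my @sentence1 = qw(I saw the the cat on the mat);
my @sentence2 = qw(the cat sat on the mat);

for my $sentence (\@sentence1, \@sentence2) {
    my @found = find_repeats(@$sentence);
    print "@$sentence\n";
    print @found ? join("\n", @found) . "\n" : "No repeated words found\n";
}
```

Returning the messages keeps the comparison logic separate from the output, so the "no repeats" case needs no flag variable.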

  11. Homework Review • Question 2: 438 and 538 (3 points) • Describe what it would take to stop a repeated word program from flagging legitimate examples of repeated words in a sentence • (No spell checker/grammar program that I know of has this capability) • Examples of legitimately repeated words: • I wish that that question had an answer • Because he had had too many beers already, he skipped the Friday office happy hour

  12. Homework Review • Question 3: 538 (10 points), (438 extra credit) • Write a simple Perl program that outputs word frequencies for a sentence • E.g. given • @sentence1 = (I, saw, the, cat, on, the, mat, by, the, saw, table); • output a summary that looks something like: • the occurs 3 times • saw occurs twice • I, cat, mat, on, by, table occur once only • Hint: build a hash keyed by word with value frequency • Submit your Perl code and show examples of your program working

  13. Homework Review • Thinking algorithmically… w0 w0 w1 w2 w3 w4 w5 foreach $word (@sentence) hash data structure = “labeled medicine cabinet”

  14. Homework Review • Sample answer

      @sentence = (the, cat, sat, on, the, mat, that, the, cat, likes, most);
      %freq = ();
      foreach $word (@sentence) {
          if (exists $freq{$word}) {
              $freq{$word}++;
          } else {
              $freq{$word} = 1;
          }
      }
      foreach $word (keys %freq) {
          print "$word occurs $freq{$word} time(s)\n";
      }

      perl e2.prl
      on occurs 1 time(s)
      the occurs 3 time(s)
      cat occurs 2 time(s)
      most occurs 1 time(s)
      sat occurs 1 time(s)
      likes occurs 1 time(s)
      that occurs 1 time(s)
      mat occurs 1 time(s)

      Further simplifications to the code are possible but the basic logic remains
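
One possible simplification, sketched here under `use strict`: in Perl, `++` on a hash key that does not exist yet treats the missing value as 0, so the `exists` test can be dropped. The subroutine name `word_freq` and the sorted output order are assumptions for illustration. The sample answer prints in whatever order `keys` returns, which is not guaranteed to be stable.

```perl
use strict;
use warnings;

# Count how often each word occurs; ++ on a missing key starts from 0.
sub word_freq {
    my %freq;
    $freq{$_}++ for @_;
    return %freq;
}

my @sentence = qw(the cat sat on the mat that the cat likes most);
my %freq = word_freq(@sentence);

# Most frequent first, ties broken alphabetically, so the order is repeatable.
for my $word (sort { $freq{$b} <=> $freq{$a} || $a cmp $b } keys %freq) {
    print "$word occurs $freq{$word} time(s)\n";
}
```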

  15. Chapter 2: JM • Today • using your Perl skills on • Section 2.1 Regular Expressions • Online tutorials • http://perldoc.perl.org/perlrequick.html • http://perldoc.perl.org/perlretut.html

  16. Pattern Matching JM, Chapter 2, pg 17 Merriam-Webster online

  17. Chapter 2: JM • Perl regular expression (re) matching: • $a =~ /foo/ • /…/ contains a regular expression • will evaluate to true/false depending on what’s contained in $a • Perl regular expression (re) match and substitute: • $a =~ s/foo/bar/ • s/…match…/…substitute…/ contains two expressions • will modify $a by looking for the first occurrence of match and replacing that with substitute • s/…match…/…substitute…/g • global match and substitute (replaces every occurrence)
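
A short sketch of the three forms on this slide. The sample string and the variable names `$text`, `$first` and `$all` are made up for illustration. The copies are there only so that the original string stays unchanged between steps.

```perl
use strict;
use warnings;

my $text = "foo and more foo";

# Match: true if the pattern occurs anywhere in $text.
print "matched\n" if $text =~ /foo/;

# Substitute without /g: only the first occurrence is replaced.
(my $first = $text) =~ s/foo/bar/;
print "$first\n";    # bar and more foo

# Substitute with /g: every occurrence is replaced.
(my $all = $text) =~ s/foo/bar/g;
print "$all\n";      # bar and more bar
```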

  18. Chapter 2: JM • Most useful with code for reading in a file line-by-line:

      open($txtfile,$ARGV[0]) or die "$ARGV[0] not found!\n";
      while ($line = <$txtfile>) {
          # do RE stuff with $line
      }
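
As a hedged illustration of "RE stuff" inside the read loop, the sketch below collects the lines that match a pattern. The subroutine name `matching_lines`, the sample text and the pattern are assumptions. So that the example runs without a file on disk, it reads from an in-memory filehandle opened on a string; with a real file you would open `$ARGV[0]` instead.

```perl
use strict;
use warnings;

# Return the lines read from filehandle $fh that match regex $re.
sub matching_lines {
    my ($fh, $re) = @_;
    my @hits;
    while (my $line = <$fh>) {
        chomp $line;                       # drop the trailing newline
        push @hits, $line if $line =~ $re;
    }
    return @hits;
}

# Stand-in for a file: an in-memory filehandle over a string (demo assumption).
my $data = "the cat sat\non the mat\nno match here\n";
open(my $fh, '<', \$data) or die "cannot open in-memory data: $!\n";
print "$_\n" for matching_lines($fh, qr/\b[cm]at\b/);
close $fh;
```

The three-argument form of `open` with a lexical filehandle is used here because it avoids surprises with unusual filenames.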

  19. Chapter 2: JM

  20. Chapter 2: JM

  21. Chapter 2: JM

  22. Chapter 2: JM Sheeptalk

  23. Chapter 2: JM

  24. Chapter 2: JM • Precedence of operators • Example: Column 1 Column 2 Column 3 … • /Column [0-9]+ */ • /(Column [0-9]+ *)*/ • /house(cat(s|)|)/ • Perl: • In a regular expression, the text matched by the pattern within a pair of parentheses is stored in $1 (and $2 and so on for further pairs) • Precedence Hierarchy:
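
The effect of grouping on these patterns can be checked directly. This is a minimal sketch: the variable names are invented, `$&` (the whole matched text) is used only to display what matched, and the `^` and `$` anchors on the housecat pattern are an addition so that each test word must match in full.

```perl
use strict;
use warnings;

my $header = "Column 1 Column 2 Column 3";

# Without grouping, * applies only to the preceding space, so one column matches.
$header =~ /Column [0-9]+ */;
my $single = $&;
print "[$single]\n";       # [Column 1 ]

# With grouping, * repeats the whole "Column N " unit.
$header =~ /(Column [0-9]+ *)*/;
my ($all_cols, $last) = ($&, $1);
print "[$all_cols]\n";     # [Column 1 Column 2 Column 3]
print "[$last]\n";         # $1 holds the last repetition: [Column 3]

# Nested groups: $1 is the outer pair of parentheses, $2 the inner pair.
for my $word (qw(house housecat housecats)) {
    next unless $word =~ /^house(cat(s|)|)$/;
    my $inner = defined $2 ? "'$2'" : "unset";
    print "$word: \$1='$1', \$2=$inner\n";
}
```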

  25. Chapter 2: JM http://perldoc.perl.org/perlretut.html A shortcut: list context for matching
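
A sketch of the list-context shortcut: when a match is evaluated in list context, it returns the captured groups, so they can be assigned straight to variables instead of being read from `$1`, `$2`, and so on. The sample strings and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $time = "12:05:30";

# In list context a successful match returns the captured groups ($1, $2, $3).
my ($hours, $minutes, $seconds) = $time =~ /(\d\d):(\d\d):(\d\d)/;
print "hours=$hours minutes=$minutes seconds=$seconds\n";

# With /g in list context, the captures from every match are returned.
my @numbers = "Column 1 Column 22 Column 333" =~ /([0-9]+)/g;
print "@numbers\n";    # 1 22 333
```

If the match fails, the list is empty and the variables stay undefined, so real code would normally test the result before using them.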

  26. Chapter 2: JM • s/([0-9]+)/<\1>/
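
A sketch of this substitution applied to the column example. In the replacement part Perl also accepts `\1`, but with warnings enabled it suggests writing `$1` instead, so the sketch uses `$1`. The sample string and the variable names are assumptions.

```perl
use strict;
use warnings;

my $line = "Column 1 Column 2 Column 3";

# $1 in the replacement is the text captured by the first pair of parentheses.
(my $once = $line) =~ s/([0-9]+)/<$1>/;
print "$once\n";     # Column <1> Column 2 Column 3

# Adding /g wraps every number, not just the first.
(my $every = $line) =~ s/([0-9]+)/<$1>/g;
print "$every\n";    # Column <1> Column <2> Column <3>
```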
