Introduction to PERL

verity

Presentation Transcript


  1. Introduction to PERL ICS 215

  2. About Perl • P(ractical) E(xtraction and) R(eport) L(anguage) • P(athologically) E(clectic) R(ubbish) L(ister) • many simple one-line scripts with useful command-line arguments • perl • single-step interpreter (or compiler)

  3. Perl History • 1987: original version by Larry Wall • goal: a Unix scripting language • 1994: standard version 5.x • now two separate languages • Perl 5 • 5/2013: current latest stable revision 5.18 • Perl 6 • 2000: redesign of Perl

  4. Perl 6 • goal • remove "historical warts" • Motto • easy things should stay easy, hard things should get easier, and impossible things should get hard • cleanup of APIs (and the internal design) • for our purposes • no substantial differences

  5. Perl's Influencers • syntax • C • lists • LISP • hashes • AWK • regular expressions • sed

  6. New in Perl 5 • data structures • functional programming • first-class functions • i.e. "closures" as values • see later • object-oriented programming
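A minimal sketch of what "closures as values" can look like. The names make_counter, $next and $other are hypothetical and used only for illustration; they do not come from the slides.

```perl
use strict;
use warnings;

# make_counter is a hypothetical helper: it returns an anonymous
# subroutine that keeps access to its own private $count.
sub make_counter {
    my $count = 0;
    return sub { return ++$count; };
}

my $next   = make_counter();   # a subroutine stored in a scalar
my $first  = $next->();        # 1
my $second = $next->();        # 2

my $other = make_counter();    # independent counter with its own $count
my $fresh = $other->();        # 1

print "$first $second $fresh\n";   # prints "1 2 1"
```

Each call to make_counter creates a new $count, so the two counters do not interfere with each other.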

  7. Perl as Programming Language • C-like syntax • common features in procedural languages • variables • expressions • assignments • control structures: branching, loops • brace-delimited blocks • subroutines • modules

  8. Data Types • data typing is automatic • automatic type conversions, e.g., • from number to string • from string to number • illegal type conversions are fatal errors • automatic memory management • storage for data types is allocated and freed automatically • using reference counting • circular data structures are never freed automatically (the cycle must be broken first)
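The automatic conversions listed above can be sketched as follows. The variables $count, $sum, $n and $label are hypothetical sample names, chosen only to show a string used as a number and a number used as a string.

```perl
use strict;
use warnings;

# Hypothetical sample values, chosen only to illustrate the conversions.
my $count = "42";            # a string that looks like a number
my $sum   = $count + 8;      # string -> number, because + is numeric

my $n     = 3;               # a number
my $label = $n . " apples";  # number -> string, because . is concatenation

print "$sum\n";     # prints "50"
print "$label\n";   # prints "3 apples"
```

The operator decides the conversion: + treats both operands as numbers, while . treats both as strings.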

  9. Readability of Perl • can be VERY unreadable ($_=q|cevag"znkvzhz:";@_=(2..<>);juvyr($a=fuvsg@_){cevag"$a";@_=terc{$_%$a}@_;}|)=~tr|a-z|n-za-m|;eval"$_"; • don't do that • Perl's motto: • There's more than one way to do it.

  10. Readability of Perl cont. • typically much better:
      # prompt and read a number from user
      print "maximum range; 2..: ";
      $maximum = <STDIN>;  # what the user entered
      # make array of numbers up to maximum
      @numbers = (2..$maximum);
      # keep finding primes
      while ($prime = shift @numbers) {
          # print the next prime
          print "$prime\n";
          # remove multiples of $prime
          @numbers = grep {$_ % $prime} @numbers;
      }

  11. Pragma • compiler directive • changes how program is compiled • mandatory • use strict and use warnings • use strict • mandatory declaration of variables • prevents common errors • disallows unsafe constructs • more formal, less casual • use warnings • helpful diagnostics

  12. Convention • whitespace is ignored • use spaces for readability • statements end with semicolons ; • sequence of statements can be combined into blocks with curly braces {} • good practices • single statement per line • indent statements in a block • use empty lines to separate subroutines • comment logical "blocks", rather than separate them with empty lines (hard to scroll)

  13. Comments • inline comments start with # • no multiline comments (out of the box) as in other languages • use comments for documentation, be • terse, • but expressive

  14. Sigil Convention • variable names start with a sigil • sigil identifies the data type • scalar: $ • $my_course = 'ICS 215'; • array: @ • @courses = ('ICS 665', $my_course); • hash: % • %age = ("Jan" => 61, "Parvina" => 9, "Kian" => 4); • but: $age{"Jan"}; # because it's a scalar! • subroutines: & (optional) • &quicksort(@numbers_to_sort); • typically, use underscore _ rather than camel-case naming

  15. Strings • in single quotes • 'ICS 215' • allowed special characters • \' single quote • \\ backslash • or in double quotes • "ICS 215" • other special characters • \n new line • \" double quote • \' single quote • \x77 hex value • etc.

  16. Interpolation in Strings • a variable can occur in a double-quoted string $hi = "Hi. My name is $first_name $last_name"; • the current value of the variable is inserted into the string where it occurs • Note: • no interpolation in single-quoted strings
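A short sketch of the difference between the two quoting styles. The values assigned to $first_name and $last_name are hypothetical sample data.

```perl
use strict;
use warnings;

# Hypothetical sample values.
my $first_name = 'Ada';
my $last_name  = 'Lovelace';

# Double quotes: variables are replaced by their current values.
my $double = "Hi. My name is $first_name $last_name";

# Single quotes: the text is kept literally, dollar signs included.
my $single = 'Hi. My name is $first_name $last_name';

print "$double\n";   # prints "Hi. My name is Ada Lovelace"
print "$single\n";   # prints "Hi. My name is $first_name $last_name"
```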

  17. Simple Data Structures • scalar • single value • holds the value that is assigned until reassigned • array • multiple ordered values • access by index • hash • multiple unordered values • access by key • HashMap in Java • my qualifier limits a variable to the local scope

  18. Scalar Constants • number • integer 12 1E+100 • real 3.14159 2.71828182845905 • octal, hexadecimal 017, 0xF • string • a single character "a" • many characters "A quick brown..." • unicode "\x{263A}", UTF-8 format • reference

  19. Scalar Variables • can't begin with a digit (after the sigil $) • case-sensitive • reserved special names $_ $1 $/ • any scalar value or variable can be assigned to another scalar variable • a variable can hold a number at one point and a string at another time • Perl is a loosely, dynamically typed language • unlike Java, a strongly typed language • declared but not assigned variable is undefined my $favorite_food; # undefined

  20. Arrays • stores any number of ordered scalars • numbers, strings, references • any combination thereof • indexed by number • starting at 0 • accessed by index in square brackets [] • each item is a scalar • e.g. $array[0] • last index is $#array • last item is $array[$#array] • negative numbers count from end of list

  21. Arrays cont. • are made longer or shorter dynamically • push adds an item after the last one • pop removes the last item • recall stack • unshift adds an item before the first one • shift removes the first item • delete clears the item at a given index without shifting the others (splice removes it)
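The array operations above can be sketched as follows. The array @stack and its contents are hypothetical; splice does not appear on the slide and is included here as an assumption about what "removing by index" is meant to achieve.

```perl
use strict;
use warnings;

my @stack = (1, 2, 3);

push @stack, 4;               # add after the last item:   (1, 2, 3, 4)
my $popped = pop @stack;      # remove the last item:      (1, 2, 3)
unshift @stack, 0;            # add before the first item: (0, 1, 2, 3)
my $shifted = shift @stack;   # remove the first item:     (1, 2, 3)

# splice (not on the slide) removes 1 item at index 1 and closes the gap.
splice(@stack, 1, 1);         # (1, 3)

my $last_index = $#stack;     # 1
my $last_item  = $stack[-1];  # 3, negative indices count from the end

print "@stack\n";             # prints "1 3"
```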

  22. Array Slices • sub-arrays or "slices" • made with @array[@indices] • @array[0] is a slice • contains a single scalar • contrast with $array[0] - a scalar
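A small sketch of slicing versus single-item access. The array @letters and the index list @indices are hypothetical sample data.

```perl
use strict;
use warnings;

my @letters = ('a' .. 'e');
my @indices = (0, 2, 4);

my @picked = @letters[@indices];          # slice: ('a', 'c', 'e')
my @tail   = @letters[1 .. $#letters];    # slice: ('b', 'c', 'd', 'e')
my $first  = $letters[0];                 # single scalar: 'a'

print "@picked\n";   # prints "a c e"
print "@tail\n";     # prints "b c d e"
print "$first\n";    # prints "a"
```

The sigil shows what comes back: @ for a list of items, $ for exactly one.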

  23. Array Construction • enumerating items @array = (215, "215", '215', 3.14); • range @numbers = (215..665); @letters = ("a" .. "z"); • combination thereof @array = (215..665, "215");

  24. foreach and Arrays • foreach loop iterates over entire array
      my @fruits = qw(papaya pineapple guava);
      foreach my $fruit (@fruits) {
          print "Let's have $fruit smoothie!\n";
      }
      • output
      Let's have papaya smoothie!
      Let's have pineapple smoothie!
      Let's have guava smoothie!

  25. Hashes • a set of key/value pairs – a "map" • values can be any scalars • numbers, strings, references • accessed by the key • item values assigned scalars • assigning a new value to a key overwrites the old value • items (key/value pairs) can be added or removed • can be sliced • simple to iterate

  26. Hash Construction • use => (rocket)
      %capitals = (
          us => 'Washington',
          ch => 'Bern',
          cz => 'Prague'
      );
      • possible, but not recommended
      %capitals = ('us', 'Washington', 'ch', 'Bern');
      • using variables
      $usa = 'us';
      $swiss = 'ch';
      @cities = ('Washington', 'Bern');
      %capitals = ($usa => $cities[0], $swiss => $cities[1]);

  27. Hash Construction cont. • using hashes
      %new_capitals = (ca => 'Ottawa', hi => 'Honolulu');
      %all_capitals = (%capitals, %new_capitals);
      print "The capital of Hawaii is $all_capitals{hi}";
      • output
      The capital of Hawaii is Honolulu

  28. Accessing Hashes • items retrieved as scalars $hi_capital = $all_capitals{hi}; • items assigned to scalars $all_capitals{cz} = 'Prague'; • assigning value to a new key adds the item pair to the hash

  29. Hashes as a Set • dynamically assumes the size needed • hashes can grow or shrink • can be empty (but defined) %capitals = (); • items can be deleted by deleting the key delete $capitals{hi}; • we can check whether an item with a key exists print "exists" if exists $capitals{hi};

  30. Hash Slices • a subset of a hash's values – "slice" • a slice is a list • constructed with @hash{@some_keys} • e.g.: @foreign = @all_capitals{'ch', 'cz', 'ca'}; • watch out: • @hash{$key} is a list with one item – the key's value • $hash{$key} is a scalar
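A sketch of reading and writing hash slices. The hash contents are sample data modeled on the earlier slides; the keys 'fr' and 'de' and their values are hypothetical additions for illustration.

```perl
use strict;
use warnings;

my %all_capitals = (
    us => 'Washington',
    ch => 'Bern',
    cz => 'Prague',
    ca => 'Ottawa',
);

# Reading a slice: the values come back in the order of the keys given.
my @foreign = @all_capitals{'ch', 'cz', 'ca'};   # ('Bern', 'Prague', 'Ottawa')

# Assigning to a slice sets several keys at once (hypothetical extra entries).
@all_capitals{'fr', 'de'} = ('Paris', 'Berlin');

my $count = keys %all_capitals;   # 6, keys in scalar context gives the count

print "@foreign\n";   # prints "Bern Prague Ottawa"
```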

  31. List of Hash Components • keys returns a list of hash keys • keys returned in random order • values returns a list of hash values • values returned in random order • each returns the next key/value pair of the hash • items returned in random order • used in while loop

  32. while and Hashes • while-each loops over entire hash
      while (my ($country, $capital) = each %capitals) {
          print "The capital of $country is $capital\n";
      }
      • output (order may vary)
      The capital of ch is Bern
      The capital of cz is Prague
      The capital of us is Washington
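Since hash items come back in no guaranteed order, one common way to get repeatable output is to iterate over sorted keys. This is a sketch of that alternative, not the approach shown on the slide; the array @lines is a hypothetical helper for collecting the output.

```perl
use strict;
use warnings;

my %capitals = (us => 'Washington', ch => 'Bern', cz => 'Prague');

# Collect one line per country, visiting the keys in alphabetical order.
my @lines;
foreach my $country (sort keys %capitals) {
    push @lines, "The capital of $country is $capitals{$country}";
}

print "$_\n" foreach @lines;
# prints:
#   The capital of ch is Bern
#   The capital of cz is Prague
#   The capital of us is Washington
```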

  33. Control Statements • Branches • if • unless • Loops • while • until • do • for • foreach • modifiers • all of the above, • but as suffix of a simple statement

  34. if Statement • if (condition) {statements } elsif (condition) {statements } else {statements } • else and elsif are optional • semantics as in Java • unless is the opposite of if

  35. Loop Statements • while (condition) {} • loops while condition is true • possibly not at all • until (condition) {} • loops until condition becomes true • the opposite of while • do {} while (condition) • loops at least once, while condition is true • do {} until (condition) • loops at least once, until condition becomes true • for (initialization; condition; increment) {} • as in Java • foreach • loops over a list or array

  36. Modifiers • if, unless, while, until, foreach • following a statement • to be used only with a single statement • e.g. • attend_215() unless $is_holliday; • print $_ . "\n" foreach @weekdays; • advantages • make programs more readable • emphasize the statement, rather than the control • parentheses () may be unnecessary
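A runnable sketch of statement modifiers. The names @weekdays, $is_holiday, @log and $n are hypothetical, and the actions are reduced to collecting strings so the effect is easy to see.

```perl
use strict;
use warnings;

my @weekdays   = qw(Mon Tue Wed);
my $is_holiday = 0;      # hypothetical flag
my @log;

push @log, "attend class" unless $is_holiday;   # runs, because the flag is false
push @log, "day: $_" foreach @weekdays;          # once per item, $_ is the item

my $n = 0;
$n++ while $n < 3;       # repeats until the condition fails; $n ends at 3

# The same kind of test written as a regular block statement, for comparison.
if ($n == 3) {
    push @log, "done";
}

print "$_\n" foreach @log;
```

The modifier form reads like a sentence and needs no parentheses or braces, but it applies to one statement only.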

  37. Operators • Numeric • +, -, *, /, % • assignment: =, *=, -=, etc., ++, -- • bitwise: <<, >> • String • concatenation: . • repetition: x • assignment: .=, x= • Boolean • <, >, <=, >=, ==, != • lt, gt, le, ge, eq, ne (on strings) • &&, ||, ! (high precedence); and, or, not (low precedence) • ternary conditional ? : • my $max = $x > $y ? $x : $y;

  38. Array Functions • push, pop, unshift, shift
      • split
      my @pets = split(", ", "cat, dog, bird");  # ("cat","dog","bird")
      • join
      my $pets = join(", ", @pets);  # "cat, dog, bird"
      • sort • sorts alphabetically by default
      • reverse • as list: reverses it
      @pets = reverse sort qw(cat dog bird);  # ("dog","cat","bird")
      • as scalar: concatenates into a string then reverses it
      my $semordnilap = reverse "deliver no evil";  # "live on reviled"
      • grep

  39. grep • finds matching array items • typically based on a regular expression
      my @courses = qw(ics111 art211 ics215 com415);
      my @ics = grep(/ics/, @courses);  # ("ics111","ics215")
      • or based on a condition
      my @numbers = (22, 13, 51, 70, 111, 33, 22);
      my @odds = grep {$_ % 2 == 1} @numbers;  # (13, 51, 111, 33)
      • Note: • grep assigns consecutive items of @numbers to $_ • $_ is the current item in @numbers

  40. References • refer to other data • syntax: \
      my $courses_ref = \@courses;
      • references are scalars • dereferencing yields the data • syntax: ->
      my $course = $courses_ref->[1];
      • allow you to • build hierarchical data structures • pass arguments by reference • create anonymous data

  41. Hierarchical/Anonymous Data • via references
      my $courses = {ics => 215, lis => 699};
      print "I need ICS $courses->{ics}";
      • output
      I need ICS 215
      • between adjacent subscripts of arrays and hashes, -> is optional
      my @ics = (215, 311, 465);
      my @lis = (699, 691);
      my @courses = (\@ics, \@lis);
      print "I added ICS $courses[0]->[0]" . " and LIS $courses[1][1]\n";
      • output
      I added ICS 215 and LIS 691

  42. Subroutines • declared with sub • return a value • return command • otherwise, the last statement's value • scalar or a list • wantarray tells what context is wanted • sigil & is optional • recursive calls are ok • subroutines are a data type, too
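The points above can be sketched in one short program. The subroutines min_max and square are hypothetical examples, not from the slides: the first uses wantarray to answer differently in list and scalar context, the second relies on the value of its last statement, and \&square shows a subroutine handled as data.

```perl
use strict;
use warnings;

# Hypothetical example: returns (min, max) in list context,
# or a short description string in scalar context.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return wantarray
        ? ($sorted[0], $sorted[-1])
        : "$sorted[0] to $sorted[-1]";
}

# Hypothetical example: no explicit return, so the value of the
# last statement is returned.
sub square {
    my ($x) = @_;
    $x * $x;
}

my ($min, $max) = min_max(3, 1, 2);   # list context:   (1, 3)
my $range       = min_max(3, 1, 2);   # scalar context: "1 to 3"

my $fn   = \&square;                  # a reference to a subroutine
my $nine = $fn->(3);                  # call through the reference: 9

print "$min $max\n";   # prints "1 3"
print "$range\n";      # prints "1 to 3"
print "$nine\n";       # prints "9"
```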

  43. Arguments • arguments in () parentheses • often not needed • but good practice • arguments are available in the @_ array • readability alert: assign @_ to local variables • also accessible via $_[i] • where i is the index in @_ • passing by value • passing by reference
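A sketch contrasting the two ways of passing an array. The subroutines add_by_value and add_by_reference and the course names are hypothetical; "by value" here means copying @_ into local variables, as the slide recommends.

```perl
use strict;
use warnings;

# Hypothetical example: works on a copy of the arguments.
sub add_by_value {
    my (@items) = @_;           # copies the arguments
    push @items, 'ICS 311';     # changes only the local copy
    return scalar @items;
}

# Hypothetical example: works on the caller's array via a reference.
sub add_by_reference {
    my ($items_ref) = @_;           # a reference to the caller's array
    push @$items_ref, 'ICS 311';    # changes the caller's array
    return scalar @$items_ref;
}

my @courses = ('ICS 215');

my $copy_size        = add_by_value(@courses);       # 2
my $size_after_value = scalar @courses;              # still 1

my $ref_size             = add_by_reference(\@courses);   # 2
my $size_after_reference = scalar @courses;               # now 2

print "$size_after_value $size_after_reference\n";   # prints "1 2"
```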

  44. Subroutine Example
      sub fibonacci {
          my ($n) = @_;
          die "Argument must be > 0" if $n < 1;
          return 1 if $n <= 2;
          fibonacci($n - 1) + fibonacci($n - 2);
      }
      my @series = ();
      foreach my $n (1..5) {
          push(@series, fibonacci($n));
      }
      print "fibonacci numbers\n@series\n";

  45. Regular Expressions • abbreviated as "regex" (also regexp) • textual pattern matching • based on concepts from automata theory (regular languages) – they are a language! • matching • letters, numbers, white-space, other characters • can exclude specific characters from a match • match boundaries between words, line begin, line end, etc. • match subpatterns • match repetitions

  46. RegEx Gotchas • typically "line oriented" • difficult to match across line-end characters • some characters have special meaning • if you mean the actual character you must "escape" it • use \ • numerous regex operators in Perl

  47. Regex Samples • ask questions about a string
      my $string = "Did the fox jump over the dog?";
      • whether it
      • contains the substring "fox"?
      print "$string\n" if $string =~ /fox/;  # yes
      • doesn't contain the letter "q"?
      print "$string\n" if $string !~ /q/;  # yes
      • begins with the letters "z" or "1"?
      print "$string\n" if $string =~ /^[z1]/;  # no
      • ends with a question mark?
      print "$string\n" if $string =~ /\?$/;  # yes
      • contains only letters or digits?
      print "$string\n" if $string =~ /^[a-zA-Z0-9]*$/;  # no
      • contains only digits?
      print "$string\n" if $string =~ /^\d*$/;  # no

  48. Regex Operators & Quoting • use regex comparison operators to test for a match
      string =~ regex  # true if matches
      string !~ regex  # true if doesn't match
      • regexes may be quoted in several ways • default slashes /regex/ • may use other quotes with the match operator m
      print "$string\n" if $string =~ m(bird);
      print "$string\n" if $string !~ m|cat|;
      • if the regex contains a slash /, use another quote with m

  49. Regex Language • sets of characters to match are enclosed in []
      print "$string\n" if $string =~ /[bf]ox/;  # box or fox
      • exclude characters in a set with ^
      print "$string\n" if $string =~ m/fo[^g]/;  # fox matches
      • predefined character sets (there are others)
      \s  white-space
      \S  anything but white-space
      \d  digits
      \D  anything but digits
      \w  word characters: letters, digits, underscore
      \W  anything but word characters
      .   anything except "end-of-line"
      print "$string\n" if $string =~ m/\s\w.x\s/;  # matches " fox "
      • if you need to match . (dot), escape it: \.

  50. Regex Boundaries • start of the string/line ^ • end of the string/line $ • word boundary \b
      my $string = "Did the fox jump over the dog?";
      print "$string\n" if $string =~ /^[^D]/;    # no
      print "$string\n" if $string =~ /\?$/;      # yes
      print "$string\n" if $string =~ /\bfox\b/;  # yes
      • if you need to match ^ or $, escape it: \^, \$
