Introduction to PERL


ICS 215

About Perl
  • P(ractical) E(xtraction and) R(eporting) L(anguage)
  • P(athologically) E(clectic) R(ubbish) L(ister)
  • many simple one-line scripts with useful command-line arguments
  • perl
    • single step interpreter (or compiler)
Perl History
  • 1987: original version by Larry Wall
    • goal: a Unix scripting language
  • 1994: standard version 5.x
  • now two separate languages
  • Perl 5
    • 5/2013: current latest stable revision 5.18
  • Perl 6
    • 2000: redesign of Perl
Perl 6
  • goal
    • remove "historical warts"
  • Motto
    • easy things should stay easy, hard things should get easier, and impossible things should get hard
  • cleanup of APIs (and the internal design)
  • for our purposes
    • no substantial differences
Perl's Influencers
  • syntax
    • C
  • lists
    • LISP
  • hashes
    • AWK
  • regular expressions
    • sed
New in Perl 5
  • data structures
  • functional programming
    • first-class functions
      • i.e. "closures" as values
      • see later
  • object-oriented programming
Perl as Programming Language
  • C-like syntax
  • common features in procedural languages
    • variables
    • expressions
    • assignments
    • control structures: branching, loops
    • brace-delimited blocks
    • subroutines
    • modules
Data Types
  • data typing is automatic
  • automatic type conversions, e.g.,
    • from number to string
    • from string to number
    • a string that does not look like a number converts to 0 (a warning under use warnings, not a fatal error)
  • automatic memory management
  • storage for data types is allocated and freed automatically
    • using reference counting
    • circular data structures are not freed automatically (their reference counts never reach zero)
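
The automatic conversions listed above can be sketched as follows. This is a minimal illustration, not part of the original slides; the variable names are made up, and the warning is silenced locally only to keep the demonstration quiet.

```perl
use strict;
use warnings;

# Numeric context: the string "10" is converted to the number 10.
my $sum = "10" + 5;             # 15

# String context: the number 215 is converted to the string "215".
my $label = "ICS " . 215;       # "ICS 215"

# A string that does not look like a number is treated as 0.
# Under "use warnings" this produces a warning, so it is silenced here.
my $result;
{
    no warnings 'numeric';
    $result = "abc" + 1;        # 0 + 1
}

print "$sum | $label | $result\n";   # prints "15 | ICS 215 | 1"
```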
Readability of Perl
  • can be VERY unreadable

($_=q|cevag"znkvzhz:";@_=(2..<>);juvyr($a=fuvsg@_){cevag"$a";@_=terc{$_%$a}@_;}|)=~tr|a-z|n-za-m|;eval"$_";

    • don't do that
  • Perl's motto:
    • There's more than one way to do it.
Readability of Perl cont.
  • typically much better:

# prompt and read a number from user
print "maximum range; 2..: ";
$maximum = <STDIN>; # what the user entered
# make array of numbers up to maximum
@numbers = (2..$maximum);
# keep finding primes
while ($prime = shift @numbers) {
    # print the next prime
    print "$prime\n";
    # remove multiples of $prime
    @numbers = grep {$_ % $prime} @numbers;
}

Pragma
  • compiler directive
    • changes how program is compiled
  • mandatory
    • use strict and use warnings
  • use strict
    • mandatory declaration of variables
    • prevents common errors
    • disallows unsafe constructs
    • more formal, less casual
  • use warnings
    • helpful diagnostics
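
A small sketch of what the two pragmas buy you, assuming a misspelled variable name as the "common error". The names $course and $cuorse are hypothetical, and the string eval is used only so the script can report the compile error instead of stopping.

```perl
use strict;      # variables must be declared, e.g. with my
use warnings;    # report suspicious constructs

my $course = 'ICS 215';
print "$course\n";

# Under "use strict" the misspelled $cuorse is a compile-time error.
# The code is compiled inside a string eval so the failure can be inspected.
my $ok    = eval q{ $cuorse = 'ICS 215'; 1 };
my $error = $@;

print "strict rejected the typo: $error" unless $ok;
```

Without use strict, the assignment would silently create a new global variable.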
Convention
  • whitespace is ignored
    • use spaces for readability
  • statements end with semicolons ;
  • sequence of statements can be combined into blocks with curly braces {}
  • good practices
    • single statement per line
    • indent statements in a block
    • use empty lines to separate subroutines
    • comment logical "blocks", rather than separate them with empty lines (hard to scroll)
Comments
  • inline comments start with #
  • no multiline comments (out of the box) as in other languages
  • use comments for documentation, be
    • terse,
    • but expressive
Sigil Convention
  • variable names start with a sigil
  • sigil identifies the data type
    • scalar: $
      • $my_course = 'ICS 215';
    • array: @
      • @courses = ('ICS 665', $my_course);
    • hash: %
      • %age = ("Jan" => 61, "Parvina" => 9, "Kian" => 4);
      • but: $age{"Jan"}; # $ because a single item is a scalar!
    • subroutines: & (optional)
      • &quicksort(@numbers_to_sort);
  • typically, use underscore _ rather than camel naming
Strings
  • in single quotes
    • 'ICS 215'
    • allowed special characters
      • \' single quote
      • \\ backslash
  • or in double quotes
    • "ICS 215"
    • other special characters
      • \n new line
      • \" double quote
      • \' single quote
      • \x77 hex value
      • etc.
Interpolation in Strings
  • a variable can occur in a double quoted string

$hi ="Hi. My name is $first_name $last_name";

  • the current value of the variable is inserted into the string where it occurs
  • Note:
    • variables are not interpolated in single-quoted strings
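
The difference between the two quoting styles, as a short sketch. The sample name is an assumption chosen for illustration.

```perl
use strict;
use warnings;

my $first_name = 'Larry';   # sample values, not from the slides
my $last_name  = 'Wall';

my $double = "Hi. My name is $first_name $last_name";   # variables are interpolated
my $single = 'Hi. My name is $first_name $last_name';   # text is kept literally

print "$double\n";   # Hi. My name is Larry Wall
print "$single\n";   # Hi. My name is $first_name $last_name
```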
Simple Data Structures
  • scalar
    • single value
    • holds the value that is assigned until reassigned
  • array
    • multiple ordered values
    • access by index
  • hash
    • multiple unordered values
    • access by key
    • like HashMap in Java
  • my qualifier limits a variable to the local scope
Scalar Constants
  • number
    • integer

12

1E+100

    • real

3.14159

2.71828182845905

    • octal, hexadecimal

017, 0xF

  • string
    • a single character

"a"

    • many characters

"A quick brown..."

    • unicode

"\x{263A}", UTF-8 format

  • reference
Scalar Variables
  • can't begin with a digit (after the sigil $)
  • case-sensitive
  • reserved special names

$_

$1

$/

  • any scalar value or variable can be assigned to another scalar variable
    • a variable can hold a number at one point and a string at another time
    • Perl is a loosely, dynamically typed language
    • unlike Java, a strongly typed language
  • a declared but unassigned variable is undefined

my $favorite_food; # undefined

Arrays
  • stores any number of ordered scalars
    • numbers, strings, references
    • any combination thereof
  • indexed by number
    • starting at 0
  • accessed by index in square brackets []
  • each item is a scalar
    • e.g. $array[0]
  • last index is $#array
  • last item is $array[$#array]
  • negative numbers count from end of list
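
The indexing rules above, sketched on a small array. The course list is sample data chosen for illustration.

```perl
use strict;
use warnings;

my @courses = ('ICS 111', 'ICS 211', 'ICS 215');   # sample data

my $first      = $courses[0];           # 'ICS 111'
my $last_index = $#courses;             # 2
my $last       = $courses[$#courses];   # 'ICS 215'
my $also_last  = $courses[-1];          # a negative index counts from the end
my $count      = scalar @courses;       # 3 items

print "$first, $last, $also_last ($count items, last index $last_index)\n";
```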
Arrays cont.
  • are made longer or shorter dynamically
  • push adds an item after the last one
  • pop removes the last item
    • recall stack
  • unshift adds an item before the first one
  • shift removes the first item
  • delete clears the item at a given index (use splice to remove it and close the gap)
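
A sketch of how an array grows and shrinks with these functions. Removal by index is shown with splice, which is an addition here rather than something named on the slide; the array contents are sample data.

```perl
use strict;
use warnings;

my @items = (2, 3);

push @items, 4;                          # (2, 3, 4)
my $top = pop @items;                    # $top is 4, @items is (2, 3)

unshift @items, 1;                       # (1, 2, 3)
my $head = shift @items;                 # $head is 1, @items is (2, 3)

# splice(ARRAY, OFFSET, LENGTH) removes LENGTH items starting at OFFSET
my ($removed) = splice(@items, 0, 1);    # $removed is 2, @items is (3)

print "top=$top head=$head removed=$removed left=@items\n";
```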
Array Slices
  • sub-arrays or "slices"
    • made with @array[@indices]
    • @array[0] is a slice
    • contains a single scalar
    • contrast with $array[0], a scalar
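
A brief sketch of slicing, assuming a sample array of letters. The one-item slice is built from an index array so that it stays warning-free under use warnings.

```perl
use strict;
use warnings;

my @letters = ('a' .. 'e');

my @picked = @letters[0, 2, 4];     # ('a', 'c', 'e')
my @middle = @letters[1 .. 3];      # ('b', 'c', 'd')

my @indices = (0);
my @one     = @letters[@indices];   # a slice: a list holding the single item 'a'
my $scalar  = $letters[0];          # a scalar: 'a'

print "@picked | @middle | @one | $scalar\n";   # prints "a c e | b c d | a | a"
```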
Array Construction
  • enumerating items

@array = (215, "215", '215', 3.14);

  • range

@numbers = (215..665);

@letters = ("a" .. "z");

  • combination thereof

@array = (215..665, "215");

foreach and Arrays
  • foreach loop iterates over entire array

my @fruits = qw(papaya pineapple guava);
foreach my $fruit (@fruits) {
    print "Let's have $fruit smoothie!\n";
}

  • output

Let's have papaya smoothie!
Let's have pineapple smoothie!
Let's have guava smoothie!

Hashes
  • a set of key/value pairs – a "map"
  • values can be any scalars
    • numbers, strings, references
  • accessed by the key
  • item values assigned scalars
    • assigning a new value to a key overwrites the old value
  • items (key/value pairs) can be added or removed
  • can be sliced
  • simple to iterate
Hash Construction
  • use => (rocket)

%capitals = (
    us => 'Washington',
    ch => 'Bern',
    cz => 'Prague'
);

  • possible, but not recommended

%capitals = ('us', 'Washington', 'ch', 'Bern');

  • using variables

$usa = 'us';
$swiss = 'ch';
@cities = ('Washington', 'Bern');
%capitals = (
    $usa => $cities[0],
    $swiss => $cities[1]
);

Hash Construction cont.
  • using hashes

%new_capitals = (ca => 'Ottawa', hi => 'Honolulu');
%all_capitals = (%capitals, %new_capitals);
print "The capital of Hawaii is $all_capitals{hi}";

  • output

The capital of Hawaii is Honolulu

Accessing Hashes
  • items retrieved as scalars

$hi_capital = $all_capitals{hi};

  • items assigned to scalars

$all_capitals{cz} = 'Prague';

  • assigning value to a new key adds the item pair to the hash
Hashes as a Set
  • dynamically assumes the size needed
    • hashes can grow or shrink
  • can be empty (but defined)

%capitals = ();

  • items can be deleted by deleting the key

delete $capitals{hi};

  • we can check whether an item with a given key exists

print "exists" if exists $capitals{hi};

Hash Slices
  • a sub-set of a hash's values – "slice"
    • a slice is an array
  • constructed with

@hash{@some_keys}

  • e.g.:

@foreign = @all_capitals{'ch', 'cz', 'ca'};

  • watch out:
    • @hash{$key} is an array with one item: the key's value
    • $hash{$key} is a scalar
List of Hash Components
  • keys returns a list of hash keys
    • keys returned in random order
  • values returns a list of hash values
    • values returned in random order
  • each returns the next key/value pair on each call
    • items returned in random order
    • used in while loop
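
Because the order of keys and values is not predictable, a common approach is to sort the keys before iterating. A minimal sketch, reusing the capitals data from the earlier slides:

```perl
use strict;
use warnings;

my %capitals = (us => 'Washington', ch => 'Bern', cz => 'Prague');

my @countries = sort keys %capitals;     # ('ch', 'cz', 'us')
my @cities    = sort values %capitals;   # ('Bern', 'Prague', 'Washington')

# Iterating over sorted keys gives a stable output order.
foreach my $country (@countries) {
    print "The capital of $country is $capitals{$country}\n";
}
```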
while and Hashes
  • while-each loops over entire hash

while (my ($country, $capital) = each %capitals) {
    print "The capital of $country is $capital\n";
}

  • output

The capital of ch is Bern
The capital of cz is Prague
The capital of us is Washington

Control Statements
  • Branches
    • if
    • unless
  • Loops
    • while
    • until
    • do
    • for
    • foreach
  • modifiers
    • all of the above,
    • but as suffix of a simple statement
if Statement
  • if (condition) { statements } elsif (condition) { statements } else { statements }
  • else and elsif are optional
  • semantics as in Java
  • unless is the opposite of if
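
A small sketch of the branching forms. The subroutine letter_grade, its thresholds, and the variables are hypothetical, chosen only to exercise each branch.

```perl
use strict;
use warnings;

# letter_grade is a made-up helper for this illustration.
sub letter_grade {
    my ($score) = @_;
    if ($score >= 90) {
        return 'A';
    } elsif ($score >= 80) {
        return 'B';
    } else {
        return 'C';
    }
}

my $is_holiday = 0;
my $plan = 'stay home';

# unless runs its block only when the condition is false
unless ($is_holiday) {
    $plan = 'attend ICS 215';
}

print letter_grade(95), letter_grade(85), letter_grade(40), "\n";   # prints "ABC"
print "$plan\n";                                                    # prints "attend ICS 215"
```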
Loop Statements
  • while (condition) {}
    • loops while condition is true
    • possibly not at all
  • until (condition) {}
    • loops until condition becomes true
    • the opposite of while
  • do {} while (condition)
    • loops at least once while condition is true
  • do {} until (condition)
    • loops at least once until condition becomes true
  • for (initialization; condition; increment) {}
    • as in Java
  • foreach
    • loops over a list or array
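
Each loop form in one short sketch. The counters and the list of days are sample values made up for illustration.

```perl
use strict;
use warnings;

my $count = 0;
while ($count < 3)  { $count++; }       # runs 3 times, $count is 3
until ($count == 0) { $count--; }       # runs 3 times, $count is 0

# The body of do {} while runs once before the condition is tested.
do { $count++ } while ($count < 0);     # $count is 1

my $sum = 0;
for (my $i = 1; $i <= 3; $i++) {        # C-style loop
    $sum += $i;                         # 1 + 2 + 3
}

my @seen;
foreach my $day (qw(Mon Tue Wed)) {     # iterates over a list
    push @seen, $day;
}

print "count=$count sum=$sum days=@seen\n";   # prints "count=1 sum=6 days=Mon Tue Wed"
```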
Modifiers
  • if, unless, while, until, foreach
    • following a statement
    • to be used only with single statement
  • e.g.
    • attend_215() unless $is_holiday
    • print $_ . "\n" foreach @weekdays
  • advantages
    • make programs more readable
    • emphasize the statement, rather than the control
    • parentheses () may be unnecessary
Operators
  • Numeric
    • +, -, *, /, %
    • assignment: =, *=, -=, etc., ++, --
    • bitwise: <<, >>
  • String
    • concatenation: .
    • repetition: x
    • assignment: .=, x=
  • Boolean
    • <, >, <=, >=, ==, !=
    • lt, gt, le, ge, eq, ne (on strings)
    • &&, ||, ! (high precedence); and, or, not (low precedence)
    • ternary conditional ? :
      • my $max = $x > $y ? $x : $y;
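
A sketch of the string operators and of why numeric and string comparisons are kept apart. All values are samples made up for illustration.

```perl
use strict;
use warnings;

my $rule = '-' x 5;                  # repetition: '-----'
my $name = 'ICS' . ' ' . '215';      # concatenation: 'ICS 215'
$name .= '!';                        # assignment form: 'ICS 215!'

# == compares as numbers, eq compares as strings
my $numeric = ('1.0' == 1)   ? 'equal' : 'different';   # 'equal'
my $string  = ('1.0' eq '1') ? 'equal' : 'different';   # 'different'

my ($x, $y) = (3, 7);
my $max = $x > $y ? $x : $y;         # 7

print "$rule $name $numeric $string $max\n";
```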
Array Functions
  • push, pop, unshift, shift
  • split

my @pets = split(", ", "cat, dog, bird"); # ("cat","dog","bird")

  • join

my $pets = join(", ", @pets); # "cat, dog, bird"

  • sort
    • sorts alphabetically by default
  • reverse
    • as list: reverses it

my @pets = reverse sort qw(cat dog bird); # ("dog","cat","bird")

    • as scalar: concatenates into a string then reverses it

my $semordnilap = reverse "deliver no evil"; # "live on reviled"

  • grep
grep
  • Finds matching array items
    • typically based on a regular expression

my @courses = qw(ics111 art211 ics215 com415);
my @ics = grep(/ics/, @courses); # ("ics111","ics215")

    • or based on a condition

my @numbers = (22, 13, 51, 70, 111, 33, 22);
my @odds = grep {$_ % 2 == 1} @numbers; # (13, 51, 111, 33)

  • Note:
    • grep assigns consecutive items of @numbers to $_
    • $_ is the current item in @numbers
References
  • refer to other data
    • syntax: \

my $courses_ref = \@courses;

    • references are scalars
  • dereferencing yields the data
    • syntax: ->

my $course = $courses_ref->[1];

  • allow you to
    • build hierarchical data structures
    • pass arguments by reference
    • create anonymous data
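
The three uses listed above, in a minimal sketch. The course codes are sample data, and $catalog is a hypothetical name for the anonymous structure.

```perl
use strict;
use warnings;

my @courses     = qw(ics111 art211 ics215);
my $courses_ref = \@courses;             # reference to a named array

my $second = $courses_ref->[1];          # 'art211'
my @copy   = @{$courses_ref};            # dereference the whole array (a copy)

# Changes made through the reference affect the original array.
push @{$courses_ref}, 'com415';

# Anonymous, hierarchical data: a hash of array references
my $catalog = { ics => [215, 311], lis => [699] };
my $ics_2nd = $catalog->{ics}[1];        # 311

print scalar(@courses), " courses, copy has ", scalar(@copy), ", ICS $ics_2nd\n";
```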
Hierarchical/Anonymous Data
  • via references

my $courses = {ics => 215, lis => 699};
print "I need ICS $courses->{ics}";

  • output

I need ICS 215

  • between subscripts of arrays and hashes, -> is optional

my @ics = (215, 311, 465);
my @lis = (699, 691);
my @courses = (\@ics, \@lis);
print "I added ICS $courses[0]->[0]" . " and LIS $courses[1][1]\n";

  • output

I added ICS 215 and LIS 691

Subroutines
  • declared with sub
  • return a value
    • return command
    • otherwise, the last statement's value
    • scalar or an array
      • wantarray tells what context is wanted
  • sigil & is optional
  • recursive calls are ok
  • subroutines are a data type, too
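
A short sketch of context-sensitive return values and of a subroutine stored in a scalar. The names context and $square are hypothetical.

```perl
use strict;
use warnings;

# wantarray is true when the caller expects a list.
sub context {
    return wantarray ? 'list' : 'scalar';
}

my @in_list   = context();    # ('list')
my $in_scalar = context();    # 'scalar'

# An anonymous subroutine is a value; without return,
# the value of the last statement is returned.
my $square = sub {
    my ($n) = @_;
    $n * $n;
};
my $nine = $square->(3);      # 9

print "$in_list[0] $in_scalar $nine\n";   # prints "list scalar 9"
```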
Arguments
  • arguments in parentheses ()
    • often not needed
    • but good practice
  • arguments are available in the @_ array
    • readability alert: assign @_ to local variables
  • also accessible via $_[i]
    • where i is the index in @_
  • passing by value
  • passing by reference
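
Passing by value and by reference can be sketched as follows. Both subroutines are made-up examples; copying @_ into my variables is what gives the first one by-value behavior.

```perl
use strict;
use warnings;

# Works on a copy of the argument: the caller's variable is unchanged.
sub add_one {
    my ($n) = @_;
    $n++;
    return $n;
}

# Receives a reference: changes are visible to the caller.
sub append_item {
    my ($array_ref, $item) = @_;
    push @{$array_ref}, $item;
    return;
}

my $number = 1;
my $bigger = add_one($number);     # $bigger is 2, $number is still 1

my @list = (1, 2);
append_item(\@list, 3);            # @list is now (1, 2, 3)

print "$number $bigger @list\n";   # prints "1 2 1 2 3"
```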
Subroutine Example

sub fibonacci {
    my ($n) = @_;
    die "Argument must be > 0" if $n < 1;
    return 1 if $n <= 2;
    fibonacci($n - 1) + fibonacci($n - 2);
}

my @series = ();
foreach my $n (1..5) {
    push(@series, fibonacci($n));
}
print "fibonacci numbers\n@series\n";

Regular Expressions
  • abbreviated as "regex" (also regexp)
  • textual pattern matching
  • based on concepts from automata theory (regular languages): they are a language of their own!
  • matching
    • letters, numbers, white-space, other characters
    • can exclude specific characters from a match
    • match boundaries between words, line begin, line end, etc.
    • match subpatterns
    • match repetitions
RegEx Gotchas
  • gotchas
    • typically "line oriented"
      • difficult to match across line-end characters
    • some characters have special meaning
    • if you mean the actual character you must "escape" it
      • use \
  • numerous regex operators in Perl
Regex Samples
  • ask questions about a string

my $string = "Did the fox jump over the dog?";

  • whether it
    • contains the substring "fox"?

print "$string\n" if $string =~ /fox/; # yes

    • doesn't contain the letter "q"?

print "$string\n" if $string !~ /q/; # yes

    • begins with the letters "z" or "1"?

print "$string\n" if $string =~ /^[z1]/; # no

    • ends with a question mark?

print "$string\n" if $string =~ /\?$/; # yes

    • contains only letters or digits?

print "$string\n" if $string =~ /^[a-zA-Z0-9]*$/; # no

    • contains only digits?

print "$string\n" if $string =~ /^\d*$/; # no

Regex Operators & Quoting
  • use regex comparison operators to test for a match

string =~ regex # true if matches
string !~ regex # true if doesn't match

  • regexes may be quoted in several ways
    • default slashes

/regex/

    • may use other quotes with the match operator m

print "$string\n" if $string =~ m(bird);

print "$string\n" if $string !~ m|cat|;

      • if regex contains slash / use another quote with m
Regex Language
  • sets of characters to match are enclosed in []

print "$string\n" if $string =~ /[bf]ox/; # box or fox

  • exclude characters in a set with ^

print "$string\n" if $string =~ m/fo[^g]/; # fox matches

  • predefined character sets (there are others)

\s white-space
\S anything but white-space
\d digits
\D anything but digits
\w word characters: letters, digits, underscore
\W anything but word characters
. anything except "end-of-line"

print "$string\n" if $string =~ m/\s\w.x\s/; # matches " fox "

  • if you need to match . (dot) escape it: \.
Regex Boundaries
  • start of the string/line

^

  • end of the string/line

$

  • word boundary

\b

my $string = "Did the fox jump over the dog?";

print "$string\n" if $string =~ /^[^D]/; # no

print "$string\n" if $string =~ /\?$/; # yes

print "$string\n" if $string =~ /\bfox\b/; # yes

  • if you need to match ^ or $ escape it:

\^, \$

Regex Quantifiers

quantifiers determine how many times a match can/must occur

*

    • 0 or more times

print "$string\n" if $string =~ /q*/; # yes, 0 times

+

    • 1 or more times

print "$string\n" if $string =~ /q+/; # no, there is no q

?

    • 0 or 1 time

print "$string\n" if $string =~ /fog?/; # yes, g doesn't need to be there

  • if you need to match *, + or ? escape it:

\*, \+, \?

Regex Quantifiers cont.

{3}

    • exactly 3 times

print "$string\n" if $string =~ /\b\w{2}\b/; # no, all words have more than 2 letters

{2,5}

    • 2 to 5 times

print "$string\n" if $string =~ /\b\w{4,6}\b/; # yes "jump"

  • if you need to match { or } escape it (a comma needs no escaping):

\{, \}

Regex Groups
  • a group defines a subpattern
    • to be matched
    • match(es) can be retrieved
    • may be repeated
  • enclosed in parentheses ()

(group)

  • \1, \2
    • refer to the 1st and 2nd group, etc.
  • if you need to match ( or ) escape it:

\(, \)
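
How the matched groups can be retrieved, as a hedged sketch. It relies on Perl's standard capture variable $1 and on a match in list context returning the captured groups; the variable names are made up.

```perl
use strict;
use warnings;

my $string = "Did the fox jump over the dog?";

# In list context a match returns the text captured by each group.
my ($animal, $verb) = $string =~ /the (\w+) (\w+)/;    # 'fox', 'jump'

# After a successful match, $1 holds the first group.
my $last_word;
if ($string =~ /(\w+)\?$/) {
    $last_word = $1;                                   # 'dog'
}

# \1 inside the pattern refers back to what group 1 matched.
my $repeated = $string =~ /\b(the)\b.*\b\1\b/ ? 'yes' : 'no';   # 'yes'

print "$animal $verb $last_word $repeated\n";          # prints "fox jump dog yes"
```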

Regex Group Examples

my $string = "Did the fox jump over the dog?";
print "$string\n" if $string =~ /(fox){2}/; # no, would need "foxfox"
print "$string\n" if $string =~ /(the\s).*\1/; # yes, 2nd "the"

Regex Operations
  • m/match-pattern/ matches, returns boolean (true, false)
  • s/match-pattern/replacement-string/ substitutes
  • tr/match-pattern/replacement-pattern/ transliterates

my $string = "Did the fox jump over the dog?";
$string =~ s/dog/cat/; # substitutes "cat" for "dog"
$string =~ s(fox)(bird); # substitutes "bird" for "fox"
print "$string\n";
$string =~ tr/tu/12/; # replaces t with 1 and u with 2
print "$string\n";

  • output

Did the bird jump over the cat?

Did 1he bird j2mp over 1he ca1?

Regex Modifiers

g

    • multiple substitutions (global)

i

    • case insensitive

s

    • treat string as one line.
  • Examples

my $breakfast = 'tea Tea coffee TEA & bagels';
$breakfast =~ s/tea //gi;
print "$breakfast\n";
my $text = "line 1\nline 2\nline 3\n";
my ($line_1, $line_2) = ($text =~ /^(.*)\n(.*)\n/);
print "$line_1 $line_2\n";

  • output

coffee & bagels
line 1 line 2