Markov Chain Algorithm in Perl

Markov Chain Algorithm in Perl. Michael Conway CS 265 May 4, 2011. Markov Chain Algorithm. Goal: Mimic proper English composition. 1. Populate prefix hash table with suffix lists. 2. Start at the beginning and jump from prefix to prefix, printing suffixes. Perl Implementation.

Presentation Transcript


  1. Markov Chain Algorithm in Perl. Michael Conway, CS 265, May 4, 2011

  2. Markov Chain Algorithm
     Goal: Mimic proper English composition.
     1. Populate prefix hash table with suffix lists.
     2. Start at the beginning and jump from prefix to prefix, printing suffixes.
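To make step 1 concrete, here is a minimal sketch that builds the prefix table for a tiny input and prints it. The sample sentence, the lexical variable names, and the `show` helper are assumptions made for illustration; they are not part of the slides. With this input the prefix "the cat" occurs twice, so it ends up with two suffixes, "sat" and "mat".

```perl
use strict;
use warnings;

# Hypothetical sample text; any whitespace-separated input would do.
my $text    = "the cat sat on the cat mat";
my $NONWORD = "\n";                 # sentinel marking the start and end of the text

my %statetab;                       # prefix (w1, w2) -> list of suffixes
my ($w1, $w2) = ($NONWORD, $NONWORD);
foreach my $word (split ' ', $text) {
    push @{ $statetab{$w1}{$w2} }, $word;   # record $word as a suffix of (w1, w2)
    ($w1, $w2) = ($w2, $word);              # slide the prefix window forward
}
push @{ $statetab{$w1}{$w2} }, $NONWORD;    # mark the end of the text

# Illustrative helper: make the sentinel visible when printing.
sub show { my ($w) = @_; return $w eq $NONWORD ? '<NONWORD>' : $w }

foreach my $p1 (sort keys %statetab) {
    foreach my $p2 (sort keys %{ $statetab{$p1} }) {
        my @suffixes = map { show($_) } @{ $statetab{$p1}{$p2} };
        print show($p1), ' ', show($p2), ' => ', join(', ', @suffixes), "\n";
    }
}
```

Step 2 then amounts to starting at the all-sentinel prefix and repeatedly picking one of the listed suffixes at random.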

  3. Perl Implementation

         # markov.pl: markov chain algorithm for 2-word prefixes

         $MAXGEN = 10000;
         $NONWORD = "\n";
         $w1 = $w2 = $NONWORD;                    # initial state
         while (<>) {                             # read each line of input
             foreach (split) {
                 push(@{$statetab{$w1}{$w2}}, $_);
                 ($w1, $w2) = ($w2, $_);          # multiple assignment
             }
         }
         push(@{$statetab{$w1}{$w2}}, $NONWORD);  # add tail

         $w1 = $w2 = $NONWORD;
         for ($i = 0; $i < $MAXGEN; $i++) {
             $suf = $statetab{$w1}{$w2};          # array reference
             $r = int(rand @$suf);                # @$suf is number of elems
             exit if (($t = $suf->[$r]) eq $NONWORD);
             print "$t\n";
             ($w1, $w2) = ($w2, $t);              # advance chain
         }

  4. Hash Generation

         $w1 = $w2 = $NONWORD;                    # initial state
         while (<>) {                             # read each line of input
             foreach (split) {
                 push(@{$statetab{$w1}{$w2}}, $_);
                 ($w1, $w2) = ($w2, $_);          # multiple assignment
             }
         }
         push(@{$statetab{$w1}{$w2}}, $NONWORD);  # add tail

     • Iterate over the words of the input (stdin, or files named on the command line) and store each word as a suffix of the two words before it
     • IMPORTANT code segment: @{$statetab{$w1}{$w2}}
       -> %statetab is an implicitly declared hash
       -> $statetab{$w1} is an implicitly created (autovivified) reference to a hash
       -> @{ } gets the array referenced by $statetab{$w1}{$w2}
     • Note: <>, foreach, push(), multiple assignment
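The implicit creation of the intermediate structures can be seen in isolation. The following is a small sketch, with made-up keys `a` and `b`, showing that a single `push` through `@{ ... }` creates both the inner hash and the array on demand.

```perl
use strict;
use warnings;

my %statetab;                              # starts out empty

# Nothing exists yet, so this push autovivifies two levels:
# $statetab{a} becomes a hash reference and $statetab{a}{b} an array reference.
push @{ $statetab{a}{b} }, 'first';
push @{ $statetab{a}{b} }, 'second';       # second push reuses the same array

print ref($statetab{a}), "\n";             # prints HASH
print ref($statetab{a}{b}), "\n";          # prints ARRAY
print scalar @{ $statetab{a}{b} }, "\n";   # prints 2
```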

  5. Output Generation

         $w1 = $w2 = $NONWORD;
         for ($i = 0; $i < $MAXGEN; $i++) {
             $suf = $statetab{$w1}{$w2};          # array reference
             $r = int(rand @$suf);                # @$suf is number of elems
             exit if (($t = $suf->[$r]) eq $NONWORD);
             print "$t\n";
             ($w1, $w2) = ($w2, $t);              # advance chain
         }

     • Same $statetab{$w1}{$w2} construction used for array reference
     • Note: rand, exit line, ->, interpolated string in print, multiple assignment
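The `int(rand @$suf)` idiom relies on `rand` evaluating its argument in scalar context, where an array yields its element count. The sketch below uses a hypothetical three-element suffix list to show the selection step on its own; because "sat" appears twice in the list, it should be drawn roughly twice as often as "mat".

```perl
use strict;
use warnings;

my $suf = [ 'sat', 'mat', 'sat' ];   # hypothetical suffix list; 'sat' was seen twice

# rand takes a scalar argument, so @$suf evaluates to the element count (3);
# int() truncates the random value to a valid index in 0..2.
my %count;
for (1 .. 3000) {
    my $r = int(rand @$suf);
    $count{ $suf->[$r] }++;          # tally which suffix was chosen
}
print "$_: $count{$_}\n" for sort keys %count;
```

Storing repeated suffixes as repeated list entries is what makes frequent continuations more likely, without any explicit probability bookkeeping.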

  6. Relative Performance

  7. Pros and Cons
     • Pros:
       -> Very short source code
       -> Necessary structures (array, hash) are built-in
       -> Decent performance
     • Cons:
       -> Can be confusing, especially to new users
       -> Outperformed by some (like C)
       -> Difficult to extend to different prefix sizes

  8. Extension: Different Prefix Sizes

         # markov_n.pl: markov chain algorithm for n-word prefixes

         $PREFLEN = 5;                            # or whatever
         $MAXGEN = 80;
         $NONWORD = "\n";
         foreach $i (0..$PREFLEN-1) {
             $words[$i] = $NONWORD;               # initial state
         }
         while (<>) {                             # read each line of input
             foreach (split) {
                 push(@{hash_lookup(\@words)}, $_);
                 @words = (@words[1..$#words], $_);
             }
         }
         push(@{hash_lookup(\@words)}, $NONWORD); # add tail

  9. Extension: Different Prefix Sizes (continued)

         @words = ();
         foreach $i (0..$PREFLEN-1) {
             $words[$i] = $NONWORD;
         }
         for ($i = 0; $i < $MAXGEN; $i++) {
             $suf = hash_lookup(\@words);         # array reference
             $r = int(rand @$suf);                # @$suf is number of elems
             exit if (($t = $suf->[$r]) eq $NONWORD);
             print "$t\n";
             @words = (@words[1..$#words], $t);   # advance chain
         }

         # walk one hash level per prefix word; return the suffix array reference
         sub hash_lookup {
             my $ref = \%statetab;
             my @wds = @{$_[0]};
             # lexical $i, so the caller's global loop counter $i is not clobbered
             for (my $i = 0; $i < $#wds; $i++) {
                 $ref = \%{${$ref}{$wds[$i]}};
             }
             $ref = \@{${$ref}{$wds[$#wds]}};
             return $ref;
         }
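As a point of comparison, and not the approach taken in the slides, the nested-hash walk can be avoided by joining the prefix words into one string and using that as the key of a flat hash. The sketch below assumes a separator (`"\0"`) that never occurs inside a word; `key_for`, `$SEP`, the prefix length, and the sample text are all hypothetical names and values chosen for illustration.

```perl
use strict;
use warnings;

my $PREFLEN = 3;                       # hypothetical prefix length
my $NONWORD = "\n";
my $SEP     = "\0";                    # assumed never to occur inside a word

my %statetab;                          # joined prefix string -> suffix list

# Illustrative helper: turn a list of prefix words into a single hash key.
sub key_for { return join $SEP, @_ }

my $text  = "a b c d a b c e";         # hypothetical sample input
my @words = ($NONWORD) x $PREFLEN;     # initial state: all sentinels
foreach my $word (split ' ', $text) {
    push @{ $statetab{ key_for(@words) } }, $word;
    @words = (@words[1 .. $#words], $word);         # advance the prefix window
}
push @{ $statetab{ key_for(@words) } }, $NONWORD;   # add tail

my @suffixes = @{ $statetab{ key_for(qw(a b c)) } };
print "@suffixes\n";                   # prints "d e"
```

The trade-off is that the separator must be chosen so that two different prefixes can never join to the same string, whereas the nested-hash version needs no such assumption.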

  10. Questions?
