YOLT

Presentation Transcript

  1. YOLT: Yuan Zheng, Omar Ahmed, Lukas Dudkowski, T. Mark Kuba

  2. Overview of YOLT
     • Simple scripting language
       • Easy for coding and maintenance
     • Regular expression support
       • := and @
     • "Web-scraping" uses
       • Natural Language Processing
       • Generating RSS feeds
       • Reformatting HTML for other uses (XML, etc.)

  3. A Useful YOLT Program

  4. Semantics
     The YOLT semantic checker is extremely simple. It serves a few main tasks:
     • Make sure that functions are declared properly, i.e. function declarations match functions, and function calls match the declarations
     • Make sure that variables are initialized before they are used (or, in some cases, un-initialized)
     • (Redundant) Make sure that the tree is properly formed (i.e. make sure that an if-then-else node has exactly three children, etc.)
     Note: there was once basic type-checking, but no longer.
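The three checks listed on this slide can be sketched in a few lines. This is a minimal illustration in Python, not the YOLT checker itself: the Node class, the node kinds ("funcdef", "params", "call", "assign", "var", "if") and the error messages are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # Hypothetical AST node; the real YOLT tree layout is not shown in the slides.
    kind: str
    value: str = ""
    children: list = field(default_factory=list)

def collect_functions(root):
    """First pass: map each top-level function name to its parameter count."""
    return {c.value: len(c.children[0].children)
            for c in root.children if c.kind == "funcdef"}

def check(node, funcs, initialized, errors):
    """Second pass: walk the tree and collect errors instead of stopping at the first."""
    if node.kind == "if" and len(node.children) != 3:
        errors.append("if-then-else node must have exactly three children")
    if node.kind == "assign":
        target, expr = node.children
        check(expr, funcs, initialized, errors)   # right-hand side is evaluated first
        initialized.add(target.value)
        return
    if node.kind == "var" and node.value not in initialized:
        errors.append(f"{node.value} used before initialization")
    if node.kind == "call":
        if node.value not in funcs:
            errors.append(f"call to undeclared function {node.value}")
        elif funcs[node.value] != len(node.children):
            errors.append(f"{node.value} expects {funcs[node.value]} argument(s), "
                          f"got {len(node.children)}")
    if node.kind == "funcdef":
        params, body = node.children
        # Parameters count as initialized inside the function body only.
        check(body, funcs, {p.value for p in params.children}, errors)
        return
    for child in node.children:
        check(child, funcs, initialized, errors)

def semantic_errors(root):
    errors = []
    check(root, collect_functions(root), set(), errors)
    return errors
```

Collecting function signatures in a separate first pass lets a call appear before the function it refers to; whether YOLT allows that is not stated in the slides.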

  5. Semantics Lessons Learned
     • It is very easy to do too much in semantic checking
     • Either there are types, or no types (NO MIDDLE GROUND)
     • Scripting languages are an enormous relief to a semantic checker: they take away the biggest hassles
     • The tree walker should know EXACTLY what the structure of the AST will look like and cannot make ANY assumptions. Things can break down when you least expect them to.

  6. Code Generation
     • Written in Java
     • Input: correct AST
     • Output: Perl program
     (Diagram: AST -> code generator (Java) -> Perl program)

  7. Implementation
     • Walk the AST
     • According to the information in each node, either generate code or go down to the child nodes
     Example: a ":=" node with children "$a" and "http://www.columbia.edu"
     • Go down the tree at node ":="
     • Generate code at nodes "$a" and "http://www.columbia.edu"
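The walk described on this slide, where each node either emits code or hands off to its children, might look roughly like the sketch below. It is written in Python for brevity even though the slides say the real generator is in Java, and it handles only plain assignment and echo. The Node class and the node kinds are hypothetical, and string literals are assumed to be stored verbatim, quotes included.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # Hypothetical AST node; not the actual YOLT tree classes.
    kind: str
    value: str = ""
    children: list = field(default_factory=list)

def gen(node):
    """Return Perl source text for one AST node."""
    if node.kind == "program":
        # Interior node: nothing to emit here, just descend.
        return "\n".join(gen(c) for c in node.children)
    if node.kind == "=":
        target, expr = node.children
        return f"{gen(target)} = {gen(expr)};"
    if node.kind == "echo":
        # YOLT "echo" becomes Perl "print", as in the deck's generated code.
        return f"print {gen(node.children[0])};"
    if node.kind in ("var", "string"):
        # Leaf node: emit the stored text as-is.
        return node.value
    raise ValueError(f"unexpected node kind: {node.kind}")
```

Raising on an unknown node kind, rather than silently emitting nothing, follows the deck's own advice that the tree walker should make no assumptions about the AST.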

  8. Implementation (tricks)
     • The httpget operator :=
       • Invoke the UNIX command "wget" (through Perl's system()) to download the web page into a temp file
       • Read the file line by line and store the lines in a Perl array
       • Invoke another UNIX command, "rm", to remove the temp file
       • Keep the web address in a Perl scalar
     • Scalars and arrays use the same syntax
       • The compiler (code generator) "guesses" whether the variable is a scalar or an array
       • Arrays can only appear in certain places (e.g. foreach)
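A rough sketch of both tricks, again in Python rather than the generator's Java. The emitted Perl lines follow the pattern of the generated code shown in the integration-testing slide; the function names, the context labels in ARRAY_CONTEXTS and the temp-file argument are assumptions for illustration only.

```python
# Hypothetical labels for the positions where a YOLT variable must be an array.
ARRAY_CONTEXTS = {"foreach_source", "match_source", "match_result"}

def perl_name(var, context):
    """Guess the Perl sigil: '@' in array-only positions, '$' everywhere else."""
    name = var.lstrip("$")
    sigil = "@" if context in ARRAY_CONTEXTS else "$"
    return sigil + name

def emit_httpget(var, url, tmp_file):
    """Emit the Perl lines for `var := url`: fetch, slurp into an array, clean up."""
    name = var.lstrip("$")
    return [
        f"system('wget -q -O - {url} > {tmp_file}');",  # download into a temp file
        f'open INFILE, "{tmp_file}";',
        f"@{name}=<INFILE>;",                           # page contents, one line per element
        "close INFILE;",
        f"system('rm {tmp_file}');",                    # remove the temp file
        f'${name} = "{url}";',                          # the address itself stays in the scalar
    ]
```

Because Perl keeps `$name` and `@name` in separate namespaces, the same YOLT variable can hold both the address and the page lines, which is what makes the sigil guess workable.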

  9. Documentation and Testing: Lexer/Parser and Semantic Checker
     • Test cases: good and bad
     • Run each test case through the lexer/parser and the semantic checker, and log the result: good should be good, bad should be bad
     • Diff the log against a reference file (what I think it should produce)
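The log-and-diff workflow on this slide could be automated along these lines. This is a sketch under stated assumptions: `run` stands in for whatever invokes the lexer/parser and semantic checker and returns its log as text, and the case names and reference strings are supplied by the caller. None of this comes from the actual YOLT test scripts.

```python
import difflib

def diff_against_reference(name, actual, reference):
    """Return unified-diff lines; an empty list means the log matches the reference."""
    return list(difflib.unified_diff(
        reference.splitlines(), actual.splitlines(),
        fromfile=f"{name}.ref", tofile=f"{name}.out", lineterm=""))

def run_suite(cases, run):
    """cases maps a test name to (source text, reference log); run maps source to log."""
    failures = {}
    for name, (source, reference) in cases.items():
        diff = diff_against_reference(name, run(source), reference)
        if diff:
            failures[name] = diff
    return failures
```

A "bad" test case passes when the checker rejects it with the expected message, so good and bad cases go through the same comparison.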

  10. Integration Testing
     Trying little YOLT programs to see functionality, code generation, etc. Working out bugs in implementation & design.

     Example goal: display any comics from the www.toothpastefordinner.com Summer 2002 archive that have the word hamster in the URL.

     Generated Perl:

         $toothpaste_home ="http://www.toothpastefordinner.com/";
         system('wget -q -O - http://www.toothpastefordinner.com/archives-sum02.php > toothpaste.txt');
         open INFILE, "toothpaste.txt";
         @toothpaste=<INFILE>;
         close INFILE;
         system ('rm toothpaste.txt');
         $toothpaste = "http://www.toothpastefordinner.com/archives-sum02.php";
         $tags ="<a href=\"(.*)\">.*hamster.*</a>";
         @tmp1=();
         foreach ( @toothpaste) {
             if ($_=~m/($tags)/i){ push @tmp1, $2}
         }
         @elements = @tmp1;
         foreach $x ( @elements ) {
             print "<img src=\"".$toothpaste_home.$x."\""."><br>";
             print "\n";
         }

     YOLT program:

         begin
         $toothpaste_home="http://www.toothpastefordinner.com/";
         $toothpaste:="http://www.toothpastefordinner.com/archives-sum02.php";
         $tags="<a href=\"(.*)\">.*hamster.*</a>";
         $elements = $tags @ $toothpaste;
         foreach $x in $elements {
             echo "<img src=\"".$toothpaste_home.$x."\""."><br>";
             echo "\n";
         }
         end

     Resultant HTML:

         <img src="http://www.toothpastefordinner.com/072802/hamster-table-tennis.gif"><br>
         <img src="http://www.toothpastefordinner.com/072502/even-hamsters.gif"><br>
         <img src="http://www.toothpastefordinner.com/060602/hamsters-are-the-best.gif"><br>
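Judging from the generated Perl in this slide, the @ operator applies the pattern to every line of the downloaded page and collects the pattern's first capture group from each matching line. The Python sketch below mirrors that loop. The function name match_all and the sample page lines used with it are made up for illustration.

```python
import re

def match_all(tags, lines):
    """Collect the first capture group of `tags` from every line that matches."""
    # The outer parentheses mirror the generated Perl's m/($tags)/i, so group 1
    # is the whole match and group 2 is the pattern's own first group.
    pattern = re.compile("(" + tags + ")", re.IGNORECASE)
    results = []
    for line in lines:
        m = pattern.search(line)  # at most one match per line, as in the Perl loop
        if m:
            # Assumes `tags` contains at least one group; Perl would push undef otherwise.
            results.append(m.group(2))
    return results

# Hypothetical page content, not taken from the real archive page.
page = [
    '<a href="072802/hamster-table-tennis.gif">Hamster table tennis</a>\n',
    '<a href="070102/other.gif">something else</a>\n',
    '<p>no link here</p>\n',
]
elements = match_all('<a href="(.*)">.*hamster.*</a>', page)
```

Note that `(.*)` is greedy, so a line holding several links would capture everything up to the last `">` on that line. The one-link-per-line sample above sidesteps this.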

  11. The Result (screenshots: the source site and the end result)

  12. Lessons Learned
     • Develop and test incrementally
     • There are ALWAYS bugs, you just haven’t found them yet
     • CLIC is not designed to be lived in

  13. One More Example