
An Evaluation of Automata Algorithms for String Analysis

Pieter Hooimeijer (University of Virginia) and Margus Veanes (Microsoft Research). VMCAI 2011. TL;DR: We evaluate several existing approaches for explicit-state string constraint solving, fixing external factors as much as possible.


Presentation Transcript


  1. An Evaluation of Automata Algorithms for String Analysis. Pieter Hooimeijer (University of Virginia), Margus Veanes (Microsoft Research). VMCAI 2011

  2. TL;DR We evaluate several existing approaches for explicit-state string constraint solving, fixing external factors as much as possible.

  3. Outline • Motivation • string constraint solvers • string-related programming idioms • This Paper • benchmark and study design • results

  4. Motivation Reasoning about strings is difficult: • for programmers • for automated tools

  5. Constraint Solvers Hampi Kaluza Rex

  6. Constraint Solvers Hampi Kaluza Rex ✔ String a; // ... R = Regex("^ab$"); R.IsMatch(a) = true;

  7. Constraint Solvers Hampi Kaluza Rex String a; // ... R = Regex("^ab$"); if (R.IsMatch(a)) { // ... } String a; // ... R = Regex("^ab$"); R.IsMatch(a) = true;
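To make the constraint on the slide concrete: `R.IsMatch(a) = true` asks for some string `a` that the regex accepts. The sketch below is a deliberately naive illustration in Python, not how Hampi, Kaluza, or Rex work. The function name `find_witness`, the two-letter alphabet, and the length bound are all assumptions made for the example.

```python
import itertools
import re


def find_witness(pattern, alphabet="ab", max_len=4):
    """Return the shortest string over `alphabet` that satisfies the
    membership constraint "pattern matches a", or None if no string of
    length <= max_len does. Brute-force enumeration, for illustration only."""
    regex = re.compile(pattern)
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if regex.search(candidate):
                return candidate
    return None


# Hypothetical usage: solve the constraint from the slide.
witness = find_witness("^ab$")
```

Enumeration like this blows up quickly with alphabet size and string length, which is one reason solvers turn to automata instead.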

  8. what (not) to model

  9. Example 1 char *sp = (char *) strchr(cmd, ' '); char *slash; while (sp && (slash = (char *) strchr(cmd, '/')) && (slash < sp)) { cmd = slash + 1; }

  10. Example 1 char *sp = (char *) strchr(cmd, ' '); char *slash; while (sp && (slash = (char *) strchr(cmd, '/')) && (slash < sp)) { cmd = slash + 1; }
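For readers who find the pointer arithmetic in Example 1 hard to follow, here is one reading of the loop as a Python sketch: if the command contains a space, drop every path component that ends before that first space. The function name `strip_path_prefix` is hypothetical, and index offsets stand in for the C pointers.

```python
def strip_path_prefix(cmd):
    """Mirror the C loop: while a '/' occurs before the first space,
    advance past it. `start` plays the role of the moving `cmd` pointer;
    `sp` stays fixed, as in the original."""
    sp = cmd.find(" ")           # position of the first space, -1 if none
    start = 0
    while sp != -1:
        slash = cmd.find("/", start)
        if slash == -1 or slash >= sp:
            break                # no slash left before the space
        start = slash + 1
    return cmd[start:]


# Hypothetical usage.
shortened = strip_path_prefix("/usr/bin/ls -l /tmp")
```

Even this small idiom mixes search, comparison of positions, and a loop, which hints at why deciding what to model is not obvious.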

  11. Example 2 How hard is regex matching in Perl?

  12. Example 2 perl -wle 'print "Prime" if (1 x shift) !~ /^1?$|^(11+?)\1+$/' http://montreal.pm.org/tech/neil_kandalgaonkar.shtml

  13. Example 2 perl -wle 'print "Prime" if (1 x shift) !~ /^1?$|^(11+?)\1+$/'

  14. Example 2 • Anchors • Non-eager matching • Backreferences /^1?$|^(11+?)\1+$/
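The same pattern can be tried in Python, whose `re` module also supports anchors, non-greedy quantifiers, and backreferences. This is a sketch for experimentation; the function name `is_prime` is ours. The number is written in unary, and the pattern matches exactly when the string has length 0 or 1, or can be split into two or more equal blocks of at least two 1s.

```python
import re

# Same pattern as on the slide: matches unary strings of non-prime length.
NON_PRIME = re.compile(r"^1?$|^(11+?)\1+$")


def is_prime(n):
    """Return True when the unary encoding of n is NOT matched,
    mirroring the `!~` in the Perl one-liner."""
    unary = "1" * n
    return NON_PRIME.match(unary) is None


# Hypothetical usage: collect the primes below 20.
small_primes = [n for n in range(20) if is_prime(n)]
```

The backreference `\1` is what takes this pattern outside the regular languages, so a purely automata-based solver cannot model it directly.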

  15. In this paper

  16. Motivation • Existing work provides tool-to-tool performance comparisons • Confounds: Performance gains may be due to external factors

  17. The Framework • Based on Rex • Fixes external factors: • front-end parser • regex-to-automaton conversion • implementation language • search tree

  18. Character Sets • BDD: binary decision diagrams • Pred: symbolic bitvector ranges in DNF • Range: concrete set of character ranges • Hash: concrete set of individual characters
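To illustrate the difference between the two concrete representations named on this slide, the sketch below intersects character sets stored as individual characters (Hash-style) and as sorted inclusive code-point ranges (Range-style). It is a minimal Python illustration under our own assumptions about the data layout, not the framework's implementation; the function names are hypothetical.

```python
def hash_intersect(a, b):
    """Hash-style: each set stores every character individually."""
    return set(a) & set(b)


def range_intersect(a, b):
    """Range-style: a and b are sorted lists of disjoint, inclusive
    (lo, hi) code-point ranges. Two-pointer sweep over both lists."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            result.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical usage: [a-z] intersected with [0-9a-f].
letters = [(ord("a"), ord("z"))]
hex_digits = [(ord("0"), ord("9")), (ord("a"), ord("f"))]
common_ranges = range_intersect(letters, hex_digits)
common_chars = hash_intersect("abcdefghijklmnopqrstuvwxyz", "0123456789abcdef")
```

A set covering all of Unicode is a single pair in the range form but more than a million entries in the per-character form, which suggests why the choice of representation could matter for the Unicode experiments.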

  19. Charset Interface

  20. Minterms
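The slide gives only the title, so as background: for a collection of character sets, the minterms are the non-empty regions of their Venn diagram, that is, the coarsest partition of the alphabet in which every character in a block behaves identically with respect to every set. The following is a small Python sketch using plain sets over a tiny alphabet; the function name `minterms` and the representation are our assumptions.

```python
def minterms(universe, charsets):
    """Split `universe` into the non-empty regions induced by `charsets`.
    Each set refines every existing region into an inside and an outside part."""
    regions = [frozenset(universe)] if universe else []
    for charset in charsets:
        charset = frozenset(charset)
        refined = []
        for region in regions:
            inside = region & charset
            outside = region - charset
            if inside:
                refined.append(inside)
            if outside:
                refined.append(outside)
        regions = refined
    return regions


# Hypothetical usage: two overlapping sets over the alphabet {a, ..., e}.
parts = minterms("abcde", ["ab", "bc"])
```

With n sets there can be up to 2^n minterms, so how cheaply a representation can compute intersections and complements directly affects this step.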

  21. Study Design Task 1 (55x): Task 2 (100x):

  22. Study Design Lazy Eager Task 1 (55x): Task 2 (100x):

  23. Study Design: Lazy vs. Eager; Task 1 (55x) and Task 2 (100x); each with Unicode and ASCII

  24. Results

  25. Regular Difference: Lazy vs. Eager; Task 1 (55x) and Task 2 (100x); each with Unicode and ASCII

  26. Regular Intersection [chart: Eager vs. Lazy; BDD, Pred, Range, Hash; ASCII and Unicode]

  27. Regular Intersection [chart: Eager vs. Lazy; BDD, Pred, Range, Hash; ASCII and Unicode]
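The transcript does not define lazy and eager. One common reading, assumed here, is that an eager intersection builds the full product automaton before searching it, while a lazy one constructs product states only as the search reaches them. The Python sketch below follows that reading; the automaton encoding `(start, accepting, delta)` and all function names are our assumptions, and every state must appear as a key in `delta`.

```python
from collections import deque


def intersect_nonempty(a, b, eager):
    """Check whether two DFAs accept a common string.
    Returns (found, number_of_product_states_constructed)."""
    (start_a, accept_a, delta_a), (start_b, accept_b, delta_b) = a, b
    built = {}  # product state -> list of successor product states

    def successors(pair):
        # Construct the outgoing edges of a product state on first use.
        if pair not in built:
            p, q = pair
            built[pair] = [(delta_a[p][ch], delta_b[q][ch])
                           for ch in delta_a[p] if ch in delta_b[q]]
        return built[pair]

    if eager:
        # Eager: construct every pair of states before searching.
        for p in delta_a:
            for q in delta_b:
                successors((p, q))

    # Breadth-first search for a jointly accepting product state.
    start = (start_a, start_b)
    seen = {start}
    queue = deque([start])
    while queue:
        pair = queue.popleft()
        if pair[0] in accept_a and pair[1] in accept_b:
            return True, len(built)
        for nxt in successors(pair):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False, len(built)


# Hypothetical usage: "contains an a" intersected with "has even length".
contains_a = (0, {1}, {0: {"a": 1, "b": 0}, 1: {"a": 1, "b": 1}})
even_length = (0, {0}, {0: {"a": 1, "b": 1}, 1: {"a": 0, "b": 0}})
eager_result = intersect_nonempty(contains_a, even_length, eager=True)
lazy_result = intersect_nonempty(contains_a, even_length, eager=False)
```

Both variants return the same answer; the lazy one can stop as soon as it finds an accepting pair, so it never constructs more product states than the eager one.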

  28. In Aggregate: Lazy vs. Eager; Task 1 (55x) and Task 2 (100x); each with Unicode and ASCII

  29. In Aggregate

  30. In Aggregate

  31. Conclusion • For Unicode: BDD-based approach and lazy search are fastest • SMT-based Pred approach outperforms concrete Range version

  32. Thanks!
