Parsing with Boost.Spirit


    Presentation Transcript
    1. Parsing with Boost.Spirit Rob Stewart robert.stewart@sig.com

    2. Overview • Introduction to Boost.Spirit • Parsing with Qi • Parsing ping command output • Problems using Qi

    3. Introduction to Boost.Spirit

    4. Introduction to Boost.Spirit • Three sub-libraries • Lex: Lexical analysis • Qi: Parsing • Karma: Generating output • DSELs • Clear, readable because targeted to domain • Use within your C++ code • No external tools required

    5. Boost.Spirit.Lex • Tokenizes input • Parses character sequence • Produces tokens • Applies your grammar • Separates tokenization from analysis • Reduces complexity of parser • Not covered in this presentation

    6. Boost.Spirit.Qi • Converts sequence of tokens or characters • Implements a recursive descent parser • Parsing Expression Grammar (PEG) based • Similar to Extended Backus-Naur Form (EBNF) • Not ambiguous • Well-suited to computer languages • Ill-suited to natural languages • Replaces uses of scanf(), regular expressions, and tokenizers • Much more powerful and flexible than common tools

    7. Boost.Spirit.Karma • Produces character sequence from data • Can replace uses of printf(), std::ostream, boost::format(), etc. • Much more powerful and flexible than common output tools • Inverse of Qi • Not covered in this presentation

    8. Parsing with Qi

    9. Parsing Basics • Iterate input sequence • Optionally tokenize • Apply grammar • Indicate a match • Produce side effects • Save text • Convert text to another type • Call a function

    10. Parsers like Function Objects • Arguments: Inherited Attributes • Return value: Synthesized Attribute • State

    11. Parser Concept bool parse(FwdIt, FwdIt, Context, Skipper, Attribute); info what(Context);

    12. Kinds of Parsers • Primitive • char_, float_, int_, lit, etc. • Rule • Placeholder for one or more parsers • Reusable • Support recursion • Have a name (empty by default) • Grammar: • Encapsulates a set of rules, parsers, and nested grammars • High level abstraction • Offers modularization and composition

    13. Parsers for doubles • To parse one double: boost::spirit::qi::double_ • To parse two whitespace-delimited doubles: double_ >> double_ • Parsing zero or more doubles: *double_ • Parsing a comma-delimited list of doubles: double_ >> *(lit(',') >> double_)

    14. Parsing a Comma-delimited List of doubles double_ >> *(lit(',') >> double_)

    15. Parsing a Comma-delimited List of doubles double_ >> *(lit(',') >> double_) Matches sign, mantissa, and exponent

    16. Parsing a Comma-delimited List of doubles double_ >> *(lit(',') >> double_) Left side might be followed by right side

    17. Parsing a Comma-delimited List of doubles double_ >> *(lit(',') >> double_) Kleene star: zero or more

    18. Parsing a Comma-delimited List of doubles double_ >> *(lit(',') >> double_) Matches a comma which won’t be added to the synthesized attribute

    21. Parsing a Comma-delimited List of doubles double_ >> *(lit(',') >> double_) double_ % ',' Qi extends PEG operators for convenience

    22. Parsing Functions • boost::spirit::qi::parse() • Parses exactly what’s described by the supplied parser • Provides complete control over where whitespace may occur • Appropriate when parsing token sequences from Lex • boost::spirit::qi::phrase_parse() • Applies a skip parser between parsers comprising the main parser • Simplifies delimiter handling • Can disable for specific parts of the main parser

    23. Using parse() template <class It> bool matches(It _first, It _last) { return parse(_first, _last, double_ % ','); }

    24. Using phrase_parse() template <class It> bool matches(It _first, It _last) { return phrase_parse(_first, _last, double_ % ',', space); }

    25. Reality Isn’t Quite So Pretty
        #include <boost/spirit/include/qi.hpp>
        template <class It>
        bool matches(It _first, It _last)
        {
            using boost::spirit::qi::double_;
            using boost::spirit::qi::lit;
            using boost::spirit::qi::phrase_parse;
            using boost::spirit::ascii::space;
            return phrase_parse(_first, _last, double_ % ',', space);
        }

    26. Reality Isn’t Quite So Pretty
        #include <boost/spirit/include/qi.hpp>
        namespace qi = boost::spirit::qi;
        template <class It>
        bool matches(It _first, It _last)
        {
            using boost::spirit::ascii::space;
            return qi::phrase_parse(_first, _last, qi::double_ % ',', space);
        }

    27. Deconstructing phrase_parse() Calls template <class It> bool matches(It _first, It _last) { return phrase_parse( _first, _last, double_ % ',', space) && _first == _last; }

    28. Deconstructing phrase_parse() Calls template <class It> bool matches(It _first, It _last) { return phrase_parse( _first, _last, double_ % ',', space) && _first == _last; } Half open input range of characters

    29. Deconstructing phrase_parse() Calls template <class It> bool matches(It _first, It _last) { return phrase_parse( _first, _last, double_ % ',', space) && _first == _last; } The parser to apply

    30. Deconstructing phrase_parse() Calls template <class It> bool matches(It _first, It _last) { return phrase_parse( _first, _last, double_ % ',', space) && _first == _last; } The skip parser

    31. Deconstructing phrase_parse() Calls template <class It> bool matches(It _first, It _last) { return phrase_parse( _first, _last, double_ % ',', space) && _first == _last; } Check that the entire input range was consumed

    32. Example: Parsing ping Command Output

    33. ping Command Output
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=1 ttl=39 time=24.6 ms
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=2 ttl=39 time=20.5 ms
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=3 ttl=39 time=18.9 ms

        --- www.google.com ping statistics ---
        3 packets transmitted, 3 received, 0% packet loss, time 2003ms
        rtt min/avg/max/mdev = 18.984/21.411/24.697/2.410 ms

    34. Creating the ping Parser
        template <class It, class Skipper>
        class ping::parser : public qi::grammar<It, Skipper>
        {
        public:
            parser()
            {
                // grammar here
            }
        private:
            // rules here
        };

    38. Creating the ping Parser
        public:
            parser() : parser::base_type(start, "ping parser") { }
        private:
            qi::rule<It, Skipper> start;

    41. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        start = lit("PING") …

    42. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        start = lit("PING") > host …

    43. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        start = lit("PING") > host > ip_address …

    44. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        start = lit("PING") > host > ip_address > +(char_ - '.') > '.' …

    45. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        start = lit("PING") > host > ip_address > +(omit[char_] - '.') > '.' …

    46. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        start = lit("PING") > host > ip_address > +(omit[char_] - '.') > '.' > eol …

    47. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        start = lit("PING") > host > ip_address > +(omit[char_] - '.') > '.' > eol …

    48. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=1 ttl=39 time=24.6 ms
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=2 ttl=39 time=20.5 ms
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=3 ttl=39 time=18.9 ms
        start = lit("PING") > host > ip_address > +(omit[char_] - '.') > '.' > eol >> *(reply > eol) …

    49. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=1 ttl=39 time=24.6 ms
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=2 ttl=39 time=20.5 ms
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=3 ttl=39 time=18.9 ms

        --- www.google.com ping statistics ---
        start = lit("PING") … >> *(reply > eol) > eol > +(omit[char_("A-Za-z0-9.-")]) > eol …

    50. start Rule
        PING www.google.com (74.125.131.147) 56(84) bytes of data.
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=1 ttl=39 time=24.6 ms
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=2 ttl=39 time=20.5 ms
        64 bytes from vc-in-f147.1e100.net (74.125.131.147): icmp_seq=3 ttl=39 time=18.9 ms

        --- www.google.com ping statistics ---
        start = lit("PING") … >> *(reply > eol) > eol > +(omit[char_] - eol) > eol …