
Strings


feng

Presentation Transcript


  1. Strings
     Definition: A string is a sequence of symbols.
     Examples: “Hi, Mom.”, “YAK”, “abbababba”, “#@?_!L!!”
     Question: In what ways do programmers use strings?

  2. String Terminology
     A string is also called a __________.
     An ____________ is the set of all possible symbols that can be included in a valid string.
     The pattern of symbols that make up a string or a set of strings is referred to as __________.
     The length of a string is the number of symbols in the sequence.
     The ______ string is a string of length zero (0).
     Some Special Notation
       [symbol missing] denotes a blank (space)
       [symbol missing] denotes a null string (alternative notations: [missing])

  3. Languages
     A language is a set of strings.
     Examples
       Bit2Strings = { (null string), 0, 1, 00, 01, 10, 11 }
       Integers = { s | s is a valid string representing an integer in Arabic form }
       GoldiesComments = { theporridgeistoohot. , theporridgeistoocold. , theporridgeisjustright. }
       AdaIntegerOps = { +, -, *, /, **, mod, rem, <, <=, >, >=, =, /= }
       SheepSpeak = { ba, baa, baaa, baaaa, ... }
       AdaIdentifiers = { s | s is a valid identifier of the Ada programming language }
         Ada identifiers must begin with an alphabetic letter followed by zero or more alphabetic, numeric and/or underscore characters.
       FortranIdentifiers = { s | s is a valid identifier of the Fortran IV language }
         Fortran IV identifiers are like Ada’s, with two exceptions: (1) no underscores are permitted, (2) maximum length is 8 characters.
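A language, being a set of strings, can be modelled either as an explicit set (when finite) or as a membership test (when infinite). The Python sketch below illustrates both for some of the languages on this slide. All function and variable names are hypothetical, `""` is assumed to stand for the null string, and the identifier tests follow only the simplified rules stated on the slide, not the full Ada or Fortran language definitions.

```python
import string

# A finite language can be written out as an explicit set.
# Assumption: "" stands for the null string.
BIT2_STRINGS = {"", "0", "1", "00", "01", "10", "11"}


def is_sheep_speak(s: str) -> bool:
    """True if s is 'b' followed by one or more 'a' characters."""
    return len(s) >= 2 and s[0] == "b" and set(s[1:]) == {"a"}


def is_ada_identifier(s: str) -> bool:
    """Membership test for the slide's simplified Ada rule:
    a letter, then zero or more letters, digits, or underscores."""
    if not s or s[0] not in string.ascii_letters:
        return False
    allowed = string.ascii_letters + string.digits + "_"
    return all(ch in allowed for ch in s[1:])


def is_fortran_identifier(s: str) -> bool:
    """Like the Ada rule, but with no underscores and at most
    8 characters (the limit stated on the slide)."""
    return is_ada_identifier(s) and "_" not in s and len(s) <= 8
```

An infinite language such as SheepSpeak cannot be listed in full, which is why it is expressed here as a predicate rather than a set.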

  4. Questions (Using languages from the previous slide...)
     1) Which languages are finite (and which are infinite)?
        Bit2Strings? GoldiesComments? AdaIntegerOps? SheepSpeak? AdaIdentifiers? FortranIdentifiers?
     2) What is the “smallest” alphabet for each language?
        Bit2Strings, GoldiesComments, AdaIntegerOps, SheepSpeak, AdaIdentifiers, FortranIdentifiers

  5. Metalanguages
     A language that is used to define other languages is called a metalanguage.
     Metalanguages that are widely used by computer scientists:
     • _____________ (especially context-free grammars)
     • ___________
     • _________ ___________
     • _________ ____________
     • _________ ___________ __________

  6. Backus Naur Form
     BNF is a tool used to define the syntax of programming languages by using a collection of rules.
     Each ________ (production) defines one part of the syntax.
     The syntax defined by a rule is named with a ___________ identifier.
     The form of a rule is as follows:
       nonterminal ::= syntaxOfNonterminal
     alternately written as shown below:
       nonterminal → syntaxOfNonterminal

  7. BNF Notation
     Example
       <byte> → <bit> <bit> <bit> <bit> <bit> <bit> <bit> <bit>
       <bit> → 0 | 1
     → can be read “is defined to be”
     | separates alternative syntactic forms
     < > often used to enclose nonterminals
     ( ) can be used as grouping symbols

  8. A second example
       <upTo3DigitNum> → <digit> | <digit> <digit> | <digit> <digit> <digit>
       <digit> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
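Because the rules for <upTo3DigitNum> contain no recursion, the language they define is finite and can be enumerated. The Python sketch below does so, taking one alternative per length; the function name is hypothetical.

```python
from itertools import product

DIGITS = "0123456789"


def up_to_3_digit_nums():
    """Yield every string derivable from <upTo3DigitNum>."""
    for length in (1, 2, 3):  # one alternative of the rule per length
        for combo in product(DIGITS, repeat=length):
            yield "".join(combo)


language = set(up_to_3_digit_nums())
print(len(language))  # 10 + 100 + 1000 = 1110
```

Note that strings with leading zeros, such as 007, are members: the grammar describes sequences of symbols, not numeric values.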

  9. BNF & Recursion
     Example
       <integer> → <digit> | <digit> <integer>
       <digit> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
     Another Example
       <instructionSequence> → (null string) | <instructionSequence> <instr>
       <instr> → ; | while ( <condition> ) <instr> | if ( <condition> ) <instr> | ( <instructionSequence> )
     BNF language definitions make frequent use of recursive rules.
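A recursive BNF rule maps naturally onto a recursive function. The Python sketch below is one possible recognizer for the <integer> rule on this slide, with one function per nonterminal; the function names are hypothetical.

```python
DIGITS = "0123456789"


def is_digit(s: str) -> bool:
    # <digit> -> 0 | 1 | ... | 9
    return len(s) == 1 and s in DIGITS


def is_integer(s: str) -> bool:
    # <integer> -> <digit> | <digit> <integer>
    if is_digit(s):  # first alternative: a single digit
        return True
    # second alternative: a digit followed by a shorter <integer>
    return len(s) > 1 and is_digit(s[0]) and is_integer(s[1:])
```

Each recursive call works on a string that is one symbol shorter, so the recursion always terminates. The null string is rejected because neither alternative can derive it.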

  10. Extended BNF
      Example
        Number → [ ‘+’ | ‘-’ ] Unsigned_Number
        Unsigned_Number → Integer | Integer ‘.’ Integer
      BNF notation has been extended in many ways.
      Common EBNF Extensions
        [ ] encloses syntax that is optional
        { } encloses syntax that can be repeated consecutively 0 or more times
        < > eliminated from nonterminals (capitalized word phrases are used)
        Single-character symbols are generally quoted.
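The optional [ ] part of an EBNF rule becomes a simple "consume it if present" step in a recognizer. The Python sketch below follows the Number example on this slide. The slide does not define Integer, so the sketch assumes it means one or more decimal digits; all function names are hypothetical.

```python
def is_integer(s: str) -> bool:
    # Assumption: Integer is one or more decimal digits.
    return s != "" and all(ch in "0123456789" for ch in s)


def is_unsigned_number(s: str) -> bool:
    # Unsigned_Number -> Integer | Integer '.' Integer
    whole, dot, fraction = s.partition(".")
    if dot == "":  # first alternative: no decimal point
        return is_integer(whole)
    return is_integer(whole) and is_integer(fraction)


def is_number(s: str) -> bool:
    # Number -> [ '+' | '-' ] Unsigned_Number
    if s[:1] in ("+", "-"):  # the optional sign, if present
        s = s[1:]
    return is_unsigned_number(s)
```

Under these rules -3.14 and +7 are valid, while 3. and .5 are not, because a decimal point must have an Integer on both sides.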

  11. A second example
        Modula_If → IF Cond THEN Instruction_Sequence { Elseif_Part } [ Else_Part ] END
        Elseif_Part → ELSIF Cond THEN Instruction_Sequence
        Else_Part → ELSE Instruction_Sequence

  12. Generators
      A generator defines a language by stating rules that can be applied to generate each string of the language (and no other strings).
      The rules of a generator can be applied to provide a derivation of any string in the language.
        <realConstant> → <integer> . <integer>
        <integer> → <digit> | <digit> <integer>
        <digit> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
      Sample Derivation
        <realConstant> ⇒ <integer> . <integer>
                       ⇒ <integer> . <digit> <integer>
                       ⇒ <digit> . <digit> <integer>
                       ⇒ 3 . <digit> <integer>
                       ⇒ 3 . 2 <integer>
                       ⇒ 3 . 2 <digit>
                       ⇒ 3 . 2 1
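To see a generator at work, the rules can be stored as data and applied mechanically. The Python sketch below encodes the <realConstant> grammar from this slide as a dictionary and expands nonterminals by choosing an alternative at random. The dictionary layout, the function name, and the use of random choice are assumptions made for illustration.

```python
import random

# Each nonterminal maps to its list of alternatives;
# each alternative is a list of symbols.
GRAMMAR = {
    "<realConstant>": [["<integer>", ".", "<integer>"]],
    "<integer>": [["<digit>"], ["<digit>", "<integer>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def generate(symbol: str, rng: random.Random) -> str:
    """Expand symbol, picking one alternative at random per nonterminal."""
    if symbol not in GRAMMAR:  # terminal symbol: emit it unchanged
        return symbol
    alternative = rng.choice(GRAMMAR[symbol])
    return "".join(generate(part, rng) for part in alternative)


rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [generate("<realConstant>", rng) for _ in range(5)]
print(samples)
```

Every string produced this way has a derivation by construction, so it belongs to the language. A random walk like this will not list the whole language, which is infinite.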

  13. Derivation Trees
      A derivation tree captures a derivation by showing each applied rule from the nonterminal (parent node) to its derived string (ordered children).
        <realConstant> → <integer> . <integer>
        <integer> → <digit> | <digit> <integer>
        <digit> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
      [Figure: derivation tree rooted at <realConstant>]
      Frequently, a single derivation tree captures several different derivations.
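A derivation tree can be represented as nested (nonterminal, children) pairs. The Python sketch below builds such a tree for a string of the <realConstant> language and reads the derived string back from its leaves. The tuple representation and function names are hypothetical, and the builder assumes its input is already a valid <realConstant>.

```python
def integer_tree(digits: str):
    """Tree for <integer> -> <digit> | <digit> <integer>."""
    head = ("<digit>", [digits[0]])
    if len(digits) == 1:
        return ("<integer>", [head])
    return ("<integer>", [head, integer_tree(digits[1:])])


def real_constant_tree(text: str):
    """Tree for <realConstant> -> <integer> . <integer>."""
    whole, _, fraction = text.partition(".")
    return ("<realConstant>", [integer_tree(whole), ".", integer_tree(fraction)])


def leaves(node) -> str:
    """Concatenate the leaves left to right, giving the derived string."""
    if isinstance(node, str):  # a leaf is a terminal symbol
        return node
    _, children = node
    return "".join(leaves(child) for child in children)


tree = real_constant_tree("3.21")
print(leaves(tree))  # 3.21
```

The tree records which rule was applied at each node but not the order in which nodes were expanded, which is why one tree can stand for several derivations.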

  14. Regular Expressions
      A regular expression (RE) is a textual metalanguage.
      Operations
        concatenation   RE1 RE2         a b denotes
        alternation     RE1 | RE2       a | b denotes
        repetition      RE1* or RE1+    a* denotes
                                        a+ denotes
        any symbol      .               . denotes the alphabet
        range           [1-2]           [a-e] denotes
      Note: There are many BNF languages that cannot be expressed as an RE.
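The operations above can be tried out with Python's standard `re` module, as in the sketch below. Python's pattern syntax is a superset of the notation on this slide, and its `.` excludes the newline character by default, so this is an approximation of the slide's REs rather than an exact match. The helper name `matches` is hypothetical.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text is described by pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text) pairs, one or two per operation on the slide
examples = [
    ("ab", "ab"),      # concatenation: a followed by b
    ("a|b", "b"),      # alternation: either a or b
    ("a*", ""),        # repetition: zero or more a's
    ("a+", "aaa"),     # repetition: one or more a's
    (".", "x"),        # any single symbol
    ("[a-e]", "c"),    # range: one symbol from a through e
    ("ba+", "baaa"),   # SheepSpeak from the Languages slide
]

for pattern, text in examples:
    print(pattern, repr(text), matches(pattern, text))
```

The difference between `*` and `+` shows up on the null string: `a*` accepts it, while `a+` does not.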

  15. Syntax Diagrams
      A syntax diagram is a graphical metalanguage with the following characteristics:
      • A complete syntax diagram consists of one or more disconnected components of a directed graph.
      • Every node in a syntax diagram contains a symbol(s) or the name of another component.
      • There is one arc entering each component from the left and one leaving each component on the right.
      Usage
        Each component represents one part of the syntax.
        A valid string is formed by following the arrows through the diagram, from left to right.

  16. Syntax Diagrams
      Example
      [Figure: syntax diagram with two components. phoneNumber: the symbols 7, 8, 5 followed by four digit nodes. digit: one of the symbols 0 through 9.]

  17. More About Metalanguages
      Metalanguages fall into two categories:
      • generators that can be applied to generate all possible strings.
      • recognizers that can be applied to any string to discover whether or not it is included in the language.
      BNF is a ___________.
      REs are ________________.
      Syntax diagrams can be applied _________________________________.

  18. Metalanguages - What they don't say
      Formatting Issues
      • fixed format or free format?
      • where are ends of lines/blanks permitted?
      • are there line break requirements?
      Comments
      • single-line comments: -- or //
      • multi-line comments: /* … */ or { … }
      • nested comments allowed?
      Limitations
      • maximum line/identifier length?
      • maximum number of components?
      Reserved words
      • identifiers with default behavior that can be overridden (keywords)?
      • predefined identifiers (reserved words) that cannot be otherwise used?
      Semantic Issues
      • metalanguages generally ignore data typing restrictions
      • metalanguages often ignore constant limitations
